Source code for Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Last update: Sep 16, 2022

Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning

Official implementation of ACC, described in the paper "Adaptively Calibrated Critic Estimates for Deep Reinforcement Learning". The source code is based on the PyTorch implementation of TQC, which in turn is based on TD3. We thank the authors for making their source code publicly available.

Requirements

Install MuJoCo

Download and install MuJoCo 1.50 from the MuJoCo website. We assume that the MuJoCo files are extracted to the default location (~/.mujoco/mjpro150).
Copy your MuJoCo license key (mjkey.txt) to ~/.mujoco/mjkey.txt.
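
Before installing the Python dependencies, it can help to confirm that the files are where the steps above expect them. The following is a minimal sketch, not part of the ACC repository; the function name check_mujoco_install is hypothetical, and it only checks that the two default paths named above exist.

```python
from pathlib import Path


def check_mujoco_install(root=None):
    """Return the expected MuJoCo paths that are missing under `root`."""
    # Default to ~/.mujoco, the location assumed in this README.
    root = Path(root) if root is not None else Path.home() / ".mujoco"
    mujoco_dir = root / "mjpro150"   # extracted MuJoCo 1.50 files
    license_key = root / "mjkey.txt"  # license key file

    missing = []
    if not mujoco_dir.is_dir():
        missing.append(str(mujoco_dir))
    if not license_key.is_file():
        missing.append(str(license_key))
    return missing


if __name__ == "__main__":
    problems = check_mujoco_install()
    if problems:
        print("Missing MuJoCo files:")
        for path in problems:
            print("  " + path)
    else:
        print("MuJoCo files found in the default location.")
```

An empty list means both paths exist; it does not verify that the license key is valid or that mujoco-py can load the library.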

Install

We recommend using an Anaconda environment. In our experiments we used Python 3.7 and the following dependencies:

pip install gym==0.17.2 mujoco-py==1.50.1.68 numpy==1.19.1 torch==1.6.0 torchvision==0.7.0
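
To confirm that an environment matches the pinned versions in the pip command above, a small check along these lines may be useful. This is a sketch under assumptions: the helper names installed_version and find_mismatches are hypothetical, and on Python 3.7 the import falls back to the importlib_metadata backport, which would need to be installed separately.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # Python 3.7: requires the `importlib_metadata` backport package.
    from importlib_metadata import PackageNotFoundError, version

# Versions copied from the pip command in this README.
PINNED = {
    "gym": "0.17.2",
    "mujoco-py": "1.50.1.68",
    "numpy": "1.19.1",
    "torch": "1.6.0",
    "torchvision": "0.7.0",
}


def installed_version(name):
    """Return the installed version of `name`, or None if it is not installed."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pinned, lookup=installed_version):
    """Map each package whose installed version differs from its pin to (pinned, found)."""
    mismatches = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

The comparison is an exact string match, so a build whose version string carries a local suffix (for example a CUDA tag) would be reported as a mismatch even if it is otherwise compatible.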

Running ACC

You can run ACC for TQC on one of the Gym continuous control environments by calling:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --seed 0

To run the data-efficient variant with 4 critic update steps per environment step, call:

python main.py --env "HalfCheetah-v3" --max_timesteps 1000000 --num_critic_updates 4 --seed 0

Example scripts that run the experiments for 10 seeds and all environments are provided in run_experiment.sh (standard setting) and run_experiment_data_efficient.sh (data-efficient variant).

You can speed up the experiments by using fewer networks in the TQC ensemble. This trades a small amount of performance for a faster runtime (see the Appendix of the paper). The number of networks is controlled with the flag --n_nets. For example:

python main.py --env "HalfCheetah-v3" --max_timesteps 5000000 --n_nets 2 --seed 0
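
For readers who prefer to drive a sweep from Python rather than from the shell scripts, the commands above can be generated programmatically. The sketch below is not the contents of run_experiment.sh; the helper names and the environment list are assumptions for illustration (only HalfCheetah-v3 is named in this README), and it uses only the flags shown above.

```python
import subprocess

# Assumption: an illustrative subset of Gym MuJoCo tasks, not the repository's list.
ENVS = ["HalfCheetah-v3", "Hopper-v3", "Walker2d-v3"]


def build_command(env, seed, max_timesteps, num_critic_updates=None, n_nets=None):
    """Build the argument list for one run, using only flags shown in this README."""
    cmd = ["python", "main.py", "--env", env,
           "--max_timesteps", str(max_timesteps), "--seed", str(seed)]
    if num_critic_updates is not None:
        cmd += ["--num_critic_updates", str(num_critic_updates)]
    if n_nets is not None:
        cmd += ["--n_nets", str(n_nets)]
    return cmd


def build_sweep(envs, seeds, **kwargs):
    """Return one command per (environment, seed) pair."""
    return [build_command(env, seed, **kwargs) for env in envs for seed in seeds]


def run_sweep(commands, dry_run=True):
    """Print each command, or execute the runs sequentially when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Data-efficient setting from above: 1M steps, 4 critic updates, seeds 0-9.
    run_sweep(build_sweep(ENVS, range(10), max_timesteps=1000000,
                          num_critic_updates=4))
```

By default the sketch only prints the commands; passing dry_run=False would launch the runs one after another from the repository root.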
