Learning Off-Policy with Online Planning, CoRL 2021

Last update: Nov 22, 2022

Overview

LOOP: Learning Off-Policy with Online Planning

Accepted at the Conference on Robot Learning (CoRL) 2021.

Harshit Sikchi, Wenxuan Zhou, David Held

Paper

Install

PyTorch 1.5
OpenAI Gym
MuJoCo
tqdm
D4RL dataset
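
Before launching an experiment, it can help to confirm that the packages listed above are visible to the Python interpreter. The following is a minimal sketch, not part of the repository. The import names are assumptions (for instance, the MuJoCo bindings may be installed as `mujoco_py` or `mujoco` depending on the setup), and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Import names are assumptions; in particular, the MuJoCo bindings may be
# installed as `mujoco_py` or `mujoco` depending on the setup.
REQUIRED_MODULES = ["torch", "gym", "mujoco_py", "tqdm", "d4rl"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for unusual module states; treat as missing.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only locates the modules without importing them, so it does not verify versions (such as PyTorch 1.5) or that MuJoCo itself is correctly licensed and installed.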

File Structure

LOOP (Core method)
- Training code (Online RL): train_loop_sac.py
- Training code (Offline RL): train_loop_offline.py
- Training code (safe RL): train_loop_safety.py
- Policies (online/offline/safety): policies.py
- ARC/H-step lookahead policy: controllers/
Environments: envs/
Configurations: configs/
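
As a rough illustration of the H-step lookahead idea behind the code in controllers/, the sketch below scores candidate action sequences by their discounted model rewards over H steps plus a discounted terminal value estimate, and returns the first action of the best sequence. It is not the repository's controller: it uses plain random shooting instead of ARC, and the toy dynamics, reward, and value functions, as well as every name in the block, are hypothetical.

```python
import random


def h_step_lookahead(state, dynamics, reward, value, horizon=5,
                     num_candidates=200, gamma=0.99, seed=0):
    """Return the first action of the best randomly sampled action sequence.

    Each candidate is scored by its discounted model rewards over `horizon`
    steps plus the discounted value estimate of the final state.
    """
    rng = random.Random(seed)
    best_score, best_action = float("-inf"), None
    for _ in range(num_candidates):
        # Sample a candidate sequence of scalar actions in [-1, 1].
        actions = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, score = state, 0.0
        for t, a in enumerate(actions):
            score += (gamma ** t) * reward(s, a)
            s = dynamics(s, a)  # roll the (learned) model forward one step
        score += (gamma ** horizon) * value(s)  # terminal value bootstrap
        if score > best_score:
            best_score, best_action = score, actions[0]
    return best_action


# Toy 1-D problem (hypothetical): the goal is to drive the state toward zero.
def toy_dynamics(s, a):
    return s + 0.5 * a


def toy_reward(s, a):
    return -(s ** 2)


def toy_value(s):
    return -(s ** 2)


if __name__ == "__main__":
    print(h_step_lookahead(2.0, toy_dynamics, toy_reward, toy_value))
```

In an online setting, only the returned first action would be executed before replanning from the next state.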

Instructions

Run all experiments from the repository root folder.
Config files in configs/ specify the hyperparameters for the controllers and dynamics.
To reproduce the results in the paper, keep all other values in the yml files consistent with the hyperparameters given in the paper.

Experiments

Sec 6.1 LOOP for Online RL

python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial exploration steps> --exp_name=<location_to_logs>
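
To run the command above for several environments, the invocations can be generated programmatically. The following is a small sketch that only assembles the flags shown above. The environment names, log locations, and the `start_timesteps` value are placeholders chosen for illustration, not values confirmed by the repository, and `build_online_command` is a hypothetical helper.

```python
import shlex


def build_online_command(env_name, exp_name, start_timesteps):
    """Assemble the online-RL training command as an argument list."""
    return [
        "python", "train_loop_sac.py",
        f"--env={env_name}",
        "--policy=LOOP_SAC_ARC",
        f"--start_timesteps={start_timesteps}",
        f"--exp_name={exp_name}",
    ]


if __name__ == "__main__":
    # Placeholder environment names; check envs/ for the supported ones.
    for env_name in ["HalfCheetah-v2", "Hopper-v2"]:
        cmd = build_online_command(env_name, f"logs/{env_name}", 1000)
        print(shlex.join(cmd))
```

Each argument list can also be passed directly to `subprocess.run(cmd, check=True)` from the root folder to launch the runs one after another.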

Environment wrappers, along with their termination conditions, can be found under envs/

Sec 6.2 LOOP for Offline RL

Download the CRR-trained models from Link into the root folder.

python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR

Currently supported for D4RL MuJoCo locomotion tasks only.

Sec 6.3 LOOP for Safe RL

python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>

Safety environments can be found under envs/safety_envs.py

References

Parts of the code are adapted from the references below:

@article{SpinningUp2018,
    author = {Achiam, Joshua},
    title = {{Spinning Up in Deep Reinforcement Learning}},
    year = {2018}
}

https://github.com/Xingyu-Lin/mbpo_pytorch

Comments

Environment reproducibility

Hi, I am trying to run your code. However, installing the newest versions of the packages produces errors; for example, mpi4py does not install correctly in my environment.

Could you provide a requirements.txt file so that I can create a Python virtual environment with the dependencies needed to run the code? Alternatively, a container image such as a Docker image would also be great!

opened by pranjaldhole

Releases(v0.0.0)

v0.0.0(Aug 27, 2022)

Owner

Harshit Sikchi
