PyTorch implementation of ExORL: Exploratory Data for Offline Reinforcement Learning

Last update: Jan 01, 2023

Overview

ExORL: Exploratory Data for Offline Reinforcement Learning

This is an original PyTorch implementation of the ExORL framework from "Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning" by Denis Yarats*, David Brandfonbrener*, Hao Liu, Misha Laskin, Pieter Abbeel, Alessandro Lazaric, and Lerrel Pinto.

*Equal contribution.

Prerequisites

Install MuJoCo if it is not already installed:

Download MuJoCo binaries here.
Unzip the downloaded archive into ~/.mujoco/.
Append the path of the MuJoCo bin subdirectory to the LD_LIBRARY_PATH environment variable.
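
To double-check the last step, a small script along these lines may help. It is only a sketch: it assumes the archive was unpacked into some subdirectory of ~/.mujoco/ that contains a bin folder, and the helper names find_mujoco_bin_dirs and missing_from_ld_library_path are hypothetical, not part of this repository.

```python
import os
from pathlib import Path


def find_mujoco_bin_dirs(mujoco_root):
    """Return every `<mujoco_root>/<subdir>/bin` directory that exists."""
    root = Path(mujoco_root).expanduser()
    if not root.is_dir():
        return []
    return sorted(p for p in root.glob("*/bin") if p.is_dir())


def missing_from_ld_library_path(bin_dirs, ld_library_path):
    """Return the bin directories that are not listed in an LD_LIBRARY_PATH string."""
    entries = {
        Path(entry).expanduser().resolve()
        for entry in ld_library_path.split(os.pathsep)
        if entry
    }
    return [d for d in bin_dirs if Path(d).resolve() not in entries]


if __name__ == "__main__":
    # Assumption: MuJoCo was unpacked somewhere under ~/.mujoco/.
    bins = find_mujoco_bin_dirs("~/.mujoco")
    current = os.environ.get("LD_LIBRARY_PATH", "")
    for d in missing_from_ld_library_path(bins, current):
        print(f"not on LD_LIBRARY_PATH: {d}")
```

If the script prints nothing, every bin directory it found is already on LD_LIBRARY_PATH (or no MuJoCo installation was found under ~/.mujoco/).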

Install the following libraries:

sudo apt update
sudo apt install libosmesa6-dev libgl1-mesa-glx libglfw3 unzip

Install dependencies:

conda env create -f conda_env.yml
conda activate exorl

Datasets

We provide exploratory datasets for 6 DeepMind Control Suite domains:

Domain	Dataset name	Available task names
Cartpole	`cartpole`	`cartpole_balance`, `cartpole_balance_sparse`, `cartpole_swingup`, `cartpole_swingup_sparse`
Cheetah	`cheetah`	`cheetah_run`, `cheetah_run_backward`
Jaco Arm	`jaco`	`jaco_reach_top_left`, `jaco_reach_top_right`, `jaco_reach_bottom_left`, `jaco_reach_bottom_right`
Point Mass Maze	`point_mass_maze`	`point_mass_maze_reach_top_left`, `point_mass_maze_reach_top_right`, `point_mass_maze_reach_bottom_left`, `point_mass_maze_reach_bottom_right`
Quadruped	`quadruped`	`quadruped_walk`, `quadruped_run`
Walker	`walker`	`walker_stand`, `walker_walk`, `walker_run`

For each domain, we collected datasets by running 9 unsupervised RL algorithms from URLB for a total of 10M steps. The algorithms are:

Unsupervised RL method	Name	Paper
APS	`aps`	paper
APT(ICM)	`icm_apt`	paper
DIAYN	`diayn`	paper
Disagreement	`disagreement`	paper
ICM	`icm`	paper
ProtoRL	`proto`	paper
Random	`random`	N/A
RND	`rnd`	paper
SMM	`smm`	paper

You can download a dataset by running ./download.sh. For example, to download the ProtoRL dataset for Walker, run:

./download.sh walker proto

The script will download the dataset from S3 and store it under datasets/walker/proto/, where you can find episodes (under buffer) and episode videos (under video).
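
As a quick sanity check after a download, the directory layout described above can be inspected from Python. This is a minimal sketch: the helpers dataset_dirs and count_files are hypothetical names, the default root "datasets" follows the path mentioned above, and nothing is assumed about the file format of the episodes.

```python
from pathlib import Path


def dataset_dirs(domain, expl_agent, root="datasets"):
    """Return the (buffer, video) directories for one downloaded dataset."""
    base = Path(root) / domain / expl_agent
    return base / "buffer", base / "video"


def count_files(directory):
    """Count regular files directly inside `directory` (0 if it is missing)."""
    directory = Path(directory)
    if not directory.is_dir():
        return 0
    return sum(1 for p in directory.iterdir() if p.is_file())


if __name__ == "__main__":
    buffer_dir, video_dir = dataset_dirs("walker", "proto")
    print(f"{buffer_dir}: {count_files(buffer_dir)} files")
    print(f"{video_dir}: {count_files(video_dir)} files")
```

A count of 0 for the buffer directory usually means the download did not finish or was run from a different working directory.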

Offline RL training

We also provide implementations of 5 offline RL algorithms for evaluating the datasets:

Offline RL method	Name	Paper
Behavior Cloning	`bc`	paper
CQL	`cql`	paper
CRR	`crr`	paper
TD3+BC	`td3_bc`	paper
TD3	`td3`	paper

After downloading the required datasets, you can evaluate them with an offline RL method on a specific task. For example, to evaluate a dataset collected by ProtoRL on Walker for the walking task using TD3+BC, run:

python train_offline.py agent=td3_bc expl_agent=proto task=walker_walk
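
To evaluate several combinations, the command above can be generated programmatically. The sketch below only builds argument lists in the agent=... expl_agent=... task=... form shown above, using the names from the tables in this README; the helpers domain_of, train_command and sweep are hypothetical and not part of the repository.

```python
import itertools

# Names copied from the tables in this README.
OFFLINE_AGENTS = ("bc", "cql", "crr", "td3_bc", "td3")
EXPL_AGENTS = ("aps", "icm_apt", "diayn", "disagreement", "icm",
               "proto", "random", "rnd", "smm")
TASKS = {
    "cartpole": ("cartpole_balance", "cartpole_balance_sparse",
                 "cartpole_swingup", "cartpole_swingup_sparse"),
    "cheetah": ("cheetah_run", "cheetah_run_backward"),
    "jaco": ("jaco_reach_top_left", "jaco_reach_top_right",
             "jaco_reach_bottom_left", "jaco_reach_bottom_right"),
    "point_mass_maze": ("point_mass_maze_reach_top_left",
                        "point_mass_maze_reach_top_right",
                        "point_mass_maze_reach_bottom_left",
                        "point_mass_maze_reach_bottom_right"),
    "quadruped": ("quadruped_walk", "quadruped_run"),
    "walker": ("walker_stand", "walker_walk", "walker_run"),
}


def domain_of(task):
    """Return the dataset (domain) name that a task belongs to."""
    for domain, tasks in TASKS.items():
        if task in tasks:
            return domain
    raise ValueError(f"unknown task: {task}")


def train_command(agent, expl_agent, task):
    """Build the train_offline.py argument list for one run."""
    if agent not in OFFLINE_AGENTS:
        raise ValueError(f"unknown offline RL agent: {agent}")
    if expl_agent not in EXPL_AGENTS:
        raise ValueError(f"unknown exploration agent: {expl_agent}")
    domain_of(task)  # raises ValueError for an unknown task
    return ["python", "train_offline.py",
            f"agent={agent}", f"expl_agent={expl_agent}", f"task={task}"]


def sweep(agents, expl_agents, tasks):
    """Build one command per (agent, expl_agent, task) combination."""
    return [train_command(a, e, t)
            for a, e, t in itertools.product(agents, expl_agents, tasks)]


if __name__ == "__main__":
    for cmd in sweep(["td3_bc", "cql"], ["proto", "rnd"], TASKS["walker"]):
        print(" ".join(cmd))
```

Each printed line can be run as is, or the lists can be passed to subprocess.run. The dataset for the task's domain, which domain_of returns, must have been downloaded first for every exploration agent in the sweep.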

Logs are stored in the output folder. To launch tensorboard run:

tensorboard --logdir output

Citation

If you use this repo in your research, please consider citing the paper as follows:

@article{yarats2022exorl,
  title={Don't Change the Algorithm, Change the Data: Exploratory Data for Offline Reinforcement Learning},
  author={Denis Yarats and David Brandfonbrener and Hao Liu and Michael Laskin and Pieter Abbeel and Alessandro Lazaric and Lerrel Pinto},
  journal={arXiv preprint arXiv:2201.13425},
  year={2022}
}

License

The majority of ExORL is licensed under the MIT license; however, portions of the project are available under separate license terms: DeepMind is licensed under the Apache 2.0 license.
