Reinforcement Learning via Supervised Learning

Last update: Nov 28, 2022

Installation

Run

pip install -e .

in an environment with Python >= 3.7.0, <3.9.

The code depends on MuJoCo 2.1.0 (for mujoco-py) and MuJoCo 2.1.1 (for dm-control). Install both versions by following the installation instructions for each.

If you use the provided Dockerfile, it will automatically handle the MuJoCo dependencies for you. For example:

docker build -t rvs:latest .
docker run -it --rm -v $(pwd):/rvs rvs:latest bash
cd rvs
pip install -e .

Reproducing Experiments

The experiments directory contains a launch script for each environment suite. For example, to reproduce the RvS-R results in D4RL Gym locomotion, run

bash experiments/launch_gym_rvs_r.sh

Each launch script corresponds to a configuration file in experiments/config which serves as a reference for the hyperparameters associated with each experiment.

Adding New Environments

To run RvS on an environment of your own, you need to create a suitable dataset class. For example, in src/rvs/dataset.py, we have a dataset class for the GCSL environments, a dataset class for RvS-R in D4RL, and a dataset class for RvS-G in D4RL. In particular, the D4RLRvSGDataModule allows for conditioning on arbitrary dimensions of the goal state using the goal_columns attribute; for AntMaze, we set goal_columns to (0, 1) to condition only on the x and y coordinates of the goal state.
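To illustrate what conditioning on selected goal dimensions means, here is a minimal sketch in plain Python. It is not the code in src/rvs/dataset.py: the helper names select_goal_columns and build_policy_input, the example vectors, and the choice to concatenate the observation with the selected goal dimensions are all assumptions made for illustration.

```python
from typing import List, Sequence


def select_goal_columns(goal_state: Sequence[float],
                        goal_columns: Sequence[int]) -> List[float]:
    """Keep only the goal dimensions listed in goal_columns (hypothetical helper)."""
    return [goal_state[i] for i in goal_columns]


def build_policy_input(observation: Sequence[float],
                       goal_state: Sequence[float],
                       goal_columns: Sequence[int]) -> List[float]:
    """Concatenate the observation with the selected goal dimensions.

    Concatenation is an assumption of this sketch, not a statement about
    how the repository assembles its inputs.
    """
    return list(observation) + select_goal_columns(goal_state, goal_columns)


# Hypothetical 4-dimensional goal state: x, y, then two other features.
goal_state = [3.0, 7.0, 0.5, -0.2]
observation = [0.0, 1.0, 0.1, 0.3]

# goal_columns=(0, 1) keeps only the x and y coordinates of the goal.
policy_input = build_policy_input(observation, goal_state, goal_columns=(0, 1))
print(policy_input)
```

With goal_columns set to (0, 1), the last two goal features are dropped, so the policy input holds the four observation values followed by the goal's x and y coordinates.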

Baseline Numbers

We replicated CQL using this codebase, which was recommended to us by the CQL authors. All hyperparameters and logs from our replication runs can be viewed at our CQL-R Weights & Biases project.

We replicated Decision Transformer using our fork of the authors' codebase, which we customized to add AntMaze. All hyperparameters and logs from our replication runs can be viewed at our DT Weights & Biases project.

Citing RvS

To cite RvS, you can use the following BibTeX entry:

@misc{emmons2021rvs,
      title={RvS: What is Essential for Offline RL via Supervised Learning?}, 
      author={Scott Emmons and Benjamin Eysenbach and Ilya Kostrikov and Sergey Levine},
      year={2021},
      eprint={2112.10751},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Owner

Scott Emmons
