Disagreement-Regularized Imitation Learning

Last update: Apr 28, 2022

Overview

Due to a normalization bug, the expert trajectories perform worse than the experts reported by rl_baseline_zoo. See the following link for where the bug was fixed in the codebase. [link]

Disagreement-Regularized Imitation Learning

Code to train the models described in the paper "Disagreement-Regularized Imitation Learning", by Kianté Brantley, Wen Sun and Mikael Henaff.

Usage:

Install using pip

Install the DRIL package

pip install -e .

Software Dependencies

"stable-baselines", "rl-baselines-zoo", "baselines", "gym", "pytorch", "pybullet"

Data

We provide a Python script to generate expert data from pre-trained models using the "rl-baselines-zoo" repository. Click "Here" to see all of the available pre-trained agents and their respective performance. Replace <name-of-environment> with the name of the pre-trained agent environment you would like to collect expert data for.

python -u generate_demonstration_data.py --seed <seed-number> --env-name <name-of-environment> --rl_baseline_zoo_dir <location-to-top-level-directory>
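Collecting demonstrations for several seeds means repeating the command above with a different --seed each time. The sketch below shows one way to assemble those commands in Python. It uses only the flags shown above; the helper names build_demo_command and build_demo_sweep are hypothetical and are not part of this repository.

```python
import shlex


def build_demo_command(env_name, seed, zoo_dir):
    """Return the argv list for one generate_demonstration_data.py run."""
    return [
        "python", "-u", "generate_demonstration_data.py",
        "--seed", str(seed),
        "--env-name", env_name,
        "--rl_baseline_zoo_dir", zoo_dir,
    ]


def build_demo_sweep(env_name, seeds, zoo_dir):
    """Return one command per seed, in the order the seeds were given."""
    return [build_demo_command(env_name, seed, zoo_dir) for seed in seeds]


if __name__ == "__main__":
    # Placeholder values: substitute your own environment, seeds, and path.
    sweep = build_demo_sweep(
        "<name-of-environment>", [0, 1, 2], "<location-to-top-level-directory>"
    )
    for cmd in sweep:
        # Print a shell-quoted version of each command for inspection.
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) to actually launch the run; the sketch only prints the commands so they can be checked first.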

Training

DRIL requires a pre-trained ensemble model and a pre-trained behavior-cloning model.
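The ensemble is needed because it supplies the disagreement signal. The sketch below illustrates the idea, assuming disagreement is measured as the variance of the probabilities that the ensemble members assign to an action, and that this raw cost is then clipped to -1 or +1 against a threshold. The function names and the threshold argument are hypothetical; this is an illustration, not the code used in this repository.

```python
from statistics import pvariance


def disagreement_cost(member_probs):
    """Variance of the probabilities that ensemble members assign to one action.

    `member_probs` holds one probability per ensemble member for the same
    state-action pair. Identical predictions give a cost of zero.
    """
    return pvariance(member_probs)


def clipped_cost(cost, threshold):
    """Map a raw disagreement cost to -1 (low disagreement) or +1 (high)."""
    return -1.0 if cost <= threshold else 1.0
```

With this kind of cost, states where the ensemble members agree are cheap and states where they disagree are expensive, which is why the ensemble has to be trained before DRIL itself.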

Note that <location-to-rl-baseline-zoo-directory> is the full path to the top-level directory of the rl_baseline_zoo repository.

To train only a behavior-cloning model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --behavior_cloning --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>

To train only an ensemble model run:

python -u main.py --env-name <name-of-environment> --num-trajs <number-of-trajectories> --pretrain_ensemble_only --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number>

To train a DRIL model, run the command below. The script first checks that both the behavior-cloning model and the ensemble model are trained; if they are not, it automatically trains both the ensemble and the behavior-cloning model.

python -u main.py --env-name <name-of-environment> --default_experiment_params <type-of-env> --num-trajs <number-of-trajectories> --rl_baseline_zoo_dir <location-to-rl-baseline-zoo-directory> --seed <seed-number> --dril

--default_experiment_params selects the default parameters we use in the DRIL experiments and has two options: atari and continous-control
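The check-then-train behavior of the DRIL command can be pictured with the sketch below. It assumes each prerequisite model is saved to a checkpoint file and is trained only when that file is missing; the script itself may handle this differently. The names missing_prerequisites, prepare_dril, checkpoints, and trainers are hypothetical and do not come from main.py.

```python
from pathlib import Path


def missing_prerequisites(checkpoints):
    """Return the names of prerequisite models whose checkpoint file is absent.

    `checkpoints` maps a model name to the path where its weights are expected.
    """
    return [name for name, path in checkpoints.items() if not Path(path).is_file()]


def prepare_dril(checkpoints, trainers):
    """Train every missing prerequisite and return the names that were trained.

    `trainers` maps a model name to a hypothetical callable that trains the
    model and saves it to the matching checkpoint path.
    """
    trained = []
    for name in missing_prerequisites(checkpoints):
        trainers[name]()
        trained.append(name)
    return trained
```

Running the behavior-cloning and ensemble commands ahead of time corresponds to the case where nothing is missing, so DRIL training can start immediately.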

Visualization

After training the models, the results are stored in a folder called trained_results. Run the command below to reproduce the plots in our paper. If you change any of the hyperparameters, you will also need to update them in the file naming convention used by the plot script.

python -u plot.py -env <name-of-environment>

Empirical evaluation

Atari

Results on Atari environments.

Continuous Control

Results on continuous control tasks.

Acknowledgement:

We would like to thank Ilya Kostrikov for creating this "repo" that our codebase builds on.

Owner

Kianté Brantley
