OREO: Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning (NeurIPS 2021)

Last update: Nov 25, 2022


Video demo

We provide a video demo from the confounded Enduro environment (see Figure 8 of the main draft). We also visualize the spatial attention maps from convolutional encoders trained with BC (middle) and OREO (right).

Installation

OREO requires CUDA 10.1 to run.

Install the dependencies:

conda install pytorch torchvision torchaudio cudatoolkit=10.1 -c pytorch
pip install dopamine_rl scikit-learn tqdm kornia dropblock atari-py==0.2.6 gsutil
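
Before downloading the data, it can help to confirm that the packages above are importable. The following is a minimal sketch that uses only the standard library. The import names in REQUIRED_MODULES (for example `dopamine` for dopamine_rl and `atari_py` for atari-py) are assumptions about how these packages are exposed, and `missing_modules` is a hypothetical helper that is not part of this repository.

```python
import importlib.util

# Assumed top-level import names for the packages installed above
# (these names are assumptions, not taken from this repository).
REQUIRED_MODULES = [
    "torch", "torchvision", "dopamine", "sklearn",
    "tqdm", "kornia", "dropblock", "atari_py",
]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that the modules can be located; it does not verify the CUDA version or that a GPU is visible.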

Download the DQN Replay dataset, which provides the expert demonstrations on Atari environments:

mkdir DATAPATH
cp download.sh DATAPATH
cd DATAPATH
sh download.sh

Pre-training

We provide pre-training scripts for beta-VAE (used by CCIL) and VQ-VAE (used by CRLR and OREO). To pre-train on a different dataset, change the --env option.

beta-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_beta_vae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1 --ch_div 4 --lmd 10

VQ-VAE

CUDA_VISIBLE_DEVICES=0,1,2,3 python atari_vqvae.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --seed 1
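
To pre-train on several games, the two commands above can be generated rather than edited by hand. The sketch below only builds the command strings and does not run them. `pretrain_commands` is a hypothetical helper, and the game name "Enduro" is an assumed value for --env; the flags themselves are copied from the commands above.

```python
import shlex


def pretrain_commands(env, datapath="DATAPATH", num_episodes=20, seed=1):
    """Build the beta-VAE and VQ-VAE pre-training commands for one game."""
    common = [
        f"--env={env}", "--datapath", datapath,
        "--num_episodes", str(num_episodes), "--seed", str(seed),
    ]
    beta_vae = ["python", "atari_beta_vae.py", *common, "--ch_div", "4", "--lmd", "10"]
    vqvae = ["python", "atari_vqvae.py", *common]
    return {"beta-VAE": shlex.join(beta_vae), "VQ-VAE": shlex.join(vqvae)}


if __name__ == "__main__":
    # "Enduro" is an assumed --env value used for illustration.
    for game in ["KungFuMaster", "Enduro"]:
        for name, cmd in pretrain_commands(game).items():
            print(f"# {name} on {game}")
            print(cmd)
```

Prefix each printed command with CUDA_VISIBLE_DEVICES as needed for your machine.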

Training BC policy

We provide training scripts for the baselines and OREO. To train on a different dataset, change the --env option and, where used, the --beta_vae_path and --vqvae_path options so that they point to the matching pre-trained model.

Behavioral cloning

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --num_episodes 20 --num_eval_episodes 100

Dropout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --original_dropout --prob 0.5 --num_episodes 20 --num_eval_episodes 100

DropBlock

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --dropblock --prob 0.3 --num_episodes 20 --num_eval_episodes 100

Cutout

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --input_cutout --num_episodes 20 --num_eval_episodes 100

RandomShift

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor.py --env=KungFuMaster --datapath DATAPATH --seed 1 --eval_interval 1000 --random_shift --num_episodes 20 --num_eval_episodes 100
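
The five commands above all call atari_cnn_actor.py and differ only in the regularization flags. The sketch below makes that difference explicit by building each command from a shared base. `BASELINE_FLAGS` and `baseline_command` are hypothetical names introduced for illustration; the flag values are copied from the commands above.

```python
import shlex

# Extra flags per method, copied from the atari_cnn_actor.py commands above.
BASELINE_FLAGS = {
    "Behavioral cloning": [],
    "Dropout": ["--original_dropout", "--prob", "0.5"],
    "DropBlock": ["--dropblock", "--prob", "0.3"],
    "Cutout": ["--input_cutout"],
    "RandomShift": ["--random_shift"],
}


def baseline_command(method, env="KungFuMaster", datapath="DATAPATH", seed=1):
    """Build the atari_cnn_actor.py command for one baseline method."""
    base = [
        "python", "atari_cnn_actor.py", f"--env={env}", "--datapath", datapath,
        "--seed", str(seed), "--eval_interval", "1000",
    ]
    tail = ["--num_episodes", "20", "--num_eval_episodes", "100"]
    return shlex.join(base + BASELINE_FLAGS[method] + tail)


if __name__ == "__main__":
    for method in BASELINE_FLAGS:
        print(f"# {method}")
        print(baseline_command(method))
```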

CCIL (w/o interaction)

CUDA_VISIBLE_DEVICES=0 python atari_beta_vae_actor.py --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --ch_div 4 --beta_vae_path models_beta_vae_coord_conv_chdiv4_actor_lmd10.0/KungFuMaster_s1_epi20_con1_seed1_zdim50_beta4_kltol0_ep1000_beta_vae.pth

CRLR

CUDA_VISIBLE_DEVICES=0 python atari_cnn_actor_crlr.py --fixed_size 15000 --num_sub_iters 10 --eval_interval 10 --save_interval 10 --n_epochs 10 --env=KungFuMaster --datapath DATAPATH --num_episodes 20 --num_eval_episodes 100 --seed 1 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth

OREO

CUDA_VISIBLE_DEVICES=0 python atari_vqvae_oreo.py --env=KungFuMaster --datapath DATAPATH --num_mask 5 --num_episodes 20 --num_eval_episodes 100 --seed 1 --eval_interval 1000 --prob 0.5 --vqvae_path models_vqvae/KungFuMaster_s1_epi20_con1_seed1_ne512_c0.25_ep1000_vqvae.pth
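
For intuition, the OREO paper describes its regularizer as randomly dropping units that share the same VQ-VAE discrete code together, so that whole semantic objects are masked out. The toy sketch below illustrates that idea on a small grid of code indices in plain Python. It is not the implementation in atari_vqvae_oreo.py: `object_aware_mask` and the example grid are made up for illustration, and the sketch ignores feature rescaling and the --num_mask option.

```python
import random


def object_aware_mask(code_map, drop_prob, rng):
    """Return a 0/1 mask in which positions sharing a code are kept or dropped together."""
    codes = sorted({code for row in code_map for code in row})
    # One Bernoulli draw per distinct code, not per spatial position.
    keep = {code: 0 if rng.random() < drop_prob else 1 for code in codes}
    return [[keep[code] for code in row] for row in code_map]


if __name__ == "__main__":
    # Toy 3x4 grid of discrete code indices (made-up values).
    code_map = [
        [7, 7, 2, 2],
        [7, 5, 5, 2],
        [9, 9, 5, 2],
    ]
    mask = object_aware_mask(code_map, drop_prob=0.5, rng=random.Random(0))
    for row in mask:
        print(row)
```

In contrast to standard dropout, which samples each unit independently, the mask here is constant over every region that maps to the same code.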
