RE3: State Entropy Maximization with Random Encoders for Efficient Exploration

Related tags

Deep LearningRE3
Overview

State Entropy Maximization with Random Encoders for Efficient Exploration (RE3) (ICML 2021)

Code for State Entropy Maximization with Random Encoders for Efficient Exploration.

In this repository, we provide code for RE3 algorithm described in the paper linked above. We provide code in three sub-directories: rad_re3 containing code for the combination of RE3 and RAD, dreamer_re3 containing code for the combination of RE3 and Dreamer, and a2c_re3 containing code for the combination of RE3 and A2C.

We also provide raw data(.csv) and code for visualization in the data directory.

If you find this repository useful for your research, please cite:

@inproceedings{seo2021state,
  title={State Entropy Maximization with Random Encoders for Efficient Exploration},
  author={Seo, Younggyo and Chen, Lili and Shin, Jinwoo and Lee, Honglak and Abbeel, Pieter and Lee, Kimin},
  booktitle={International Conference on Machine Learning},
  year={2021}
}

RAD + RE3

Our code is built on top of the DrQ repository.

Installation

You could install all dependencies by following command:

conda env install -f conda_env.yml

You should also install custom version of dm_control to run experiments on Walker Run Sparse and Cheetah Run Sparse. You could do this by following command:

cd ../envs/dm_control
pip install .

Instructions

RAD

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3 use_state_entropy=false

RAD + RE3

python train.py env=hopper_hop batch_size=512 action_repeat=2 logdir=runs_rad_re3

We provide all scripts to reproduce Figure 4 (RAD, RAD + RE3) in scripts directory.

Dreamer + RE3

Our code is built on top of the Dreamer repository.

Installation

You could install all dependencies by following command:

pip3 install --user tensorflow-gpu==2.2.0
pip3 install --user tensorflow_probability
pip3 install --user git+git://github.com/deepmind/dm_control.git
pip3 install --user pandas
pip3 install --user matplotlib

# Install custom dm_control environments for walker_run_sparse / cheetah_run_sparse
cd ../envs/dm_control
pip3 install .

Instructions

Dreamer

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer/12345 --task dmc_pendulum_swingup --precision 32 --beta 0.0 --seed 12345

Dreamer + RE3

python dreamer.py --logdir ./logdir/dmc_pendulum_swingup/dreamer_re3/12345 --task dmc_pendulum_swingup --precision 32 --k 53 --beta 0.1 --seed 12345

We provide all scripts to reproduce Figure 4 (Dreamer, Dreamer + RE3) in scripts directory.

A2C + RE3

Training code can be found in rl-starter-files directory, which is forked from rl-starter-files, which uses a modified A2C implementation from torch-ac. Note that currently there is only support for A2C.

Installation

All of the dependencies are in the requirements.txt file in rl-starter-files. They can be installed manually or with the following command:

pip3 install -r requirements.txt

You will also need to install our cloned version of torch-ac with these commands:

cd torch-ac
pip3 install -e .

Instructions

See instructions in rl-starter-files directory. Example scripts can be found in rl-starter-files/rl-starter-files/run_sent.sh.

Owner
Younggyo Seo
Ph.D Student @ Graduate School of AI, KAIST
Younggyo Seo
Repository for "Toward Practical Monocular Indoor Depth Estimation" (CVPR 2022)

Toward Practical Monocular Indoor Depth Estimation Cho-Ying Wu, Jialiang Wang, Michael Hall, Ulrich Neumann, Shuochen Su [arXiv] [project site] DistDe

Meta Research 122 Dec 13, 2022
Code for the CVPR2021 paper "Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition"

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition This repository contains code for the CVPR2021 paper "Patch-NetV

QVPR 368 Jan 06, 2023
Official and maintained implementation of the paper "OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data" [BMVC 2021].

OSS-Net: Memory Efficient High Resolution Semantic Segmentation of 3D Medical Data Christoph Reich, Tim Prangemeier, Özdemir Cetin & Heinz Koeppl | Pr

Christoph Reich 23 Sep 21, 2022
A simple and lightweight genetic algorithm for optimization of any machine learning model

geneticml This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model. Installation Use pip to ins

Allan Barcelos 8 Aug 10, 2022
A python library for face detection and features extraction based on mediapipe library

FaceAnalyzer A python library for face detection and features extraction based on mediapipe library Introduction FaceAnalyzer is a library based on me

Saifeddine ALOUI 14 Dec 30, 2022
A user-friendly research and development tool built to standardize RL competency assessment for custom agents and environments.

Built with ❤️ by Sam Showalter Contents Overview Installation Dependencies Usage Scripts Standard Execution Environment Development Environment Benchm

SRI-AIC 1 Nov 18, 2021
This is the repository of our article published on MDPI Entropy "Feature Selection for Recommender Systems with Quantum Computing".

Collaborative-driven Quantum Feature Selection This repository was developed by Riccardo Nembrini, PhD student at Politecnico di Milano. See the websi

Quantum Computing Lab @ Politecnico di Milano 10 Apr 21, 2022
DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective.

Microsoft 8.4k Jan 01, 2023
CPF: Learning a Contact Potential Field to Model the Hand-object Interaction

Contact Potential Field This repo contains model, demo, and test codes of our paper: CPF: Learning a Contact Potential Field to Model the Hand-object

Lixin YANG 99 Dec 26, 2022
Official code for the publication "HyFactor: Hydrogen-count labelled graph-based defactorization Autoencoder".

HyFactor Graph-based architectures are becoming increasingly popular as a tool for structure generation. Here, we introduce a novel open-source archit

Laboratoire-de-Chemoinformatique 11 Oct 10, 2022
This is the official code release for the paper Shape and Material Capture at Home

This is the official code release for the paper Shape and Material Capture at Home. The code enables you to reconstruct a 3D mesh and Cook-Torrance BRDF from one or more images captured with a flashl

89 Dec 10, 2022
Projects of Andfun Yangon

AndFunYangon Projects of Andfun Yangon First Commit We can use gsearch.py to sea

Htin Aung Lu 1 Dec 28, 2021
Fuzzing JavaScript Engines with Aspect-preserving Mutation

DIE Repository for "Fuzzing JavaScript Engines with Aspect-preserving Mutation" (in S&P'20). You can check the paper for technical details. Environmen

gts3.org (<a href=[email protected])"> 190 Dec 11, 2022
Local Attention - Flax module for Jax

Local Attention - Flax Autoregressive Local Attention - Flax module for Jax Install $ pip install local-attention-flax Usage from jax import random fr

Phil Wang 16 Jun 16, 2022
Towards Part-Based Understanding of RGB-D Scans

Towards Part-Based Understanding of RGB-D Scans (CVPR 2021) We propose the task of part-based scene understanding of real-world 3D environments: from

26 Nov 23, 2022
Official PyTorch implementation of Segmenter: Transformer for Semantic Segmentation

Segmenter: Transformer for Semantic Segmentation Segmenter: Transformer for Semantic Segmentation by Robin Strudel*, Ricardo Garcia*, Ivan Laptev and

594 Jan 06, 2023
Python implementation of the multistate Bennett acceptance ratio (MBAR)

pymbar Python implementation of the multistate Bennett acceptance ratio (MBAR) method for estimating expectations and free energy differences from equ

Chodera lab // Memorial Sloan Kettering Cancer Center 169 Dec 02, 2022
Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset

SW-CV-ModelZoo Repo for my Tensorflow/Keras CV experiments. Mostly revolving around the Danbooru20xx dataset Framework: TF/Keras 2.7 Training SQLite D

20 Dec 27, 2022
g2o: A General Framework for Graph Optimization

g2o - General Graph Optimization Linux: Windows: g2o is an open-source C++ framework for optimizing graph-based nonlinear error functions. g2o has bee

Rainer Kümmerle 2.5k Dec 30, 2022
FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment

FaceQgen FaceQgen: Semi-Supervised Deep Learning for Face Image Quality Assessment This repository is based on the paper: "FaceQgen: Semi-Supervised D

Javier Hernandez-Ortega 3 Aug 04, 2022