The Unsupervised Reinforcement Learning Benchmark (URLB)

Overview

The Unsupervised Reinforcement Learning Benchmark (URLB)

URLB provides a set of leading algorithms for unsupervised reinforcement learning where agents first pre-train without access to extrinsic rewards and then are finetuned to downstream tasks.

Requirements

We assume you have access to a GPU that can run CUDA 10.2 and CUDNN 8. Then, the simplest way to install all required dependencies is to create an anaconda environment by running

conda env create -f conda_env.yml

After the instalation ends you can activate your environment with

conda activate urlb

Implemented Agents

Agent Command Implementation Author(s) Paper
ICM agent=icm Denis paper
ProtoRL agent=proto Denis paper
DIAYN agent=diayn Misha paper
APT(ICM) agent=icm_apt Hao, Kimin paper
APT(Ind) agent=ind_apt Hao, Kimin paper
APS agent=aps Hao, Kimin paper
SMM agent=smm Albert paper
RND agent=rnd Kevin paper
Disagreement agent=disagreement Catherine paper

Available Domains

We support the following domains.

Domain Tasks
walker stand, walk, run, flip
quadruped walk, run, stand, jump
jaco reach_top_left, reach_top_right, reach_bottom_left, reach_bottom_right

Domain observation mode

Each domain supports two observation modes: states and pixels.

Model Command
states obs_type=states
pixels obs_type=pixels

Instructions

Pre-training

To run pre-training use the pretrain.py script

python pretrain.py agent=icm domain=walker

or, if you want to train a skill-based agent, like DIAYN, run:

python pretrain.py agent=diayn domain=walker

This script will produce several agent snapshots after training for 100k, 500k, 1M, and 2M frames. The snapshots will be stored under the following directory:

./pretrained_models/<obs_type>/<domain>/<agent>/

For example:

./pretrained_models/states/walker/icm/

Fine-tuning

Once you have pre-trained your method, you can use the saved snapshots to initialize the DDPG agent and fine-tune it on a downstream task. For example, let's say you have pre-trained ICM, you can fine-tune it on walker_run by running the following command:

python finetune.py pretrained_agent=icm task=walker_run snapshot_ts=1000000 obs_type=states

This will load a snapshot stored in ./pretrained_models/states/walker/icm/snapshot_1000000.pt, initialize DDPG with it (both the actor and critic), and start training on walker_run using the extrinsic reward of the task.

For methods that use skills, include the agent, and the reward_free tag to false.

python finetune.py pretrained_agent=smm task=walker_run snapshot_ts=1000000 obs_type=states agent=smm reward_free=false

Monitoring

Logs are stored in the exp_local folder. To launch tensorboard run:

tensorboard --logdir exp_local

The console output is also available in a form:

| train | F: 6000 | S: 3000 | E: 6 | L: 1000 | R: 5.5177 | FPS: 96.7586 | T: 0:00:42

a training entry decodes as

F  : total number of environment frames
S  : total number of agent steps
E  : total number of episodes
R  : episode return
FPS: training throughput (frames per second)
T  : total training time
PyTorch implementation for the Neuro-Symbolic Sudoku Solver leveraging the power of Neural Logic Machines (NLM)

Neuro-Symbolic Sudoku Solver PyTorch implementation for the Neuro-Symbolic Sudoku Solver leveraging the power of Neural Logic Machines (NLM). Please n

Ashutosh Hathidara 60 Dec 10, 2022
Compare outputs between layers written in Tensorflow and layers written in Pytorch

Compare outputs of Wasserstein GANs between TensorFlow vs Pytorch This is our testing module for the implementation of improved WGAN in Pytorch Prereq

Hung Nguyen 72 Dec 20, 2022
A PyTorch Implementation of the Luna: Linear Unified Nested Attention

Unofficial PyTorch implementation of Luna: Linear Unified Nested Attention The quadratic computational and memory complexities of the Transformer’s at

Soohwan Kim 32 Nov 07, 2022
[IROS'21] SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning

SurRoL IROS 2021 SurRoL: An Open-source Reinforcement Learning Centered and dVRK Compatible Platform for Surgical Robot Learning Features dVRK compati

<a href=[email protected]"> 55 Jan 03, 2023
🐸STT integration examples

🐸 STT 0.9.x Examples These are various examples on how to use or integrate 🐸 STT using our packages. It is a good way to just try out 🐸 STT before

coqui 92 Dec 19, 2022
Teaches a student network from the knowledge obtained via training of a larger teacher network

Distilling-the-knowledge-in-neural-network Teaches a student network from the knowledge obtained via training of a larger teacher network This is an i

Abhishek Sinha 146 Dec 11, 2022
The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

The NEOSSat is a dual-mission microsatellite designed to detect potentially hazardous Earth-orbit-crossing asteroids and track objects that reside in deep space

John Salib 2 Jan 30, 2022
Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker

Example Of Fine-Tuning BERT For Named-Entity Recognition Task And Preparing For Cloud Deployment Using Flask, React, And Docker This repository contai

Nikita 12 Dec 14, 2022
Research on Event Accumulator Settings for Event-Based SLAM

Research on Event Accumulator Settings for Event-Based SLAM This is the source code for paper "Research on Event Accumulator Settings for Event-Based

Robin Shaun 26 Dec 21, 2022
TransReID: Transformer-based Object Re-Identification

TransReID: Transformer-based Object Re-Identification [arxiv] The official repository for TransReID: Transformer-based Object Re-Identification achiev

569 Dec 30, 2022
Visualization toolkit for neural networks in PyTorch! Demo -->

FlashTorch A Python visualization toolkit, built with PyTorch, for neural networks in PyTorch. Neural networks are often described as "black box". The

Misa Ogura 692 Dec 29, 2022
MLJetReconstruction - using machine learning to reconstruct jets for CMS

MLJetReconstruction - using machine learning to reconstruct jets for CMS The C++ data extraction code used here was based heavily on that foundv here.

ALPhA Davidson 0 Nov 17, 2021
Multiple custom object count and detection using YOLOv3-Tiny method

Electronic-Component-YOLOv3 Introduce This project created to detect, count, and recognize multiple custom object using YOLOv3-Tiny method. The target

Derwin Mahardika 2 Nov 14, 2022
The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022

DG-TrajGen The official repository for paper ''Domain Generalization for Vision-based Driving Trajectory Generation'' submitted to ICRA 2022. Our Meth

Wang 25 Sep 26, 2022
MultiTaskLearning - Multi Task Learning for 3D segmentation

Multi Task Learning for 3D segmentation Perception stack of an Autonomous Drivin

2 Sep 22, 2022
An open-source, low-cost, image-based weed detection device for fallow scenarios.

Welcome to the OpenWeedLocator (OWL) project, an opensource hardware and software green-on-brown weed detector that uses entirely off-the-shelf compon

Guy Coleman 145 Jan 05, 2023
Detection of PCBA defect

Detection_of_PCBA_defect Detection_of_PCBA_defect Use yolov5 to train. $pip install -r requirements.txt Detect.py will detect file(jpg,mp4...) in cu

6 Nov 28, 2022
A Research-oriented Federated Learning Library and Benchmark Platform for Graph Neural Networks. Accepted to ICLR'2021 - DPML and MLSys'21 - GNNSys workshops.

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks A Research-oriented Federated Learning Library and Benchmark Platform

FedML-AI 175 Dec 01, 2022
This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》

CoraNet This is the 3D Implementation of 《Inconsistency-aware Uncertainty Estimation for Semi-supervised Medical Image Segmentation》 Environment pytor

25 Nov 08, 2022
a baseline to practice

ccks2021_track3_baseline a baseline to practice 路径可能会有问题,自己改改 torch==1.7.1 pyhton==3.7.1 transformers==4.7.0 cuda==11.0 this is a baseline, you can fi

45 Nov 23, 2022