Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of Offline Reinforcement Learning with Implicit Q-Learning by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for CUDA.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
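
The three commands above differ only in the config file and, for AntMaze, the evaluation flags. As a rough sketch of that mapping, the helper below builds the command line from an environment name. The function build_command and the list of name prefixes are hypothetical and not part of this repository; in particular, treating pen, door, hammer, and relocate as the Adroit tasks and falling back to the MuJoCo config for everything else are assumptions.

```python
# Hypothetical helper (not part of the repository): picks the config file
# and extra flags shown in the example commands, based on the env name.
ADROIT_AND_KITCHEN_PREFIXES = ("kitchen", "pen", "door", "hammer", "relocate")


def build_command(env_name):
    """Return the train_offline.py command as a list of arguments."""
    cmd = ["python", "train_offline.py", f"--env_name={env_name}"]
    if env_name.startswith("antmaze"):
        # AntMaze runs use more evaluation episodes and a longer interval.
        cmd += [
            "--config=configs/antmaze_config.py",
            "--eval_episodes=100",
            "--eval_interval=100000",
        ]
    elif env_name.startswith(ADROIT_AND_KITCHEN_PREFIXES):
        cmd.append("--config=configs/kitchen_config.py")
    else:
        # Assumption: anything else is a locomotion task.
        cmd.append("--config=configs/mujoco_config.py")
    return cmd


if __name__ == "__main__":
    for name in ("halfcheetah-medium-expert-v2",
                 "antmaze-large-play-v0",
                 "pen-human-v0"):
        print(" ".join(build_command(name)))
```

The returned list can be passed directly to subprocess.run to launch a run from a script.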

Misc

The implementation is based on JAXRL.
