Offline Reinforcement Learning with Implicit Q-Learning

This repository contains the official implementation of "Offline Reinforcement Learning with Implicit Q-Learning" by Ilya Kostrikov, Ashvin Nair, and Sergey Levine.

If you use this code for your research, please consider citing the paper:

@article{kostrikov2021iql,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    year={2021},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

How to run the code

Install dependencies

pip install -r requirements.txt

For GPU support, see the JAX installation instructions for CUDA.

Run training

Locomotion

python train_offline.py --env_name=halfcheetah-medium-expert-v2 --config=configs/mujoco_config.py

AntMaze

python train_offline.py --env_name=antmaze-large-play-v0 --config=configs/antmaze_config.py --eval_episodes=100 --eval_interval=100000

Kitchen and Adroit

python train_offline.py --env_name=pen-human-v0 --config=configs/kitchen_config.py
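
The three commands above differ only in the environment name, the config file, and (for AntMaze) two extra evaluation flags. The following is a minimal sketch of a launcher that assembles those command lines. The helper `build_command`, the family labels, and the two dictionaries are hypothetical names introduced here for illustration and are not part of this repository. It uses only the flags and config paths shown in the commands above.

```python
import shlex

# Config file per task family, copied from the commands in this README.
# The family labels themselves are illustrative, not names used by the repo.
CONFIGS = {
    "locomotion": "configs/mujoco_config.py",
    "antmaze": "configs/antmaze_config.py",
    "kitchen_adroit": "configs/kitchen_config.py",
}

# Extra evaluation flags that the README passes for AntMaze only.
EXTRA_FLAGS = {
    "antmaze": ["--eval_episodes=100", "--eval_interval=100000"],
}


def build_command(family, env_name):
    """Return the argv list for one train_offline.py run (hypothetical helper)."""
    if family not in CONFIGS:
        raise ValueError(f"unknown task family: {family!r}")
    argv = [
        "python",
        "train_offline.py",
        f"--env_name={env_name}",
        f"--config={CONFIGS[family]}",
    ]
    # Append the family-specific flags, if any.
    argv += EXTRA_FLAGS.get(family, [])
    return argv


if __name__ == "__main__":
    runs = [
        ("locomotion", "halfcheetah-medium-expert-v2"),
        ("antmaze", "antmaze-large-play-v0"),
        ("kitchen_adroit", "pen-human-v0"),
    ]
    for family, env_name in runs:
        # Print the command; pass the list to subprocess.run to execute it.
        print(shlex.join(build_command(family, env_name)))
```

Run as a script, this prints the three commands listed above. To launch a run rather than print it, pass the returned list to `subprocess.run` from the repository root.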

Misc

The implementation is based on JAXRL.

