ilpyt: imitation learning library with modular, baseline implementations in Pytorch

Overview

ilpyt

The imitation learning toolbox (ilpyt) contains modular implementations of common deep imitation learning algorithms in PyTorch, with unified infrastructure supporting key imitation learning and reinforcement learning algorithms. You can read more about ilpyt in our white paper.

Documentation is available here.

Table of Contents

Main Features

  • Implementation of baseline imitation learning algorithms: BC, DAgger, AppL, GCL, GAIL.
  • Implementation of baseline reinforcement learning algorithms, for comparison purposes: DQN, A2C, PPO2.
  • Modular, extensible framework for training, evaluating, and testing imitation learning (and reinforcement learning) algorithms.
  • Simple algorithm API which exposes train and test methods, allowing for quick library setup and use (a basic usage of the library requires less than ten lines of code to have a fully functioning train and test pipeline).
  • A modular infrastructure for easy modification and reuse of existing components for novel algorithm implementations.
  • Parallel and serialization modes, allowing for faster, optimized operations or serial operations for debugging.
  • Compatibility with the OpenAI Gym environment interface for access to many existing benchmark learning environments, as well as the flexibility to create custom environments.

Installation

Note: ilpyt has only been tested on Ubuntu 20.04, and with Python 3.8.5.

  1. In order to install ilpyt, there are a few prerequisites required. The following commands will setup all the basics so you can run ilpyt with the OpenAI Gym environments:
# Install system-based packages
apt-get install cmake python3-pip python3-testresources freeglut3-dev xvfb

# Install Wheel
pip3 install --no-cache-dir --no-warn-script-location wheel
  1. Install ilpyt using pip:
pip3 install ilpyt

# Or to install from source:
# pip3 install -e .
  1. (Optional) Run the associated Python tests to confirm the package has installed successfully:
git clone https://github.com/mitre/ilpyt.git
cd ilpyt/

# To run all the tests
# If running headless, prepend the pytest command with `xvfb-run -a -s "-screen 0 1400x900x24 +extension RANDR" --`
pytest tests/

# Example: to run an individual test, like DQN
pytest tests/test_dqn.py 

Getting Started

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. Documentation is available here.

Basic Usage

Various sample Python script(s) of how to run the toolbox can be found within the examples directory. A minimal train and test snippet for an imitation learning algorithm takes less than 10 lines of code in ilpyt. In this basic example, we are training a behavioral cloning algorithm for 10,000 epochs before testing the best policy for 100 episodes.

import ilpyt
from ilpyt.agents.imitation_agent import ImitationAgent
from ilpyt.algos.bc import BC

env = ilpyt.envs.build_env(env_id='LunarLander-v2',  num_env=16)
net = ilpyt.nets.choose_net(env)
agent = ImitationAgent(net=net, lr=0.0001)

algo = BC(agent=agent, env=env)
algo.train(num_epochs=10000, expert_demos='demos/LunarLander-v2/demos.pkl')
algo.test(num_episodes=100)

Code Organization

workflow

At a high-level, the algorithm orchestrates the training and testing of our agent in a particular environment. During these training or testing loops, a runner will execute the agent and environment in a loop to collect (state, action, reward, next state) transitions. The individual components of a transition (e.g., state or action) are typically torch Tensors. The agent can then use this batch of transitions to update its network and move towards an optimal action policy.

Customization

To implement a new algorithm, one simply has to extend the BaseAlgorithm and BaseAgent abstract classes (for even further customization, one can even make custom networks by extending the BaseNetwork interface). Each of these components is modular (see code organization for more details), allowing components to be easily swapped out. (For example, the agent.generator used in the GAIL algorithm can be easily swapped between PPOAgent, DQNAgent, or A2Cagent. In a similar way, new algorithm implementations can utilize existing implemented classes as building blocks, or extend the class interfaces for more customization.)

Adding a custom environment is as simple as extending the OpenAI Gym Environment interface and registering it within your local gym environment registry.

See agents/base_agent.py, algos/base_algo.py, nets/base_net.py for more details.

Supported Algorithms and Environments

The following imitation learning (IL) algorithms are supported:

The following reinforcement learning (RL) algorithms are supported:

The following OpenAI Gym Environments are supported. Environments with:

  • Observation space: Box(x,) and Box(x,y,z)
  • Action space: Discrete(x) and Box(x,)

NOTE: To create your own custom environment, just follow the OpenAI Gym Environment interface. i.e., your environment must implement the following methods (and inherit from the OpenAI Gym Class). More detailed instructions can be found on the OpenAI GitHub repository page on creating custom Gym environments.

Benchmarks

Sample train and test results of the baseline algorithms on some environments:

CartPole-v0 MountainCar-v0 MountainCarContinuous-v0 LunarLander-v2 LunarLanderContinuous-v2
Threshold 200 -110 90 200 200
Expert (Mean/Std) 200.00 / 0.00 -98.71 / 7.83 93.36 / 0.05 268.09 / 21.18 283.83 / 17.70
BC (Mean/Std) 200.00 / 0.00 -100.800 / 13.797 93.353 / 0.113 244.295 / 97.765 285.895 / 14.584
DAgger (Mean/Std) 200.00 / 0.00 -102.36 / 15.38 93.20 / 0.17 230.15 / 122.604 285.85 / 14.61
GAIL (Mean/Std) 200.00 / 0.00 -104.31 / 17.21 79.78 / 6.23 201.88 / 93.82 282.00 / 31.73
GCL 200.00 / 0.00 - - 212.321 / 119.933 255.414 / 76.917
AppL(Mean/Std) 200.00 / 0.00 -108.60 / 22.843 - - -
DQN (Mean/Std) - - - 281.96 / 24.57 -
A2C (Mean/Std) - - 201.26 / 62.52 -
PPO (Mean/Std) - - - 249.72 / 75.05 -

The pre-trained weights for these models can be found in our Model Zoo.

Citation

If you use ilpyt for your work, please cite our white paper:

@misc{ilpyt_2021,
  author = {Vu, Amanda and Tapley, Alex and Bissey, Brett},
  title = {ilpyt: Imitation Learning Research Code Base in PyTorch},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/mitre/ilpyt}},
}
Owner
The MITRE Corporation
Open Source Software from the MITRE Corporation
The MITRE Corporation
Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21.

Final Project for the CS238: Decision Making Under Uncertainty course at Stanford University in Autumn '21. We optimized wind turbine placement in a wind farm, subject to wake effects, using Q-learni

Manasi Sharma 2 Sep 27, 2022
Revealing and Protecting Labels in Distributed Training

Revealing and Protecting Labels in Distributed Training

Google Interns 0 Nov 09, 2022
(CVPR2021) Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain Mingchen Zhuge*, Dehong Gao*, Deng-Ping Fan#, Linbo Jin, Ben Chen, Haoming Zhou, Minghui

248 Dec 04, 2022
Evaluating saliency methods on artificial data with different background types

Evaluating saliency methods on artificial data with different background types This repository contains the relevant code for the MedNeurips 2021 subm

2 Jul 05, 2022
Vector.ai assignment

fabio-tests-nisargatman Low Level Approach: ###Tables: continents: id*, name, population, area, createdAt, updatedAt countries: id*, name, population,

Ravi Pullagurla 1 Nov 09, 2021
PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation

PyTorch implementation of Interpretable Explanations of Black Boxes by Meaningful Perturbation The paper: https://arxiv.org/abs/1704.03296 What makes

Jacob Gildenblat 322 Dec 17, 2022
Implementation of paper "Graph Condensation for Graph Neural Networks"

GCond A PyTorch implementation of paper "Graph Condensation for Graph Neural Networks" Code will be released soon. Stay tuned :) Abstract We propose a

Wei Jin 66 Dec 04, 2022
Lucid library adapted for PyTorch

Lucent PyTorch + Lucid = Lucent The wonderful Lucid library adapted for the wonderful PyTorch! Lucent is not affiliated with Lucid or OpenAI's Clarity

Lim Swee Kiat 520 Dec 26, 2022
use machine learning to recognize gesture on raspberrypi

Raspberrypi_Gesture-Recognition use machine learning to recognize gesture on raspberrypi 說明 利用 tensorflow lite 訓練手部辨識模型 分辨 "剪刀"、"石頭"、"布" 之手勢 再將訓練模型匯入

1 Dec 10, 2021
Text Generation by Learning from Demonstrations

Text Generation by Learning from Demonstrations The README was last updated on March 7, 2021. The repo is based on fairseq (v0.9.?). Paper arXiv Prere

38 Oct 21, 2022
learned_optimization: Training and evaluating learned optimizers in JAX

learned_optimization: Training and evaluating learned optimizers in JAX learned_optimization is a research codebase for training learned optimizers. I

Google 533 Dec 30, 2022
Serverless proxy for Spark cluster

Hydrosphere Mist Hydrosphere Mist is a serverless proxy for Spark cluster. Mist provides a new functional programming framework and deployment model f

hydrosphere.io 317 Dec 01, 2022
A scanpy extension to analyse single-cell TCR and BCR data.

Scirpy: A Scanpy extension for analyzing single-cell immune-cell receptor sequencing data Scirpy is a scalable python-toolkit to analyse T cell recept

ICBI 145 Jan 03, 2023
Diffusion Probabilistic Models for 3D Point Cloud Generation (CVPR 2021)

Diffusion Probabilistic Models for 3D Point Cloud Generation [Paper] [Code] The official code repository for our CVPR 2021 paper "Diffusion Probabilis

Shitong Luo 323 Jan 05, 2023
An Open-Source Package for Information Retrieval.

OpenMatch An Open-Source Package for Information Retrieval. 😃 What's New Top Spot on TREC-COVID Challenge (May 2020, Round2) The twin goals of the ch

THUNLP 439 Dec 27, 2022
potpourri3d - An invigorating blend of 3D geometry tools in Python.

A Python library of various algorithms and utilities for 3D triangle meshes and point clouds. Managed by Nicholas Sharp, with new tools added lazily as needed. Currently, mainly bindings to C++ tools

Nicholas Sharp 295 Jan 05, 2023
PyTorch implementation of paper “Unbiased Scene Graph Generation from Biased Training”

A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiased

Kaihua Tang 824 Jan 03, 2023
Explaining neural decisions contrastively to alternative decisions.

Contrastive Explanations for Model Interpretability This is the repository for the paper "Contrastive Explanations for Model Interpretability", about

AI2 16 Oct 16, 2022
Official Pytorch implementation of RePOSE (ICCV2021)

RePOSE: Iterative Rendering and Refinement for 6D Object Detection (ICCV2021) [Link] Abstract We present RePOSE, a fast iterative refinement method fo

Shun Iwase 68 Nov 15, 2022
Readings for "A Unified View of Relational Deep Learning for Polypharmacy Side Effect, Combination Therapy, and Drug-Drug Interaction Prediction."

Polypharmacy - DDI - Synergy Survey The Survey Paper This repository accompanies our survey paper A Unified View of Relational Deep Learning for Polyp

AstraZeneca 79 Jan 05, 2023