TensorFlow implementation of Deep Reinforcement Learning papers

Last update: Jan 03, 2023

Overview

Deep Reinforcement Learning in TensorFlow

TensorFlow implementation of Deep Reinforcement Learning papers. This implementation contains:

[1] Playing Atari with Deep Reinforcement Learning
[2] Human-Level Control through Deep Reinforcement Learning
[3] Deep Reinforcement Learning with Double Q-learning
[4] Dueling Network Architectures for Deep Reinforcement Learning
[5] Prioritized Experience Replay (in progress)
[6] Deep Exploration via Bootstrapped DQN (in progress)
[7] Asynchronous Methods for Deep Reinforcement Learning (in progress)
[8] Continuous Deep Q-Learning with Model-based Acceleration (in progress)

Requirements

Usage

First, install prerequisites with:

$ pip install -U 'gym[all]' tqdm scipy

Don't forget to also install the latest TensorFlow. Note that you also need to install the dependencies of doom-py, which is required by gym[all].

Train with the DQN model described in [1] without a GPU:

$ python main.py --network_header_type=nips --env_name=Breakout-v0 --use_gpu=False

Train with the DQN model described in [2]:

$ python main.py --network_header_type=nature --env_name=Breakout-v0
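
For orientation, the two header types can be compared by the size of the feature vector their convolutional stacks produce. The sketch below assumes that `nips` and `nature` correspond to the layer configurations described in papers [1] and [2], with an 84x84 input and unpadded convolutions; the layer tuples and helper names are illustrative and are not taken from this repository's code.

```python
def conv_output_size(size, kernel, stride):
    """Spatial size after an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


# (filters, kernel, stride) per convolutional layer, as described in the papers.
# Assumption: the repository's header types follow these configurations.
NIPS_LAYERS = [(16, 8, 4), (32, 4, 2)]                # paper [1]
NATURE_LAYERS = [(32, 8, 4), (64, 4, 2), (64, 3, 1)]  # paper [2]


def flattened_features(layers, input_size=84):
    """Number of features passed to the first fully connected layer."""
    size = input_size
    filters = 0
    for filters, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
    return size * size * filters


if __name__ == "__main__":
    print("nips features:", flattened_features(NIPS_LAYERS))
    print("nature features:", flattened_features(NATURE_LAYERS))
```

Under these assumptions the deeper stack from [2] ends with a larger flattened feature vector than the stack from [1], even though its final feature map is spatially smaller.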

Train with the Double DQN model described in [3]:

$ python main.py --double_q=True --env_name=Breakout-v0
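
As a rough illustration of what `--double_q=True` changes, the sketch below contrasts the standard DQN target with the Double Q-learning target from [3]. It is a minimal pure-Python sketch of the idea in the paper, not this repository's implementation; the function names and the plain-list representation of Q-values are assumptions made for the example.

```python
def dqn_target(reward, next_q_target, gamma=0.99, terminal=False):
    """Standard DQN target: the target network both selects and evaluates."""
    if terminal:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target,
                      gamma=0.99, terminal=False):
    """Double DQN target: the online network selects the action,
    the target network evaluates it."""
    if terminal:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    online = [1.0, 2.0, 0.5]   # hypothetical Q-values from the online network
    target = [3.0, 1.5, 0.2]   # hypothetical Q-values from the target network
    print(dqn_target(1.0, target, gamma=0.5))
    print(double_dqn_target(1.0, online, target, gamma=0.5))
```

In this toy case the standard target uses the target network's largest value, while the Double Q-learning target uses the target network's value for the action the online network prefers, which is the mechanism [3] proposes to reduce overestimation.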

Train with the Dueling network with Double Q-learning described in [4]:

$ python main.py --double_q=True --network_output_type=dueling --env_name=Breakout-v0
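
The dueling output type splits the network into a state-value stream and an advantage stream and then recombines them into Q-values. The sketch below shows the mean-subtracted aggregation described in [4]; whether this repository uses exactly this variant is an assumption, and `dueling_q_values` is a hypothetical helper written only for illustration.

```python
def dueling_q_values(value, advantages):
    """Combine a scalar state value V(s) and per-action advantages A(s, a)
    as Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_advantage = sum(advantages) / len(advantages)
    return [value + a - mean_advantage for a in advantages]


if __name__ == "__main__":
    # Hypothetical stream outputs for a state with three actions.
    print(dueling_q_values(2.0, [1.0, 0.0, -1.0]))
```

Subtracting the mean advantage keeps the decomposition identifiable: adding a constant to every advantage leaves the resulting Q-values unchanged.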

Train with the MLP model described in [4] on the corridor environment (useful for debugging):

$ python main.py --network_header_type=mlp --network_output_type=normal --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=normal --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
$ python main.py --network_header_type=mlp --network_output_type=dueling --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
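
The four corridor commands above differ only in `--network_output_type` and `--double_q`. As a convenience, they can be generated from one shared set of flags; the sketch below only builds the command strings from the flags shown above and does not run anything. The helper name `corridor_commands` is hypothetical and not part of the repository.

```python
import itertools
import shlex

# Flags shared by all four corridor runs, copied from the commands above.
COMMON_FLAGS = {
    "observation_dims": "[16]",
    "env_name": "CorridorSmall-v5",
    "t_learn_start": "0.1",
    "learning_rate_decay_step": "0.1",
    "history_length": "1",
    "n_action_repeat": "1",
    "t_ep_end": "10",
    "display": "True",
    "learning_rate": "0.025",
    "learning_rate_minimum": "0.0025",
}


def corridor_commands():
    """Return the four command lines: {normal, dueling} x {single, double Q}."""
    commands = []
    for output_type, double_q in itertools.product(["normal", "dueling"],
                                                   [False, True]):
        flags = {"network_header_type": "mlp",
                 "network_output_type": output_type}
        if double_q:
            flags["double_q"] = "True"
        flags.update(COMMON_FLAGS)
        args = ["python", "main.py"] + [f"--{k}={v}" for k, v in flags.items()]
        # shlex.quote protects arguments such as --observation_dims=[16].
        commands.append(" ".join(shlex.quote(arg) for arg in args))
    return commands


if __name__ == "__main__":
    for command in corridor_commands():
        print(command)
```

Each string can be pasted into a shell; the quoting produced by `shlex.quote` wraps the whole `--observation_dims=[16]` argument rather than only the value, which is equivalent for the shell.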

Results

Result of Corridor-v5 in [4] for DQN (purple), DDQN (red), Dueling DQN (green), Dueling DDQN (blue).

Result of `Breakout-v0` for DQN without frame-skip (white-blue), DQN with frame-skip (light purple), Dueling DDQN (dark blue).

The hyperparameters and gradient clipping are not implemented exactly as in [4].

References

Author

Taehoon Kim / @carpedm20
