PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C)

Last update: Dec 08, 2022

Overview

Asynchronous Advantage Actor-Critic (A3C) in PyTorch

@inproceedings{mnih2016asynchronous,
  title={Asynchronous methods for deep reinforcement learning},
  author={Mnih, Volodymyr and Badia, Adria Puigdomenech and Mirza, Mehdi and Graves, Alex and Lillicrap, Timothy P and Harley, Tim and Silver, David and Kavukcuoglu, Koray},
  booktitle={International Conference on Machine Learning},
  year={2016}}

This repository contains a PyTorch implementation of Asynchronous Advantage Actor-Critic (A3C), based on the original paper cited above and on the PyTorch implementation by Ilya Kostrikov.

When it was published in 2016, A3C was a state-of-the-art deep reinforcement learning method.
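
For readers new to the method, the quantity each A3C worker optimizes can be sketched in a few lines. This is a minimal plain-Python illustration of the n-step return, advantage, and combined loss described in the paper, not the code in this repository. The function names and the coefficients value_coef=0.5 and entropy_coef=0.01 are assumptions chosen for illustration.

```python
def n_step_returns(rewards, bootstrap_value, gamma=0.99):
    """Discounted returns R_t = r_t + gamma * R_{t+1}.

    bootstrap_value is the critic's estimate for the state after the last
    step of the rollout (use 0.0 if the episode ended there).
    """
    returns = []
    running = bootstrap_value
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a3c_loss(log_probs, entropies, values, rewards, bootstrap_value,
             gamma=0.99, value_coef=0.5, entropy_coef=0.01):
    """Scalar loss for one rollout (hypothetical helper, assumed coefficients).

    log_probs: log pi(a_t | s_t) of the actions actually taken
    entropies: policy entropy at each step
    values:    critic estimates V(s_t) at each step
    """
    returns = n_step_returns(rewards, bootstrap_value, gamma)
    # Advantage A_t = R_t - V(s_t): how much better the rollout was than predicted.
    advantages = [ret - val for ret, val in zip(returns, values)]
    # Actor term: raise the log-probability of actions with positive advantage.
    policy_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages))
    # Critic term: squared error between the return and the value estimate.
    value_loss = sum(adv ** 2 for adv in advantages)
    # Entropy bonus discourages premature convergence to a deterministic policy.
    entropy_bonus = sum(entropies)
    return policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus


if __name__ == "__main__":
    rollout_returns = n_step_returns([1.0, 0.0, 1.0], bootstrap_value=0.0, gamma=0.5)
    print(rollout_returns)
```

In a real PyTorch implementation the same arithmetic runs on tensors, with the advantage treated as a constant in the actor term, and each worker applies the resulting gradients to a shared model asynchronously.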

Dependencies

Python 2.7
PyTorch
gym (OpenAI)
universe (OpenAI)
opencv (for env state processing)
visdom (for visualization)

Training

./train_lstm.sh

Test with the trained weights after 169000 updates for PongDeterministic-v3.

./test_lstm.sh 169000

A test result video is available.

Check the loss curves of all threads at http://localhost:8097 (the default visdom server address).

Owner

LEI TAI
