A clean and robust PyTorch implementation of PPO on continuous action space.

Last update: Dec 16, 2022

PPO-Continuous-Pytorch

I found that existing implementations of PPO on continuous action space are either somewhat complicated or unstable.
This is a clean and robust PyTorch implementation of PPO on continuous action space. Here is the result:

All the experiments are trained with the same hyperparameters.

Dependencies

gym==0.18.3
box2d==2.3.10
numpy==1.21.2
pytorch==1.8.1

How to use my code

Play with trained model

Run 'python main.py --write False --render True --Loadmodel True --ModelIdex 400'.
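The command above passes booleans as the literal strings 'True' and 'False'. With Python's argparse, 'type=bool' would treat any non-empty string (including 'False') as true, so a small converter is usually needed. The following is a minimal sketch of that pattern, not the code in 'main.py': the helper names 'str2bool', 'build_parser' and 'parse_play_args' and all default values are assumptions made for illustration, and only the flag names are taken from the command above.

```python
import argparse


def str2bool(value):
    """Convert a command-line string such as 'True' or 'False' to a bool."""
    if isinstance(value, bool):
        return value
    lowered = value.strip().lower()
    if lowered in ("true", "t", "yes", "y", "1"):
        return True
    if lowered in ("false", "f", "no", "n", "0"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    # Flag names are copied from the command above; the defaults are assumed.
    parser = argparse.ArgumentParser()
    parser.add_argument("--write", type=str2bool, default=True)
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--Loadmodel", type=str2bool, default=False)
    parser.add_argument("--ModelIdex", type=int, default=0)
    return parser


def parse_play_args(argv):
    """Parse a list of command-line tokens (hypothetical helper)."""
    return build_parser().parse_args(argv)
```

Calling 'parse_play_args' with the tokens of the command above yields real booleans for '--write', '--render' and '--Loadmodel' and an integer for '--ModelIdex'.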

Train from scratch

Run 'python main.py'. The default environment is Pendulum-v0.

Change Environment

To train on a different environment, run 'python main.py --EnvIdex 0'.
--EnvIdex can be set to an integer from 0 to 5, where
'--EnvIdex 0' for 'BipedalWalker-v3'
'--EnvIdex 1' for 'BipedalWalkerHardcore-v3'
'--EnvIdex 2' for 'LunarLanderContinuous-v2'
'--EnvIdex 3' for 'Pendulum-v0'
'--EnvIdex 4' for 'Humanoid-v2'
'--EnvIdex 5' for 'HalfCheetah-v2'
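The index-to-environment mapping listed above can be sketched as a simple lookup. This is an illustration under stated assumptions rather than the code in 'main.py': the names 'ENV_NAMES' and 'env_name_from_index' are hypothetical, and only the environment names and their indices come from the list above.

```python
# Environment names in the order given by the --EnvIdex list above.
ENV_NAMES = [
    "BipedalWalker-v3",
    "BipedalWalkerHardcore-v3",
    "LunarLanderContinuous-v2",
    "Pendulum-v0",
    "Humanoid-v2",
    "HalfCheetah-v2",
]


def env_name_from_index(env_index):
    """Return the environment name for an index in 0..5 (hypothetical helper)."""
    if not 0 <= env_index < len(ENV_NAMES):
        raise ValueError(
            f"EnvIdex must be between 0 and {len(ENV_NAMES) - 1}, got {env_index}"
        )
    return ENV_NAMES[env_index]
```

Rejecting out-of-range values explicitly avoids Python's negative indexing silently selecting an environment from the end of the list.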

Visualize the training curve

You can use TensorBoard to visualize the training curve. The training history is saved in the 'runs' folder; for example, run 'tensorboard --logdir runs' from the folder that contains it.

Hyperparameter Setting

For details on the hyperparameter settings, see 'main.py'.

Owner

XinJingHao
