PyTorch implementation of Constrained Policy Optimization

Last update: Dec 08, 2022

Overview

PyTorch implementation of Constrained Policy Optimization (CPO)

This repository provides a simple, easy-to-use PyTorch implementation of CPO. A dummy constraint function is included and can be adapted to your needs.
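As an illustration of what adapting the constraint function might look like, here is a minimal sketch. It is not the repository's actual code: the names `dummy_constraint_cost` and `discounted_cost`, the `x_limit` threshold, and the choice of penalizing cart position are all assumptions made for this example. It assumes a CartPole-style observation whose first entry is the cart position.

```python
from typing import Sequence


def dummy_constraint_cost(state: Sequence[float], x_limit: float = 1.0) -> float:
    """Hypothetical per-step cost: 1.0 when the cart leaves [-x_limit, x_limit]."""
    cart_position = state[0]  # CartPole observations start with the cart position
    return 1.0 if abs(cart_position) > x_limit else 0.0


def discounted_cost(costs: Sequence[float], gamma: float = 0.99) -> float:
    """Discounted sum of per-step costs over one trajectory."""
    total = 0.0
    # Accumulate from the last step backwards: G_t = c_t + gamma * G_{t+1}
    for cost in reversed(costs):
        total = cost + gamma * total
    return total
```

CPO constrains the expected discounted cost of the policy to stay below a chosen limit, so a function of this shape is the piece you would typically swap out for your own safety signal.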

Pre-requisites

PyTorch (tested on PyTorch 1.2.0)
OpenAI Gym
MuJoCo (mujoco-py)
If working with a GPU, set OMP_NUM_THREADS to 1 using:

export OMP_NUM_THREADS=1
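If you would rather set the variable from Python than from the shell, a sketch along these lines should have the same effect. The helper name `limit_omp_threads` is hypothetical and not part of this repository; the important assumption is that it runs before `torch` or `numpy` is imported, since the OpenMP runtime reads the variable when it starts up.

```python
import os


def limit_omp_threads(num_threads: int = 1) -> None:
    """Set OMP_NUM_THREADS for this process (call before importing torch)."""
    os.environ["OMP_NUM_THREADS"] = str(num_threads)


limit_omp_threads(1)
# import torch  # import heavy numerical libraries only after the variable is set
```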

Features

TensorBoard integration to track training progress.
The best model is tracked and saved using the value and standard deviation of the average reward.
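The best-model feature could work roughly as sketched below. This is an assumption about the idea, not the repository's implementation: the class name `BestModelTracker` and the scoring rule (mean reward minus its standard deviation) are chosen purely for illustration, and the actual criterion used in the code may differ.

```python
import statistics
from typing import Optional, Sequence


class BestModelTracker:
    """Hypothetical tracker that remembers the best score seen so far."""

    def __init__(self) -> None:
        self.best_score: Optional[float] = None

    def update(self, episode_rewards: Sequence[float]) -> bool:
        """Return True when this iteration's rewards beat the best so far."""
        mean_reward = statistics.mean(episode_rewards)
        std_reward = statistics.pstdev(episode_rewards)
        # Assumed criterion: prefer a high average reward with low spread.
        score = mean_reward - std_reward
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            return True  # the caller would save the model checkpoint here
        return False
```

In a training loop, you would call `update` once per iteration with that iteration's episode rewards and write a checkpoint whenever it returns `True`.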

Usage

python algos/main.py --env-name CartPole-v1 --algo-name=CPO --exp-num=1 --exp-name=CPO/CartPole --save-intermediate-model=10 --gpu-index=0 --max-iter=500

Code Reference

Khrylx/PyTorch-RL

Technical Details on CPO

Owner

Sapana Chaudhary

I am a third-year Ph.D. candidate in the Department of Electrical and Computer Engineering at Texas A&M University.
