PPO-EWMA

[Paper]

This repository contains code for training agents using PPO-EWMA and PPG-EWMA, introduced in the paper "Batch size-invariance for policy optimization" (see Citation below). It is based on the code for Phasic Policy Gradient.
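For readers unfamiliar with the name: EWMA stands for exponentially weighted moving average, and in the paper an EWMA of the policy network's parameters serves as the proximal policy. The snippet below is only a generic sketch of an EWMA update over a flat list of numbers. It is not this repository's implementation, and the function name `ewma_update` and the decay value are hypothetical choices made for illustration.

```python
def ewma_update(avg, new, decay):
    """Blend `new` into the running average `avg`, elementwise.

    Hypothetical helper for illustration only; not part of this repository.
    Each element becomes decay * avg + (1 - decay) * new.
    """
    if len(avg) != len(new):
        raise ValueError("avg and new must have the same length")
    return [decay * a + (1.0 - decay) * x for a, x in zip(avg, new)]


# Feed the same stand-in "parameters" three times, starting from zeros.
averaged = [0.0, 0.0]
for params in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    averaged = ewma_update(averaged, params, decay=0.5)

print(averaged)  # [0.875, 1.75]
```

A larger decay makes the average respond more slowly to new values. Because this sketch starts from zeros, the early averages are biased toward zero.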

Installation

Supported platforms: macOS and Ubuntu, with Python 3.7

Installation using Miniconda:

git clone https://github.com/openai/ppo-ewma.git
conda env update --name ppo-ewma --file ppo-ewma/environment.yml
conda activate ppo-ewma
pip install -e ppo-ewma

Alternatively, install the dependencies from environment.yml manually.

Visualize results

Results are stored in blob storage at https://openaipublic.blob.core.windows.net/rl-batch-size-invariance/, and can be visualized as in the paper using this Colab notebook.

Citation

Please cite using the following BibTeX entry:

@article{hilton2021batch,
  title={Batch size-invariance for policy optimization},
  author={Hilton, Jacob and Cobbe, Karl and Schulman, John},
  journal={arXiv preprint arXiv:2110.00641},
  year={2021}
}

Owner

OpenAI
