Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Overview

Transfer-Learning-in-Reinforcement-Learning

Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Final Report

Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations

Cite this work

Nathan Beck, Abhiramon Rajasekharan, Hieu Tran, "Transfer Reinforcement Learning for Differing Action Spaces via Q-Network Representations", 2021

Project description

Transfer learning approaches in reinforcement learning aim to assist agents in learning their target domains by leveraging the knowledge learned from other agents that have been trained on similar source domains. For example, recent research focus within this space has been placed on knowledge transfer between tasks that have different transition dynamics and reward functions; however, little focus has been placed on knowledge transfer between tasks that have different action spaces.

In this paper, we approach the task of transfer learning between domains that differ in action spaces. We present a reward shaping method based on source embedding similarity that is applicable to domains with both discrete and continuous action spaces. The efficacy of our approach is evaluated on transfer to restricted action spaces in the Acrobot-v1 and Pendulum-v0 domains (Brockman et al. 2016).

Our presentations

  • Presentation 1 here
  • Google Doc Folder here

Our Google Colab

https://colab.research.google.com/drive/1cQCV9Ko-prpB8sH6FlB4oj781On-ut_w?usp=sharing

Setup

  1. Clone our repository
  2. Install Gym

Using pip:

pip install gym

Or Building from Source

git clone https://github.com/openai/gym
cd gym
pip install -e .

How to run?

Run with python IDE

  1. Open main.py or main_multiple_run.py
  2. Modify env_name and algorithm that you want to run
  3. Modify parameters in transfer_execute function if needed
  4. Log will be printed out to the terminal and the plotting result will be shown on the new windows.

Run with Google Colab

Follow our sample in file Reward_Shaping_TL.ipynb to run your own colab.

Implemented Algorithms in Stable-Baseline3

Name Recurrent Box Discrete MultiDiscrete MultiBinary Multi Processing
A2C ✔️ ✔️ ✔️ ✔️ ✔️
DDPG ✔️
DQN ✔️
HER ✔️ ✔️
PPO ✔️ ✔️ ✔️ ✔️ ✔️
SAC ✔️
TD3 ✔️
QR-DQN1 ✔️
TQC1 ✔️
Maskable PPO1 ✔️ ✔️ ✔️ ✔️

1: Implemented in SB3 Contrib GitHub repository.

Actions gym.spaces:

  • Box: A N-dimensional box that containes every point in the action space.
  • Discrete: A list of possible actions, where each timestep only one of the actions can be used.
  • MultiDiscrete: A list of possible actions, where each timestep only one action of each discrete set can be used.
  • MultiBinary: A list of possible actions, where each timestep any of the actions can be used in any combination.

Refercences

  1. OpenAI Gym repo
  2. OpenAI Gym website
  3. Stable Baselines 3 repo
  4. Robotschool repo
  5. Gyem extension repos - This python package is an extension to OpenAI Gym for auxiliary tasks (multitask learning, transfer learning, inverse reinforcement learning, etc.)
  6. Example code of TL in DL repo
  7. Retro Contest - a transfer learning contest that measures a reinforcement learning algorithm’s ability to generalize from previous experience (hosted by OpenAI) link
  8. Rainbow: Combining Improvements in Deep Reinforcement Learning (repo), (paper)
  9. Experience replay (link)
  10. Solving RL classic control (link)

Related papers

  1. Transfer Learning for Related Reinforcement Learning Tasks via Image-to-Image Translation (paper), (repo)
  2. Deep Transfer Reinforcement Learning for Text Summarization (paper),(repo)
  3. Using Transfer Learning Between Games to Improve Deep Reinforcement Learning Performance and Stability (paper), (poster)
  4. Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics (IJCAI 2020) (paper), (repo)
  5. Using Transfer Learning Between Games to Improve Deep Reinforcement Learning Performance and Stability (paper), (poster)
  6. Deep Reinforcement Learning and Transfer Learning with Flappy Bird (paper), (poster)
  7. Decoupling Dynamics and Reward for Transfer Learning (paper), (repo)
  8. Progressive Neural Networks (paper)
  9. Deep Learning for Video Game Playing (paper)
  10. Disentangled Skill Embeddings for Reinforcement Learning (paper)
  11. Playing Atari with Deep Reinforcement Learning (paper)
  12. Dueling Network Architectures for Deep Reinforcement Learning (paper)
  13. ACTOR-MIMIC DEEP MULTITASK AND TRANSFER REINFORCEMENT LEARNING (paper)
  14. DDPG (link)

Contributors

  1. Nathan Beck [email protected]
  2. Abhiramon Rajasekharan [email protected]
  3. Trung Hieu Tran [email protected]
Owner
Trung Hieu Tran
Research Scientist @Facebook ; former @Apple
Trung Hieu Tran
Pre-training of Graph Augmented Transformers for Medication Recommendation

G-Bert Pre-training of Graph Augmented Transformers for Medication Recommendation Intro G-Bert combined the power of Graph Neural Networks and BERT (B

101 Dec 27, 2022
Self-Supervised Image Denoising via Iterative Data Refinement

Self-Supervised Image Denoising via Iterative Data Refinement Yi Zhang1, Dasong Li1, Ka Lung Law2, Xiaogang Wang1, Hongwei Qin2, Hongsheng Li1 1CUHK-S

Zhang Yi 72 Jan 01, 2023
This code is an unofficial implementation of HiFiSinger.

HiFiSinger This code is an unofficial implementation of HiFiSinger. The algorithm is based on the following papers: Chen, J., Tan, X., Luan, J., Qin,

Heejo You 87 Dec 23, 2022
🔎 Super-scale your images and run experiments with Residual Dense and Adversarial Networks.

Image Super-Resolution (ISR) The goal of this project is to upscale and improve the quality of low resolution images. This project contains Keras impl

idealo 4k Jan 08, 2023
Companion repository to the paper accepted at the 4th ACM SIGSPATIAL International Workshop on Advances in Resilient and Intelligent Cities

Transfer learning approach to bicycle sharing systems station location planning using OpenStreetMap Companion repository to the paper accepted at the

Politechnika Wrocławska - repozytorium dla informatyków 4 Oct 24, 2022
Image Super-Resolution by Neural Texture Transfer

SRNTT: Image Super-Resolution by Neural Texture Transfer Tensorflow implementation of the paper Image Super-Resolution by Neural Texture Transfer acce

Zhifei Zhang 413 Nov 30, 2022
Simple tools for logging and visualizing, loading and training

TNT TNT is a library providing powerful dataloading, logging and visualization utilities for Python. It is closely integrated with PyTorch and is desi

1.5k Jan 02, 2023
Pseudo-mask Matters in Weakly-supervised Semantic Segmentation

Pseudo-mask Matters in Weakly-supervised Semantic Segmentation By Yi Li, Zhanghui Kuang, Liyang Liu, Yimin Chen, Wayne Zhang SenseTime, Tsinghua Unive

33 Oct 14, 2022
This is a simple backtesting framework to help you test your crypto currency trading. It includes a way to download and store historical crypto data and to execute a trading strategy.

You can use this simple crypto backtesting script to ensure your trading strategy is successful Minimal setup required and works well with static TP a

Andrei 154 Sep 12, 2022
Code for Max-Margin Contrastive Learning - AAAI 2022

Max-Margin Contrastive Learning This is a pytorch implementation for the paper Max-Margin Contrastive Learning accepted to AAAI 2022. This repository

Anshul Shah 12 Oct 22, 2022
Official Implementation of "Designing an Encoder for StyleGAN Image Manipulation"

Designing an Encoder for StyleGAN Image Manipulation (SIGGRAPH 2021) Recently, there has been a surge of diverse methods for performing image editing

749 Jan 09, 2023
a short visualisation script for pyvideo data

PyVideo Speakers A CLI that visualises repeat speakers from events listed in https://github.com/pyvideo/data Not terribly efficient, but you know. Ins

Katie McLaughlin 3 Nov 24, 2021
Black box hyperparameter optimization made easy.

BBopt BBopt aims to provide the easiest hyperparameter optimization you'll ever do. Think of BBopt like Keras (back when Theano was still a thing) for

Evan Hubinger 70 Nov 03, 2022
Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord.

numpy2tfrecord Simple helper library to convert a collection of numpy data to tfrecord, and build a tensorflow dataset from the tfrecord. Installation

Ryo Yonetani 2 Jan 16, 2022
Some pvbatch (paraview) scripts for postprocessing OpenFOAM data

pvbatchForFoam Some pvbatch (paraview) scripts for postprocessing OpenFOAM data For every script there is a help message available: pvbatch pv_state_s

Morev Ilya 2 Oct 26, 2022
BMW TechOffice MUNICH 148 Dec 21, 2022
Python package facilitating the use of Bayesian Deep Learning methods with Variational Inference for PyTorch

PyVarInf PyVarInf provides facilities to easily train your PyTorch neural network models using variational inference. Bayesian Deep Learning with Vari

342 Dec 02, 2022
This is an official source code for implementation on Extensive Deep Temporal Point Process

Extensive Deep Temporal Point Process This is an official source code for implementation on Extensive Deep Temporal Point Process, which is composed o

Haitao Lin 8 Aug 15, 2022
tensorrt int8 量化yolov5 4.0 onnx模型

onnx模型转换为 int8 tensorrt引擎

123 Dec 28, 2022
Quantized models with python

quantized-network download .pth files to qmodels/: googlenet : https://download.

adreamxcj 2 Dec 28, 2021