Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Related tags

Deep LearningPASF
Overview

Learning Domain Invariant Representations in Goal-conditioned Block MDPs

Beining Han,   Chongyi Zheng,   Harris Chan,   Keiran Paster,   Michael R. Zhang,   Jimmy Ba

paper

Summary: Deep Reinforcement Learning agents often face unanticipated environmental changes after deployment in the real world. These changes are often spurious and unrelated to the underlying problem, such as background shifts for visual input agents. Unfortunately, deep RL policies are usually sensitive to these changes and fail to act robustly against them. This resembles the problem of domain generalization in supervised learning. In this work, we study this problem for goal-conditioned RL agents. We propose a theoretical framework in the Block MDP setting that characterizes the generalizability of goal-conditioned policies to new environments. Under this framework, we develop a practical method PA-SkewFit (PASF) that enhances domain generalization.

@article{han2021learning,
  title={Learning Domain Invariant Representations in Goal-conditioned Block MDPs},
  author={Han, Beining and Zheng, Chongyi and Chan, Harris and Paster, Keiran and Zhang, Michael and Ba, Jimmy},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

Installation

Our code was adapted from rlkit and was tested on a Ubuntu 20.04 server.

This instruction assumes that you have already installed NVIDIA driver, Anaconda, and MuJoCo.

You'll need to get your own MuJoCo key if you want to use MuJoCo.

1. Create Anaconda environment

Install the included Anaconda environment

$ conda env create -f environment/pasf_env.yml
$ source activate pasf_env
(pasf_env) $ python

2. Download the goals

Download the goals from the following link and put it here: (PASF DIR)/multiworld/envs/mujoco.

$ ls (PASF DIR)/multiworld/envs/mujoco
... goals ... 
  1. (Optional) Speed up with GPU rendering

3. (Optional) Speed-up with GPU rendering

Note: GPU rendering for mujoco-py speeds up training a lot but consumes more GPU memory at the same time.

Check this Issues:

Remember to do this stuff with the mujoco-py package inside of your pasf_env.

Running Experiments

The following command run the PASF experiments for the four tasks: Reach, Door, Push, Pickup, in the learning curve respectively.

$ source activate pasf_env
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_reach_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_door_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_push_lc_exp.bash
(pasf_env) $ bash (PASF DIR)/bash_scripts/pasf_pickup_lc_exp.bash
  • The bash scripts only set equation, equation, and equation with the exact values we used for LC. But you can play with other hyperparameters in python scripts under (PASF DIR)/experiment.

  • Training and evaluation environments are chosen in python scripts for each task. You can find the backgrounds in (PASF DIR)/multiworld/core/background and domains in (PASF DIR)/multiworld/envs/assets/sawyer_xyz.

  • Results are recorded in progress.csv under (PASF DIR)/data/ and variant.json contains configuration for each experiment.

  • We simply set random seeds as 0, 1, 2, etc., and run experiments with 6-9 different seeds for each task.

  • Error and output logs can be found in (PASF DIR)/terminal_log.

Questions

If you have any questions, comments, or suggestions, please reach out to Beining Han ([email protected]) and Chongyi Zheng ([email protected]).

Owner
Chongyi Zheng
Chongyi Zheng
Transfer Learning for Pose Estimation of Illustrated Characters

bizarre-pose-estimator Transfer Learning for Pose Estimation of Illustrated Characters Shuhong Chen *, Matthias Zwicker * WACV2022 [arxiv] [video] [po

Shuhong Chen 142 Dec 28, 2022
[CVPR'21] DeepSurfels: Learning Online Appearance Fusion

DeepSurfels: Learning Online Appearance Fusion Paper | Video | Project Page This is the official implementation of the CVPR 2021 submission DeepSurfel

Online Reconstruction 52 Nov 14, 2022
FOSS Digital Asset Distribution Platform built on Frappe.

Digistore FOSS Digital Assets Marketplace. Distribute digital assets, like a pro. Video Demo Here Features Create, attach and list digital assets (PDF

Mohammad Hussain Nagaria 30 Dec 08, 2022
We present a regularized self-labeling approach to improve the generalization and robustness properties of fine-tuning.

Overview This repository provides the implementation for the paper "Improved Regularization and Robustness for Fine-tuning in Neural Networks", which

NEU-StatsML-Research 21 Sep 08, 2022
[NeurIPS 2021] Introspective Distillation for Robust Question Answering

Introspective Distillation (IntroD) This repository is the Pytorch implementation of our paper "Introspective Distillation for Robust Question Answeri

Yulei Niu 13 Jul 26, 2022
Metrics to evaluate quality and efficacy of synthetic datasets.

An Open Source Project from the Data to AI Lab, at MIT Metrics for Synthetic Data Generation Projects Website: https://sdv.dev Documentation: https://

The Synthetic Data Vault Project 129 Jan 03, 2023
Basics of 2D and 3D Human Pose Estimation.

Human Pose Estimation 101 If you want a slightly more rigorous tutorial and understand the basics of Human Pose Estimation and how the field has evolv

Sudharshan Chandra Babu 293 Dec 14, 2022
Relaxed-machines - explorations in neuro-symbolic differentiable interpreters

Relaxed Machines Explorations in neuro-symbolic differentiable interpreters. Baby steps: inc_stop Libraries JAX Haiku Optax Resources Chapter 3 (∂4: A

Nada Amin 6 Feb 02, 2022
Classifying audio using Wavelet transform and deep learning

Audio Classification using Wavelet Transform and Deep Learning A step-by-step tutorial to classify audio signals using continuous wavelet transform (C

Aditya Dutt 17 Nov 29, 2022
Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (CVAMD)

Is it Time to Replace CNNs with Transformers for Medical Images? Accepted at ICCV-2021: Workshop on Computer Vision for Automated Medical Diagnosis (C

Christos Matsoukas 80 Dec 27, 2022
Second-order Attention Network for Single Image Super-resolution (CVPR-2019)

Second-order Attention Network for Single Image Super-resolution (CVPR-2019) "Second-order Attention Network for Single Image Super-resolution" is pub

516 Dec 28, 2022
Scikit-event-correlation - Event Correlation and Forecasting over High Dimensional Streaming Sensor Data algorithms

scikit-event-correlation Event Correlation and Changing Detection Algorithm Theo

Intellia ICT 5 Oct 30, 2022
An imperfect information game is a type of game with asymmetric information

DecisionHoldem An imperfect information game is a type of game with asymmetric information. Compared with perfect information game, imperfect informat

Decision AI 25 Dec 23, 2022
Code for our NeurIPS 2021 paper: Sparsely Changing Latent States for Prediction and Planning in Partially Observable Domains

GateL0RD This is a lightweight PyTorch implementation of GateL0RD, our RNN presented in "Sparsely Changing Latent States for Prediction and Planning i

Autonomous Learning Group 16 Nov 03, 2022
Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized

VQGAN-CLIP-Docker About Zero-Shot Text-to-Image Generation VQGAN+CLIP Dockerized This is a stripped and minimal dependency repository for running loca

Kevin Costa 73 Sep 11, 2022
Baleen: Robust Multi-Hop Reasoning at Scale via Condensed Retrieval (NeurIPS'21)

Baleen Baleen is a state-of-the-art model for multi-hop reasoning, enabling scalable multi-hop search over massive collections for knowledge-intensive

Stanford Future Data Systems 22 Dec 05, 2022
Code for EMNLP 2021 paper: "Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training"

SCAPT-ABSA Code for EMNLP2021 paper: "Learning Implicit Sentiment in Aspect-based Sentiment Analysis with Supervised Contrastive Pre-Training" Overvie

Zhengyan Li 66 Dec 04, 2022
Code for Paper "Evidential Softmax for Sparse MultimodalDistributions in Deep Generative Models"

Evidential Softmax for Sparse Multimodal Distributions in Deep Generative Models Abstract Many applications of generative models rely on the marginali

Stanford Intelligent Systems Laboratory 9 Jun 06, 2022
Some pre-commit hooks for OpenMMLab projects

pre-commit-hooks Some pre-commit hooks for OpenMMLab projects. Using pre-commit-hooks with pre-commit Add this to your .pre-commit-config.yaml - rep

OpenMMLab 16 Nov 29, 2022
List of awesome things around semantic segmentation 🎉

Awesome Semantic Segmentation List of awesome things around semantic segmentation 🎉 Semantic segmentation is a computer vision task in which we label

Dam Minh Tien 18 Nov 26, 2022