Planning from Pixels in Environments with Combinatorially Hard Search Spaces -- NeurIPS 2021

PPGS: Planning from Pixels in Environments with Combinatorially Hard Search Spaces

Environment Setup

  • We recommend pipenv for creating and managing virtual environments (dependencies for other environment managers can be found in Pipfile)
git clone https://github.com/martius-lab/PPGS
cd PPGS
pipenv install
pipenv shell
  • For simplicity, this codebase is ready for training on two of the three environments (IceSlider and DigitJump). They are part of the puzzlegen package, which we provide here, and can be simply installed with
pip install -e "git+https://github.com/martius-lab/puzzlegen#egg=puzzlegen"
  • Offline datasets can be generated for training and validation. In the case of IceSlider we can use
python -m puzzlegen.extract_trajectories --record-dir /path/to/train_data --env-name ice_slider --start-level 0 --number-levels 1000 --max-steps 20 --n-repeat 20 --random 1
python -m puzzlegen.extract_trajectories --record-dir /path/to/test_data --env-name ice_slider --start-level 1000 --number-levels 1000 --max-steps 20 --n-repeat 5 --random 1
  • Finally, we can add the paths to the extracted datasets in default_params.json as data_params.train_path and data_params.test_path. We should also set the name of the environment for validation in data_params.env_name ("ice_slider" for IceSlider or "digit_jump" for DigitJump).

  • Training and evaluation are performed sequentially by running

python main.py
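
The dataset-path step above can also be scripted. The following is a minimal sketch, assuming the dotted names used in this README (for example data_params.train_path) correspond to nested keys in the JSON file; the helper name set_dataset_paths is hypothetical and not part of the codebase.

```python
import json
from pathlib import Path


def set_dataset_paths(config_file, train_path, test_path, env_name):
    """Write dataset locations and the environment name into a JSON config."""
    if env_name not in ("ice_slider", "digit_jump"):
        raise ValueError(f"unsupported env_name: {env_name!r}")
    config_file = Path(config_file)
    config = json.loads(config_file.read_text())
    # Assumption: data_params.* entries live in a nested "data_params" object.
    data_params = config.setdefault("data_params", {})
    data_params["train_path"] = str(train_path)
    data_params["test_path"] = str(test_path)
    data_params["env_name"] = env_name
    config_file.write_text(json.dumps(config, indent=4))
    return config
```

For example, set_dataset_paths("default_params.json", "/path/to/train_data", "/path/to/test_data", "ice_slider") would update the three entries and leave all other settings untouched.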

Configuration

All settings can be configured by editing default_params.json.

| Param | Default | Info |
| --- | --- | --- |
| optimizer_params.eps | 1e-05 | epsilon for Adam |
| train_params.seed | null | seed for training |
| train_params.epochs | 40 | # of training epochs |
| train_params.batch_size | 128 | batch size for training |
| train_params.save_every_n_epochs | 5 | how often to save models |
| train_params.val_every_n_epochs | 2 | how often to perform validation |
| train_params.lr_dict | - | dictionary of learning rates for each component |
| train_params.loss_weight_dict | - | dictionary of weights for the three loss functions |
| train_params.margin | 0.1 | latent margin epsilon |
| train_params.hinge_params | - | hyperparameters for margin loss |
| train_params.schedule | [] | learning rate schedule |
| model_params.name | 'ppgs' | name of the model to train in ['ppgs', 'latent'] |
| model_params.load_model | true | whether to load saved model if present |
| model_params.filters | [64, 128, 256, 512] | encoder filters |
| model_params.embedding_size | 16 | dimensionality of latent space |
| model_params.normalize | true | whether to normalize embeddings |
| model_params.forward_layers | 3 | layers in MLP forward model for 'latent' world model |
| model_params.forward_units | 256 | units in MLP forward model for 'latent' world model |
| model_params.forward_ln | true | layer normalization in MLP forward model for 'latent' world model |
| model_params.inverse_layers | 1 | layers in MLP inverse model |
| model_params.inverse_units | 32 | units in MLP inverse model |
| model_params.inverse_ln | true | layer normalization in MLP inverse model |
| data_params.train_path | '' | path to training dataset |
| data_params.test_path | '' | path to validation dataset |
| data_params.env_name | 'ice_slider' | name of environment ('ice_slider' for IceSlider, 'digit_jump' for DigitJump) |
| data_params.seq_len | 2 | number of steps for multi-step loss |
| data_params.shuffle | true | whether to shuffle datasets |
| data_params.normalize | true | whether to normalize observations |
| data_params.encode_position | false | enables positional encoding |
| data_params.env_params | {} | params to pass to environment |
| eval_params.evaluate_losses | true | whether to compute evaluation losses |
| eval_params.evaluate_rollouts | true | whether to compute solution rates |
| eval_params.eval_at | [1,3,4] | # of steps to evaluate at |
| eval_params.latent_eval_at | [1,5,10] | K for latent metrics |
| eval_params.seeds | [2000] | starting seed for evaluation levels |
| eval_params.num_levels | 100 | # of evaluation levels |
| eval_params.batch_size | 128 | batch size for latent metrics evaluation |
| eval_params.planner_params.batch_size | 256 | cutoff for graph search |
| eval_params.planner_params.margin | 0.1 | latent margin for reidentification |
| eval_params.planner_params.early_stop | true | whether to stop when goal is found |
| eval_params.planner_params.backtrack | false | enables backtracking algorithm |
| eval_params.planner_params.penalize_visited | false | penalizes visited vertices in graph search |
| eval_params.planner_params.eps | 0 | epsilon for epsilon-greedy action selection (0 means purely greedy) |
| eval_params.planner_params.max_steps | 256 | maximal solution length |
| eval_params.planner_params.replan_horizon | 10 | T_max for full planner |
| eval_params.planner_params.snap | false | snaps new vertices to visited ones |
| working_dir | "results/ppgs" | directory for checkpoints and results |
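
To illustrate what eval_params.planner_params.eps controls, here is a generic sketch of epsilon-greedy selection. It is not the planner shipped with this codebase: the function name epsilon_greedy, the scores argument, and the convention that a higher score is better are all assumptions made for illustration.

```python
import random


def epsilon_greedy(scores, eps=0.0, rng=random):
    """Pick an action index: random with probability eps, else the best score."""
    if not scores:
        raise ValueError("scores must be non-empty")
    # rng.random() lies in [0, 1), so eps = 0 never takes the random branch.
    if rng.random() < eps:
        return rng.randrange(len(scores))
    # Greedy choice; ties resolve to the lowest index.
    return max(range(len(scores)), key=scores.__getitem__)
```

With the default eps of 0 the choice is always greedy, while larger values trade exploitation for random exploration.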
Owner
Autonomous Learning Group