Planning from Pixels in Environments with Combinatorially Hard Search Spaces -- NeurIPS 2021

Related tags

Deep LearningPPGS
Overview

PPGS: Planning from Pixels in Environments with Combinatorially Hard Search Spaces

PPGS Overview

Environment Setup

  • We recommend pipenv for creating and managing virtual environments (dependencies for other environment managers can be found in Pipfile)
git clone https://github.com/martius-lab/PPGS
cd ppgs
pipenv install
pipenv shell
  • For simplicity, this codebase is ready for training on two of the three environments (IceSlider and DigitJump). They are part of the puzzlegen package, which we provide here, and can be simply installed with
pip install -e https://github.com/martius-lab/puzzlegen
  • Offline datasets can be generated for training and validation. In the case of IceSlider we can use
python -m puzzlegen.extract_trajectories --record-dir /path/to/train_data --env-name ice_slider --start-level 0 --number-levels 1000 --max-steps 20 --n-repeat 20 --random 1
python -m puzzlegen.extract_trajectories --record-dir /path/to/test_data --env-name ice_slider --start-level 1000 --number-levels 1000 --max-steps 20 --n-repeat 5 --random 1
  • Finally, we can add the paths to the extracted datasets in default_params.json as data_params.train_path and data_params.test_path. We should also set the name of the environment for validation in data_params.env_name ("ice_slider" for IceSlider or "digit_jump" for DigitJump).

  • Training and evaluation are performed sequentially by running

python main.py

Configuration

All settings can be handled by editing default_config.json.

Param Default Info
optimizer_params.eps 1e-05 epsilon for Adam
train_params.seed null seed for training
train_params.epochs 40 # of training epochs
train_params.batch_size 128 batch size for training
train_params.save_every_n_epochs 5 how often to save models
train_params.val_every_n_epochs 2 how often to perform validation
train_params.lr_dict - dictionary of learning rates for each component
train_params.loss_weight_dict - dictionary of weights for the three loss functions
train_params.margin 0.1 latent margin epsilon
train_params.hinge_params - hyperparameters for margin loss
train_params.schedule [] learning rate schedule
model_params.name 'ppgs' name of the model to train in ['ppgs', 'latent']
model_params.load_model true whether to load saved model if present
model_params.filters [64, 128, 256, 512] encoder filters
model_params.embedding_size 16 dimensionality of latent space
model_params.normalize true whether to normalize embeddings
model_params.forward_layers 3 layers in MLP forward model for 'latent' world model
model_params.forward_units 256 units in MLP forward model for 'latent' world model
model_params.forward_ln true layer normalization in MLP forward model for 'latent' world model
model_params.inverse_layers 1 layers in MLP inverse model
model_params.inverse_units 32 units in MLP inverse model
model_params.inverse_ln true layer normalization in MLP inverse model
data_params.train_path '' path to training dataset
data_params.test_path '' path to validation dataset
data_params.env_name 'ice_slider' name of environment ('ice_slider' for IceSlider, 'digit_jump' for DigitJump
data_params.seq_len 2 number of steps for multi-step loss
data_params.shuffle true whether to shuffle datasets
data_params.normalize true whether to normalize observations
data_params.encode_position false enables positional encoding
data_params.env_params {} params to pass to environment
eval_params.evaluate_losses true whether to compute evaluation losses
eval_params.evaluate_rollouts true whether to compute solution rates
eval_params.eval_at [1,3,4] # of steps to evaluate at
eval_params.latent_eval_at [1,5,10] K for latent metrics
eval_params.seeds [2000] starting seed for evaluation levels
eval_params.num_levels 100 # evaluation levels
eval_params.batch_size 128 batch size for latent metrics evaluation
eval_params.planner_params.batch_size 256 cutoff for graph search
eval_params.planner_params.margin 0.1 latent margin for reidentification
eval_params.planner_params.early_stop true whether to stop when goal is found
eval_params.planner_params.backtrack false enables backtracking algorithm
eval_params.planner_params.penalize_visited false penalizes visited vertices in graph search
eval_params.planner_params.eps 0 enables epsilon greedy action selection
eval_params.planner_params.max_steps 256 maximal solution length
eval_params.planner_params.replan horizon 10 T_max for full planner
eval_params.planner_params.snap false snaps new vertices to visited ones
working_dir "results/ppgs" directory for checkpoints and results
Owner
Autonomous Learning Group
Autonomous Learning Group
This repo contains the implementation of YOLOv2 in Keras with Tensorflow backend.

Easy training on custom dataset. Various backends (MobileNet and SqueezeNet) supported. A YOLO demo to detect raccoon run entirely in brower is accessible at https://git.io/vF7vI (not on Windows).

Huynh Ngoc Anh 1.7k Dec 24, 2022
Automatic Image Background Subtraction

Automatic Image Background Subtraction This repo contains set of scripts for automatic one-shot image background subtraction task using the following

Oleg Sémery 6 Dec 05, 2022
A robotic arm that mimics hand movement through MediaPipe tracking.

La-Z-Arm A robotic arm that mimics hand movement through MediaPipe tracking. Hardware NVidia Jetson Nano Sparkfun Pi Servo Shield Micro Servos Webcam

Alfred 1 Jun 05, 2022
Bayesian Meta-Learning Through Variational Gaussian Processes

vmgp This is the repository of Vivek Myers and Nikhil Sardana for our CS 330 final project, Bayesian Meta-Learning Through Variational Gaussian Proces

Vivek Myers 2 Nov 17, 2022
Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions

EMS-COLS-recourse Initial Code for Low-Cost Algorithmic Recourse for Users With Uncertain Cost Functions Folder structure: data folder contains raw an

Prateek Yadav 1 Nov 25, 2022
Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous Event-Based Data"

A Differentiable Recurrent Surface for Asynchronous Event-Based Data Code for the ECCV2020 paper "A Differentiable Recurrent Surface for Asynchronous

Marco Cannici 21 Oct 05, 2022
Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback

CoSMo.pytorch Official Implementation of CoSMo: Content-Style Modulation for Image Retrieval with Text Feedback, Seungmin Lee*, Dongwan Kim*, Bohyung

Seung Min Lee 54 Dec 08, 2022
Multi-Stage Spatial-Temporal Convolutional Neural Network (MS-GCN)

Multi-Stage Spatial-Temporal Convolutional Neural Network (MS-GCN) This code implements the skeleton-based action segmentation MS-GCN model from Autom

Benjamin Filtjens 8 Nov 29, 2022
Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021)

Tracing Versus Freehand for Evaluating Computer-Generated Drawings (SIGGRAPH 2021) Zeyu Wang, Sherry Qiu, Nicole Feng, Holly Rushmeier, Leonard McMill

Zach Zeyu Wang 23 Dec 09, 2022
Learning-based agent for Google Research Football

TiKick 1.Introduction Learning-based agent for Google Research Football Code accompanying the paper "TiKick: Towards Playing Multi-agent Football Full

Tsinghua AI Research Team for Reinforcement Learning 90 Dec 26, 2022
frida工具的缝合怪

fridaUiTools fridaUiTools是一个界面化整理脚本的工具。新人的练手作品。参考项目ZenTracer,觉得既然可以界面化,那么应该可以把功能做的更加完善一些。跨平台支持:win、mac、linux 功能缝合怪。把一些常用的frida的hook脚本简单统一输出方式后,整合进来。并且

diveking 997 Jan 09, 2023
Weighing Counts: Sequential Crowd Counting by Reinforcement Learning

LibraNet This repository includes the official implementation of LibraNet for crowd counting, presented in our paper: Weighing Counts: Sequential Crow

Hao Lu 18 Nov 05, 2022
a pytorch implementation of auto-punctuation learned character by character

Learning Auto-Punctuation by Reading Engadget Articles Link to Other of my work 🌟 Deep Learning Notes: A collection of my notes going from basic mult

Ge Yang 137 Nov 09, 2022
RefineGNN - Iterative refinement graph neural network for antibody sequence-structure co-design (RefineGNN)

Iterative refinement graph neural network for antibody sequence-structure co-des

Wengong Jin 83 Dec 31, 2022
Paper: Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification

Cross-View Kernel Similarity Metric Learning Using Pairwise Constraints for Person Re-identification T M Feroz Ali, Subhasis Chaudhuri, ICVGIP-20-21

T M Feroz Ali 3 Jun 17, 2022
Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder

Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder Authors: - Eashan Adhikarla - Dan Luo - Dr. Brian D. Davison Abstract Many

Eashan Adhikarla 4 Dec 25, 2022
TensorFlow CNN for fast style transfer

Fast Style Transfer in TensorFlow Add styles from famous paintings to any photo in a fraction of a second! It takes 100ms on a 2015 Titan X to style t

1 Dec 14, 2021
Official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model.

BALLAD This is the official code repository for A Simple Long-Tailed Rocognition Baseline via Vision-Language Model. Requirements Python3 Pytorch(1.7.

peng gao 42 Nov 26, 2022
Indices Matter: Learning to Index for Deep Image Matting

IndexNet Matting This repository includes the official implementation of IndexNet Matting for deep image matting, presented in our paper: Indices Matt

Hao Lu 357 Nov 26, 2022
Unsupervised Representation Learning by Invariance Propagation

Unsupervised Learning by Invariance Propagation This repository is the official implementation of Unsupervised Learning by Invariance Propagation. Pre

FengWang 15 Jul 06, 2022