Framework for training options with different attention mechanism and using them to solve downstream tasks.

Overview

Using Attention in HRL

Framework for training options with different attention mechanism and using them to solve downstream tasks.

Requirements

GPU required

conda env create -f conda_env.yml

After the instalation ends you can activate your environment and install remaining dependencies. (e.g. sub-module gym_minigrid which is a modified version of MiniGrid )

conda activate affenv
cd gym-minigrid
pip install -e .
cd ../
pip install -e .

Instructions

In order to train options and IC_net follow these steps:

1. Configure desired environment - number of task and objects per task in file config/op_ic_net.yaml. E.g:
  env_args:
    task_size: 3
    num_tasks: 4

2. Configure desired type of attention (between "affordance", "interest", "nan") - in file config/op_ic_net.yaml. E.g. 
main:
  attention: "affordance" 

3. Train by running command
liftoff train_main.py configs/op_ic_net.yaml

Once a pre-trained option checkpoint exists a HRL agent can be trained to solve the downstream task (for the same environment the options were trained on). Follow these steps in order to train an HRL-Agent with different types of attentions:

1. Configure checkpoint (experiment config file and options_model_id) for pre-trained Options and IC_net - in file configs/hrl-agent.yaml. E.g: 

main:
  options_model_cfg: "results/op_aff_4x3/0000_multiobj/0/cfg.yaml"
  options_model_id: -1  # Last checkpoint will be used

2. Configure type of attention for training the HRL-agent (between "affordance", "interest", "nan") - in file configs/hrl-agent.yaml. E.g:
main:
  modulate_policy: affordance

3. Train HRL-agent by running command
liftoff train_mtop_ppo.py configs/hrl-agent.yaml

Both training scrips produce results in the results folder, where all the outputs are going to be stored including train/eval logs, checkpoints. Live plotting is integrated using services from Wandb (plotting has to be enabled in the config file main:plot and user logged in Wandb or user login api key in the file .wandb_key).

The console output is also available in a form:

  • Option Pre-training e.g.:
U 11 | F 022528 | FPS 0024 | D 402 | rR:u, 0.03 | F:u, 41.77 | tL:u 0.00 | tPL:u 6.47 | tNL:u 0.00 | t 52 | aff_loss 0.0570 | aff 2.8628 | NOaff 0.0159 | ic 0.0312 | cnt_ic 1.0000 | oe 2.4464 | oic0 0.0000 | oic1 0.0000 | oic2 0.0000 | oic3 0.0000 | oPic0 0.0000 | oPic1 0.0000 | oPic2 0.0000 | oPic3 0.0000 | icB 0.0208 | PicB 0.1429 | icND 0.0192

Some of the training entries decodes as

F - number of frames (steps in the env)
tL - termination loss
aff_loss - IC_net loss
cnt_ic - Intent completion per training batch 
oicN - Intent completion fraction for each option N out of Total option N sampled
oPicN - Intent completion fraction for each option N out of affordable ones
PicB - Intent completion average over all options out of affordable ones
  • HRL-agent training
U 1 | F 4555192.0 | FPS 21767 | D 209 | rR:u, 0.00 | F:u, 8.11 | e:u, 2.48 | v:u 0.00 | pL:u 0.01 | vL:u 0.00 | g:u 0.01 | TrR:u, 0.00

Some of the training entries decodes as

F - number of frames (steps in the env offseted by the number of pre-training steps)
rR - Accumulated episode reward average
TrR - Average episode success rate

Framework structure

The code is organised as follows:

  • agents/ - implementation of agents (e.g. training options and IC_net multistep_affordance.py; hrl-agent PPO ppo_smdp.py )
  • configs/ - config files for training agents
  • gym-minigrid/ - sub-module - Minigrid envs
  • models/ - Neural network modules (e.g options with IC_net aff_multistep.py and CNN backbone extractor_cnn_v2.py)
  • utils/ - Scripts for e.g.: running envs in parallel, preprocessing observations, gym wrappers, data structures, logging modules
  • train_main.py - Train Options with IC_net
  • train_mtop_ppo.py - Train HRL-agent

Acknowledgements

We used PyTorch as a machine learning framework.

We used liftoff for experiment management.

We used wandb for plotting.

We used PPO adapted for training our agents.

We used MiniGrid to create our environment.

Code for unmixing audio signals in four different stems "drums, bass, vocals, others". The code is adapted from "Jukebox: A Generative Model for Music"

Status: Archive (code is provided as-is, no updates expected) Disclaimer This code is a based on "Jukebox: A Generative Model for Music" Paper We adju

Wadhah Zai El Amri 24 Dec 29, 2022
Code, pre-trained models and saliency results for the paper "Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB Images".

Boosting RGB-D Saliency Detection by Leveraging Unlabeled RGB This repository is the official implementation of the paper. Our results comming soon in

Xiaoqiang Wang 8 May 22, 2022
Code basis for the paper "Camera Condition Monitoring and Readjustment by means of Noise and Blur" (2021)

Camera Condition Monitoring and Readjustment by means of Noise and Blur This repository contains the source code of the paper: Wischow, M., Gallego, G

7 Dec 22, 2022
This is the official pytorch implementation for our ICCV 2021 paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering" on VQA Task

🌈 ERASOR (RA-L'21 with ICRA Option) Official page of "ERASOR: Egocentric Ratio of Pseudo Occupancy-based Dynamic Object Removal for Static 3D Point C

Hyungtae Lim 225 Dec 29, 2022
Code for the KDD 2021 paper 'Filtration Curves for Graph Representation'

Filtration Curves for Graph Representation This repository provides the code from the KDD'21 paper Filtration Curves for Graph Representation. Depende

Machine Learning and Computational Biology Lab 16 Oct 16, 2022
Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Volumetric TSDF Fusion of RGB-D Images in Python This is a lightweight python script that fuses multiple registered color and depth images into a proj

Andy Zeng 845 Jan 03, 2023
Deep Learning for Computer Vision final project

Deep Learning for Computer Vision final project

grassking100 1 Nov 30, 2021
🐦 Opytimizer is a Python library consisting of meta-heuristic optimization techniques.

Opytimizer: A Nature-Inspired Python Optimizer Welcome to Opytimizer. Did you ever reach a bottleneck in your computational experiments? Are you tired

Gustavo Rosa 546 Dec 31, 2022
An efficient and easy-to-use deep learning model compression framework

TinyNeuralNetwork 简体中文 TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework, which contains features like neura

Alibaba 441 Dec 25, 2022
It's like Shape Editor in Maya but works with skeletons (transforms).

Skeleposer What is Skeleposer? Briefly, it's like Shape Editor in Maya, but works with transforms and joints. It can be used to make complex facial ri

Alexander Zagoruyko 1 Nov 11, 2022
Cours d'Algorithmique Appliquée avec Python pour BTS SIO SISR

Course: Introduction to Applied Algorithms with Python (in French) This is the source code of the website for the Applied Algorithms with Python cours

Loic Yvonnet 0 Jan 27, 2022
*ObjDetApp* deploys a pytorch model for object detection

*ObjDetApp* deploys a pytorch model for object detection

Will Chao 1 Dec 26, 2021
A simple, unofficial implementation of MAE using pytorch-lightning

Masked Autoencoders in PyTorch A simple, unofficial implementation of MAE (Masked Autoencoders are Scalable Vision Learners) using pytorch-lightning.

Connor Anderson 20 Dec 03, 2022
DTCN IJCAI - Sequential prediction learning framework and algorithm

DTCN This is the implementation of our paper "Sequential Prediction of Social Me

Bobby 2 Jan 24, 2022
Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks.

Luminous is a framework for testing the performance of Embodied AI (EAI) models in indoor tasks. Generally, we intergrete different kind of functional

28 Jan 08, 2023
K-Nearest Neighbor in Pytorch

Pytorch KNN CUDA 2019/11/02 This repository will no longer be maintained as pytorch supports sort() and kthvalue on tensors. git clone https://github.

Chris Choy 65 Dec 01, 2022
Shōgun

The SHOGUN machine learning toolbox Unified and efficient Machine Learning since 1999. Latest release: Cite Shogun: Develop branch build status: Donat

Shōgun ML 2.9k Jan 04, 2023
Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning

Incremental Cross-Domain Adaptation for Robust Retinopathy Screening via Bayesian Deep Learning Update (September 18th, 2021) A supporting document de

Taimur Hassan 1 Mar 16, 2022
blind SQLIpy sebuah alat injeksi sql yang menggunakan waktu sql untuk mendapatkan sebuah server database.

blind SQLIpy Alat blind SQLIpy ini merupakan alat injeksi sql yang menggunakan metode time based blind sql injection metode tersebut membutuhkan waktu

Galih Anggoro Prasetya 4 Feb 24, 2022
Github for the conference paper GLOD-Gaussian Likelihood OOD detector

FOOD - Fast OOD Detector Pytorch implamentation of the confernce peper FOOD arxiv link. Abstract Deep neural networks (DNNs) perform well at classifyi

17 Jun 19, 2022