Adaptive Attention Span for Reinforcement Learning

Last update: Nov 15, 2022

Overview

Adaptive Transformers in RL

Official implementation of Adaptive Transformers in RL

In this work we replicate several results from Stabilizing Transformers for RL on both Pong and rooms_select_nonmatching_object from DMLab30.

We also extend the Stable Transformer architecture with Adaptive Attention Span in a partially observable (POMDP) reinforcement learning setting. To our knowledge, this is one of the first attempts to stabilize and explore Adaptive Attention Span in an RL domain.

Steps to replicate what we did on your own machine

Installing DMLab:
- Build the DMLab package with Bazel: https://github.com/deepmind/lab/blob/master/docs/users/build.md
- Install the Python module for DMLab: https://github.com/deepmind/lab/tree/master/python/pip_package
Installing Atari:
- Follow Getting Started with Gym: http://gym.openai.com/docs/#getting-started-with-gym
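Once both installs have finished, a quick import check can confirm that the Python modules are visible before starting a long run. The following is a minimal sketch, not part of the repository: the helper name check_module and the PYTHON override are hypothetical, and it assumes the interpreter is available as python3.

```shell
# Interpreter to test against; PYTHON is a hypothetical override, not a repo setting.
PYTHON="${PYTHON:-python3}"

# Print "ok: <module>" if the module imports cleanly, otherwise "missing: <module>"
# and return a non-zero status.
check_module() {
    if "$PYTHON" -c "import $1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

check_module deepmind_lab || true   # module provided by the DMLab pip package
check_module gym || true            # Gym, used for the Atari (Pong) experiments
```

A "missing" line usually means the package was installed into a different Python environment than the one used to launch train.py.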
Execution notes:

- The experiments take around 4 hours on 32 vCPUs and 2 P100 GPUs for 6 million environment interactions. To run without a GPU, use the flag "--disable_cuda".
- For details on the other flags, see the descriptions at the top of train.py.
- All experiments use a slightly revised version of IMPALA from torchbeast.
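The timing quoted in the notes above can be turned into a rough throughput figure and used to budget longer runs. The sketch below only does the arithmetic; it assumes throughput stays constant across environments and step budgets, which is an assumption rather than a measured result, and the helper name estimate_minutes is hypothetical.

```shell
# Figures quoted in the execution notes.
steps=6000000   # environment interactions
hours=4         # approximate wall-clock time on 32 vCPUs and 2 P100 GPUs

# Integer throughput in environment steps per second.
steps_per_sec=$(( steps / (hours * 3600) ))
echo "approx. ${steps_per_sec} steps/s"        # prints: approx. 416 steps/s

# Estimated wall-clock minutes for another step budget, scaled linearly
# from the quoted figures (assumes constant throughput).
estimate_minutes() {
    echo $(( $1 * hours * 60 / steps ))
}

estimate_minutes 20000000   # prints 800, i.e. a little over 13 hours
```

Actual run times will depend on the environment and hardware, so treat the estimate as a planning aid only.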

Snippets

Best performing adaptive attention span model on “rooms_select_nonmatching_object”:

python train.py --total_steps 20000000 \
--learning_rate 0.0001 --unroll_length 299 --num_buffers 40 --n_layer 3 \
--d_inner 1024 --xpid row85 --chunk_size 100 --action_repeat 1 \
--num_actors 32 --num_learner_threads 1 --sleep_length 20 \
--level_name rooms_select_nonmatching_object --use_adaptive \
--attn_span 400 --adapt_span_loss 0.025 --adapt_span_cache

Best performing Stable Transformer on Pong:

python train.py --total_steps 10000000 \
--learning_rate 0.0004 --unroll_length 239 --num_buffers 40 \
--n_layer 3 --d_inner 1024 --xpid row82 --chunk_size 80 \
--action_repeat 1 --num_actors 32 --num_learner_threads 1 \
--sleep_length 5 --atari True

Best performing Stable Transformer on “rooms_select_nonmatching_object”:

python train.py --total_steps 20000000 \
--learning_rate 0.0001 --unroll_length 299 \
--num_buffers 40 --n_layer 3 --d_inner 1024 \
--xpid row79 --chunk_size 100 --action_repeat 1 \
--num_actors 32 --num_learner_threads 1 --sleep_length 20 \
--level_name rooms_select_nonmatching_object --mem_len 200
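
To try any of these commands on a machine without a GPU, the --disable_cuda flag from the execution notes can be appended. The sketch below reuses the Pong flags and prints the resulting command as a dry run instead of executing it. The experiment id row82_cpu is a hypothetical name chosen to keep the logs apart from the GPU run, and this exact flag combination has not been verified against train.py.

```shell
# Flags copied from the Pong command; only --xpid is changed (hypothetical id).
PONG_ARGS="--total_steps 10000000 --learning_rate 0.0004 --unroll_length 239 \
--num_buffers 40 --n_layer 3 --d_inner 1024 --xpid row82_cpu --chunk_size 80 \
--action_repeat 1 --num_actors 32 --num_learner_threads 1 \
--sleep_length 5 --atari True"

# Dry run: print the CPU-only command so it can be inspected before running.
CPU_CMD="python train.py $PONG_ARGS --disable_cuda"
echo "$CPU_CMD"
```

Expect CPU-only training to be considerably slower than the GPU timing quoted above.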

Reference

If you find this repository useful, please cite it as:

@article{kumar2020adaptive,
    title={Adaptive Transformers in RL},
    author={Shakti Kumar and Jerrod Parker and Panteha Naderian},
    year={2020},
    eprint={2004.03761},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
