Multi-Stage Episodic Control for Strategic Exploration in Text Games

Last update: May 24, 2022

Overview

XTX: eXploit - Then - eXplore

Requirements

First clone this repo using git clone https://github.com/princeton-nlp/XTX.git

Please create two conda environments as follows:

conda env create -f yml_envs/jericho-wt.yml
a. conda activate jericho-wt
b. pip install git+https://github.com/jens321/[email protected]
conda env create -f yml_envs/jericho-no-wt.yml

The first set of commands will create a conda environment called jericho-wt which has added actions to the game grammar for specific games (see games with * in the paper). The second command will create another conda environment called jericho-no-wt which installs an unmodified version of the Jericho library.

Training

All code can be run from the root folder of this project. Please follow the commands below for each specific model:

XTX: sh scripts/run_xtx.sh
XTX (no-mix): sh scripts/run_xtx_no_mix.sh
XTX (uniform): sh scrtips/run_xtx_uniform.sh
XTX ($\lambda$ = 0, 0.5, or 1): sh scripts/run_xtx_ablation.sh
INV DY: sh scripts/run_inv_dy.sh
DRRN: sh scripts/run_drrn.sh

Notes

You can use analysis/sample_env.py for quickly playing around with a sample Jericho environment. Run it using python3 -m analysis.sample_env.
You can use analysis/augment_wt.py for generating the missing action candidates that can be added to the game grammar (games with * in the paper). Run it using python3 -m analysis.augment_wt.
Note that all models should finish within a day or two given 1 gpu and 8 cpus, except for games where Jericho's valid action handicap is slow (e.g. Library, Dragon). Since Jericho's valid action handicap heavily relies on parallelization, increasing the number of cpus also results in good speedups (e.g. 8 -> 16).

Acknowledgements

We used Weights & Biases for experiment tracking and visualizations to develop insights for this paper.

Some of the code borrows from the TDQN repo.

For any questions please contact Jens Tuyls ([email protected]).

Multi-Stage Episodic Control for Strategic Exploration in Text Games

Related tags

Overview

XTX: eXploit - Then - eXplore

Requirements

Training

Notes

Acknowledgements

Owner

Princeton Natural Language Processing

Combine Tacotron2 and Hifi GAN to generate speech from text

[IEEE Transactions on Computational Imaging] Self-Gated Memory Recurrent Network for Efficient Scalable HDR Deghosting

A machine learning library for spiking neural networks. Supports training with both torch and jax pipelines, and deployment to neuromorphic hardware.

PyTorch implementation of the NIPS-17 paper "Poincaré Embeddings for Learning Hierarchical Representations"

Learning to Reconstruct 3D Manhattan Wireframes from a Single Image

Code for NeurIPS 2021 paper "Curriculum Offline Imitation Learning"

CenterPoint 3D Object Detection and Tracking using center points in the bird-eye view.

Benchmark for Answering Existential First Order Queries with Single Free Variable

[ICCV 2021] Released code for Causal Attention for Unbiased Visual Recognition

DrNAS: Dirichlet Neural Architecture Search

Official Code for ICML 2021 paper "Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline"

DrWhy is the collection of tools for eXplainable AI (XAI). It's based on shared principles and simple grammar for exploration, explanation and visualisation of predictive models.

Implementation of MeMOT - Multi-Object Tracking with Memory - in Pytorch

Image super-resolution through deep learning

Magisk module to enable hidden features on Android 12 Developer Preview 1.

C3d-pytorch - Pytorch porting of C3D network, with Sports1M weights

IhoneyBakFileScan Modify - 批量网站备份文件扫描器，增加文件规则，优化内存占用

Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Implementation EfficientDet: Scalable and Efficient Object Detection in PyTorch

Official repository of PanoAVQA: Grounded Audio-Visual Question Answering in 360° Videos (ICCV 2021)