Implementation of MA-Trace - a general-purpose multi-agent RL algorithm for cooperative environments.

Off-Policy Correction For Multi-Agent Reinforcement Learning

This repository is the official implementation of Off-Policy Correction For Multi-Agent Reinforcement Learning. It is based on SEED RL, commit 5f07ba2a072c7a562070b5a0b3574b86cd72980f.

Requirements

Our code runs inside a Docker container, so you must first install Docker according to the instructions provided by its authors. The specific requirements for our project are defined in a Dockerfile (docker/Dockerfile.starcraft) and are installed inside the container the first time the run script is executed. Before training, build the base image by running:

./docker_base/marlgrid/docker/build_base.sh

Note that to execute docker commands you may need to use sudo or install Docker in rootless mode.

Training

To train an MA-Trace model, run the following command:

./run_local.sh starcraft vtrace [nb of actors] [configuration]

[nb of actors] specifies the number of workers used for training; it must be a positive integer.
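The constraint on the actor count can be checked before launching anything. The sketch below is not part of the repository: `is_valid_actor_count` is a hypothetical helper, and the command is only printed rather than executed, so the snippet runs without the project checked out.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): succeeds only if the
# argument is a positive integer written with decimal digits.
is_valid_actor_count() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
    esac
    [ "$1" -gt 0 ]                 # reject zero
}

ACTORS=20
if is_valid_actor_count "$ACTORS"; then
    # Dry run: print the training command instead of executing it.
    echo "./run_local.sh starcraft vtrace $ACTORS \"\""
else
    echo "nb of actors must be a positive integer, got: $ACTORS" >&2
fi
```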

The [configuration] specifies the hyperparameters of training.

The most important hyperparameters are:

  • learning_rate: the learning rate
  • entropy_cost: the initial entropy cost
  • target_entropy: the final entropy cost
  • entropy_cost_adjustment_speed: how fast the entropy cost is adjusted towards the final value
  • frames_stacked: the number of stacked frames
  • batch_size: the size of training batches
  • discounting: the discount factor
  • full_state_critic: whether to use the full state as input to the critic network; set to False to use only the agents' observations
  • is_centralized: whether to perform centralized or decentralized training
  • task_name: the name of the SMAC task to train on, see the section below
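As a rough sketch of how a [configuration] string might be assembled, the snippet below joins name=value pairs into a space-separated list of flags. It assumes that hyperparameters are passed as `--name=value`, which is the form this README uses for full_state_critic, frames_stacked, is_centralized and task_name. `build_config` is a hypothetical helper, not part of the repository, and the final command is printed rather than run.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): turns each name=value
# argument into a --name=value flag and joins the flags with single spaces.
build_config() {
    config=""
    for pair in "$@"; do
        # ${config:+...} adds the separating space only after the first flag.
        config="${config:+$config }--$pair"
    done
    printf '%s\n' "$config"
}

CONFIG=$(build_config full_state_critic=False frames_stacked=3)

# Dry run: print the training command instead of executing it.
echo "./run_local.sh starcraft vtrace 1 \"$CONFIG\""
```

Keeping the flags in one quoted string matches the way the run script is invoked in this README, where the whole configuration is a single argument.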

Other, less important parameters can also be configured; they are listed in the files.

The run script reports evaluation metrics during training. They are displayed using tmux; consider checking its navigation controls.

For example, to use default parameters and one actor, run:

./run_local.sh starcraft vtrace 1 ""

To train the algorithms specified in the paper:

  • MA-Trace (obs): ./run_local.sh starcraft vtrace 1 "--full_state_critic=False"
  • MA-Trace (full): ./run_local.sh starcraft vtrace 1 "--full_state_critic=True"
  • DecMa-Trace: ./run_local.sh starcraft vtrace 1 "--is_centralized=False"
  • MA-Trace (obs) with 3 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=False --frames_stacked=3"
  • MA-Trace (full) with 4 stacked observations: ./run_local.sh starcraft vtrace 1 "--full_state_critic=True --frames_stacked=4"

Note that matching the performance presented in the paper requires a higher number of actors, e.g. 20.

StarCraft Multi-Agent Challenge

We evaluate our models on the StarCraft Multi-Agent Challenge benchmark (latest version, i.e. 4.10). The challenge consists of 14 tasks: '2s_vs_1sc', '2s3z', '3s5z', '1c3s5z', '10m_vs_11m', '2c_vs_64zg', 'bane_vs_bane', '5m_vs_6m', '3s_vs_5z', '3s5z_vs_3s6z', '6h_vs_8z', '27m_vs_30m', 'MMM2' and 'corridor'.

To train on a chosen task, e.g. 'MMM2', add --task_name='MMM2' to the configuration, e.g.

./run_local.sh starcraft vtrace 1 "--full_state_critic=False --task_name='MMM2'"
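To cover the whole benchmark, the same command can be generated once per task. The sketch below only prints the commands, since the run script displays its metrics in tmux and is presumably meant to be started one run at a time. `task_command` and the `TASKS` variable are hypothetical names, not part of the repository; the task list and flags are copied from this README.

```shell
#!/bin/sh
# The 14 SMAC tasks listed above, separated by spaces.
TASKS="2s_vs_1sc 2s3z 3s5z 1c3s5z 10m_vs_11m 2c_vs_64zg bane_vs_bane 5m_vs_6m 3s_vs_5z 3s5z_vs_3s6z 6h_vs_8z 27m_vs_30m MMM2 corridor"

# Hypothetical helper (not part of the repository).
# $1 = task name, $2 = number of actors. Prints one MA-Trace (obs) command.
task_command() {
    printf "./run_local.sh starcraft vtrace %s \"--full_state_critic=False --task_name='%s'\"\n" "$2" "$1"
}

# Dry run: list one command per task; copy a line to launch that run.
for task in $TASKS; do
    task_command "$task" 1
done
```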

Results

Our model achieves the following performance on SMAC:

[Figure: results.png]
