Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Last update: Sep 16, 2022

Related tags

Overview

Overcooked-AI

We suppose to apply traditional offline reinforcement learning technique to multi-agent algorithm.
In this repository, we implemented behavior cloning(BC), offline MADDPG, MADDPG+REM (MADDPG w/ REM), MADDPG+BCQ (MADDPG w/ BCQ) with pytorch. Now, BCQ is in ' Working In Progress', and it's not implemented completely.

We collected 0.5M multi-agent offline RL dataset and experimented with each comparison methods. We collected this data with online MADDPG agents, and it includes exploration trajectories using OU noise. The experiments are ran on Asymmetric Advantages on the Overcooked environment.

We are looking forward your contribution!

How to Run

Collect Offline Data

python train_online.py agent=maddpg save_replay_buffer=true

While the agents train with 0.5M steps, the trajectory replay buffer will be dumped in your experiment/{date}/{time}_maddpg_{exp_name}/buffer folder.
Please replace the path in config/data/local.yaml to the experiment by-product directory.

Download Dataset

Or, if you want to use our dataset pre-collected, please enjoy this link.
We provide 0.5M trajectories in Asymmetric Advantages layout.
Please download our dataset in your local computer and replace the path in config/data/local.yaml

Train Offline Models

Behavior Cloning

python train_bc.py agent=bc data=local

Offline MADDPG (Vanilla)

python train_offline.py agent=maddpg data=local

Offline MADDPG (w/ REM)

python train_offline.py agent=rem_maddpg data=local

Offline MADDPG (w/ BCQ) (WIP)

python train_offline.py agent=bcq_maddpg data=local

Result

Graph

Online	Offline (0.5M Data)	Offline (0.25M Data)

Video

Online	BC	Offline /w REM

Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Related tags

Overview

Overcooked-AI

How to Run

Collect Offline Data

Download Dataset

Train Offline Models

Behavior Cloning

Offline MADDPG (Vanilla)

Offline MADDPG (w/ REM)

Offline MADDPG (w/ BCQ) (WIP)

Result

Graph

Video

Acknowledgement

Owner

Baek In-Chang

Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)

PyTorch-Multi-Style-Transfer - Neural Style and MSG-Net

FANet - Real-time Semantic Segmentation with Fast Attention

CLADE - Efficient Semantic Image Synthesis via Class-Adaptive Normalization (TPAMI 2021)

This PyTorch package implements MoEBERT: from BERT to Mixture-of-Experts via Importance-Guided Adaptation (NAACL 2022).

This code is for our paper "VTGAN: Semi-supervised Retinal Image Synthesis and Disease Prediction using Vision Transformers"

Voxel Transformer for 3D object detection

Official Implementation of PCT

HyperCube: Implicit Field Representations of Voxelized 3D Models

Code for our paper Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

Graph Self-Attention Network for Learning Spatial-Temporal Interaction Representation in Autonomous Driving

Source code of CIKM2021 Long Paper "PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling".

An implementation of "Optimal Textures: Fast and Robust Texture Synthesis and Style Transfer through Optimal Transport"

Project page for End-to-end Recovery of Human Shape and Pose

Implementation of the paper "Language-agnostic representation learning of source code from structure and context".

[SIGGRAPH Asia 2021] DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning.

Flower - A Friendly Federated Learning Framework

All public open-source implementations of convnets benchmarks

Official Pytorch Implementation of GraphiT

Codebase for the solution that won first place and was awarded the most human-like agent in the 2021 NeurIPS Competition MineRL BASALT Challenge.