Resilient projection-based consensus actor-critic (RPBCAC) algorithm

Overview

Resilient projection-based consensus actor-critic (RPBCAC) algorithm

We implement the RPBCAC algorithm with nonlinear approximation from [1] and focus on training performance of cooperative agents in the presence of adversaries. We aim to validate the analytical results presented in the paper and prevent adversarial attacks that can arbitrarily hurt cooperative network performance including the one studied in [2]. The repository contains folders whose description is provided below:

  1. agents - contains resilient and adversarial agents
  2. environments - contains a grid world environment for the cooperative navigation task
  3. simulation_results - contains plots that show training performance
  4. training - contains functions for training agents

To train agents, execute main.py.

Multi-agent grid world: cooperative navigation

We train five agents in a grid-world environment. Their original goal is to approach their desired position without colliding with other agents in the network. We design a grid world of dimension (6 x 6) and consider a reward function that penalizes the agents for distance from the target and colliding with other agents.

We compare the cooperative network performance under the RPBCAC algorithm with the trimming parameter H=0 and H=1, which corresponds to the number of adversarial agents that are assumed to be present in the network. We consider four scenarios:

  1. All agents are cooperative. They maximize the team-average expected returns.
  2. One agent is greedy as it maximizes its own expected returns. It shares parameters with other agents but does not apply consensus updates.
  3. One agent is faulty and does not have a well-defined objective. It shares fixed parameter values with other agents.
  4. One agent is strategic; it maximizes its own returns and leads the cooperative agents to minimize their returns. The strategic agent has knowledge of other agents' rewards and updates two critic estimates (one critic is used to improve the adversary's policy and the other to hurt the cooperative agents' performance).

The simulation results below demonstrate very good performance of the RPBCAC with H=1 (right) compared to the non-resilient case with H=0 (left). The performance is measured by the episode returns.

1) All cooperative

2) Three cooperative + one greedy

3) Three cooperative + one faulty

4) Three cooperative + one malicious

The folder with resilient agents contains the RPBCAC agent as well as an agent that applies the method of trimmed means in the consensus updates (RTMCAC).

References

[2] Figura, M., Kosaraju, K. C., and Gupta, V. Adversarial attacks in consensus-based multi-agent reinforcement learning. arXiv preprint arXiv:2103.06967, 2021.

Owner
Martin Figura
Graduate research assistant
Martin Figura
The official code of "SCROLLS: Standardized CompaRison Over Long Language Sequences".

SCROLLS This repository contains the official code of the paper: "SCROLLS: Standardized CompaRison Over Long Language Sequences". Links Official Websi

TAU NLP Group 39 Dec 23, 2022
This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

SummaC: Summary Consistency Detection This repository contains the code for TACL2021 paper: SummaC: Re-Visiting NLI-based Models for Inconsistency Det

Philippe Laban 24 Jan 03, 2023
Learning Time-Critical Responses for Interactive Character Control

Learning Time-Critical Responses for Interactive Character Control Abstract This code implements the paper Learning Time-Critical Responses for Intera

Movement Research Lab 227 Dec 31, 2022
face property detection pytorch

This is the face property train code of project face-detection-project

i am x 2 Oct 18, 2021
Code for KDD'20 "An Efficient Neighborhood-based Interaction Model for Recommendation on Heterogeneous Graph"

Heterogeneous INteract and aggreGatE (GraphHINGE) This is a pytorch implementation of GraphHINGE model. This is the experiment code in the following w

Jinjiarui 69 Nov 24, 2022
Repo for EMNLP 2021 paper "Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression"

beyond-preserved-accuracy Repo for EMNLP 2021 paper "Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression" How to implemen

Kevin Canwen Xu 10 Dec 23, 2022
The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.

SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you

Yu Meng 38 Dec 12, 2022
Machine Unlearning with SISA

Machine Unlearning with SISA Lucas Bourtoule, Varun Chandrasekaran, Christopher Choquette-Choo, Hengrui Jia, Adelin Travers, Baiwu Zhang, David Lie, N

CleverHans Lab 70 Jan 01, 2023
ComputerVision - This repository aims at realized easy network architecture

ComputerVision This repository aims at realized easy network architecture Colori

DongDong 4 Dec 14, 2022
BarcodeRattler - A Raspberry Pi Powered Barcode Reader to load a game on the Mister FPGA using MBC

Barcode Rattler A Raspberry Pi Powered Barcode Reader to load a game on the Mist

Chrissy 29 Oct 31, 2022
Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback

Graph-Refined Convolutional Network for Multimedia Recommendation with Implicit Feedback This is our Pytorch implementation for the paper: Yinwei Wei,

17 Jun 10, 2022
Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation (CVPR 2020)

Super-BPD for Fast Image Segmentation (CVPR 2020) Introduction We propose direction-based super-BPD, an alternative to superpixel, for fast generic im

189 Dec 07, 2022
DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers

DALL-Eval: Probing the Reasoning Skills and Social Biases of Text-to-Image Generative Transformers Authors: Jaemin Cho, Abhay Zala, and Mohit Bansal (

Jaemin Cho 98 Dec 15, 2022
Tensor-based approaches for fMRI classification

tensor-fmri Using tensor-based approaches to classify fMRI data from StarPLUS. Citation If you use any code in this repository, please cite the follow

4 Sep 07, 2022
E-RAFT: Dense Optical Flow from Event Cameras

E-RAFT: Dense Optical Flow from Event Cameras This is the code for the paper E-RAFT: Dense Optical Flow from Event Cameras by Mathias Gehrig, Mario Mi

Robotics and Perception Group 71 Dec 12, 2022
The fastai deep learning library

Welcome to fastai fastai simplifies training fast and accurate neural nets using modern best practices Important: This documentation covers fastai v2,

fast.ai 23.2k Jan 07, 2023
[CVPR 2021] 'Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator'

[CVPR2021] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator Overview This is the entire codebase for the paper

35 Dec 01, 2022
Neural Architecture Search Powered by Swarm Intelligence 🐜

Neural Architecture Search Powered by Swarm Intelligence 🐜 DeepSwarm DeepSwarm is an open-source library which uses Ant Colony Optimization to tackle

288 Oct 28, 2022
The official PyTorch code for NeurIPS 2021 ML4AD Paper, "Does Thermal data make the detection systems more reliable?"

MultiModal-Collaborative (MMC) Learning Framework for integrating RGB and Thermal spectral modalities This is the official code for NeurIPS 2021 Machi

NeurAI 12 Nov 02, 2022
The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information".

The HIST framework for stock trend forecasting The implementation of the paper "HIST: A Graph-based Framework for Stock Trend Forecasting via Mining C

Wentao Xu 110 Dec 27, 2022