Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

Overview

Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods

This repository is the official implementation of

  • Seohong Park, Jaekyeom Kim, Gunhee Kim. Time Discretization-Invariant Safe Action Repetition for Policy Gradient Methods. In NeurIPS, 2021.

It contains the implementations for SAR, FiGAR-C and base policy gradient algorithms (PPO, TRPO and A2C).

The code is based on Stable Baselines3 (SB3) for PPO and A2C, and Stable Baselines (SB) for TRPO.

Requirements

Run examples

PPO and A2C (based on Stable Baselines3)

To install requirements:

cd sb3
pip install -r requirements.txt
pip install -e .

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train FiGAR-C-PPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.01 --max_t=0.05

Train PPO on InvertedPendulum-v2 with the original δ:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg

Train SAR-A2C on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --algo=a2c_rg --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Action Noise" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=action --anoise_prob=0.05 --anoise_std=3

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "External Force" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_f --anoise_prob=0.05 --anoise_std=300

Train SAR-PPO on InvertedPendulum-v2 with δ = 0.002 and the "Strong External Force (Perceptible)" setting:

python repeat/main.py --env=InvertedPendulum-v2 --algo=ppo_rg --frame_skip=1 --dt=0.002 --max_t=0.05 --max_d=0.5 --anoise_type=ext_fpc --anoise_prob=0.05 --anoise_std=1000

TRPO (based on Stable Baselines)

To install requirements:

cd sb
pip install -r requirements.txt
pip install -e .

Train SAR-TRPO on InvertedPendulum-v2 with δ = 0.01:

python repeat/main.py --env=InvertedPendulum-v2 --frame_skip=1 --dt=0.01 --max_t=0.05 --max_d=0.5

License

This codebase is licensed under the MIT License. See also sb3/LICENSE_SB3 and sb/LICENSE_SB.

Owner
Seohong Park
Seohong Park
A Static Analysis Tool for Detecting Security Vulnerabilities in Python Web Applications

This project is no longer maintained March 2020 Update: Please go see the amazing Pysa tutorial that should get you up to speed finding security vulne

2.1k Dec 25, 2022
Argument Injection in Dragonfly Ruby Gem

CVE-2021-33564 PoC Exploit script for CVE-2021-33564 (Argument Injection in Dragonfly Ruby Gem). Usage Arbitrary File Read python3 poc.py -u https://

Michael Tsai 12 Nov 09, 2022
web指纹识别工具

前言 一直苦于没有用的顺手的web指纹识别工具,学习前辈s7ckTeam的Glass和broken5的WebAliveScan优秀开源程序开发的轻量型web指纹工具。

EASY 966 Dec 26, 2022
This is a Crypto asset tracker that I built to aid my personal journey in cryptocurrencies.

Wallet Tracker This is a Crypto asset tracker that I built to aid my personal journey in cryptocurrencies. build docker build -t wallet-tracker . run

2 Mar 21, 2022
Abusing Microsoft 365 OAuth Authorization Flow for Phishing Attack

O365DevicePhish Microsoft365_devicePhish Abusing Microsoft 365 OAuth Authorization Flow for Phishing Attack This is a simple proof-of-concept script t

Trewis [work] Scotch 4 Sep 23, 2022
Nmap automated port scanner written in Python

port-scanner Nmap automated port scanner written in Python. USE: Clone the module Import the module: from portscanModule import portscanner Use: ports

Brayden Karnes 1 Dec 03, 2021
Natas teaches the basics of serverside web-security.

over-the-wire-natas Natas teaches the basics of serverside web-security. Each level of natas consists of its own website located at http://natasX.nata

Siddhant Chouhan 1 Nov 27, 2021
Fast subdomain scanner, Takes arguments from a Json file ("args.json") and outputs the subdomains.

Fast subdomain scanner, Takes arguments from a Json file ("args.json") and outputs the subdomains. File Structure core/ colors.py db/ wordlist.txt REA

whoami security 4 Jul 02, 2022
MozDef: Mozilla Enterprise Defense Platform

MozDef: Documentation: https://mozdef.readthedocs.org/en/latest/ Give MozDef a Try in AWS: The following button will launch the Mozilla Enterprise Def

Mozilla 2.2k Jan 08, 2023
Exploiting CVE-2021-44228 in Unifi Network Application for remote code execution and more

Log4jUnifi Exploiting CVE-2021-44228 in Unifi Network Application for remote cod

96 Jan 02, 2023
NexScanner is a tool which allows you to scan a website and find the admin login panel and sub-domains

NexScanner NexScanner is a tool which helps you scan a website for sub-domains and also to find login pages in the website like the admin login panel

8 Sep 03, 2022
带回显版本的漏洞利用脚本

CVE-2021-21978 带回显版本的漏洞利用脚本,更简单的方式 0. 漏洞信息 VMware View Planner Web管理界面存在一个上传日志功能文件的入口,没有进行认证且写入的日志文件路径用户可控,通过覆盖上传日志功能文件log_upload_wsgi.py,即可实现RCE 漏洞代码

3ky7in4 24 Nov 09, 2022
the metasploit script(POC) about CVE-2021-36260

CVE-2021-36260-metasploit the metasploit script(POC) about CVE-2021-36260. A command injection vulnerability in the web server of some Hikvision produ

Taroballz 14 Nov 09, 2022
A Python Scanner for log4j

log4j-Scanner scanner for log4j cat web-urls.txt | python3 log4j.py ID.burpcollaborator.net web-urls.txt http://127.0.0.1:8080 https://www.google.c

Ihebski 5 Jun 26, 2022
Fast python tool to test apache path traversal CVE-2021-41773 in a List of url

CVE-2021-41773 Fast python tool to test apache path traversal CVE-2021-41773 in a List of url Usage :- create a live urls file and use the flag "-l" p

Zahir Tariq 12 Nov 09, 2022
PortSwigger Burp Plugin for the Log4j (CVE-2021-44228)

yLog4j This is Y-Sec's @PortSwigger Burp Plugin for the Log4j CVE-2021-44228 vulnerability. The focus of yLog4j is to support mass-scanning of the Log

Y-Security 1 Jan 31, 2022
Bypass 4xx HTTP response status codes.

Forbidden Bypass 4xx HTTP response status codes. To see all the test cases, check the source code - follow the NOTE comments. Script uses multithreadi

Ivan Šincek 165 Dec 28, 2022
Notebooks, slides and dataset of the CorrelAid Machine Learning Winter School

CorrelAid Machine Learning Spring School Welcome to the CorrelAid ML Spring School! In this repository you can find the slides and other files for the

CorrelAid 12 Nov 23, 2022
A simple tool to audit Unix/*BSD/Linux system libraries to find public security vulnerabilities

master_librarian A simple tool to audit Unix/*BSD/Linux system libraries to find public security vulnerabilities. To install requirements: $ sudo pyth

CoolerVoid 167 Dec 19, 2022
Separate handling of protected media in Django, with X-Sendfile support

Django Protected Media Django Protected Media is a Django app that manages media that are considered sensitive in a protected fashion. Not only does t

Cobus Carstens 46 Nov 12, 2022