MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework

Last update: Dec 24, 2022

Overview

Applied Reinforcement Learning with Python

MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework that addresses real-world decision problems. Our vision is to cover the complete development life cycle of RL applications, ranging from simulation engineering to agent development, training and deployment.

This is a preliminary, non-stable release of Maze. It is not yet complete and not all of our interfaces have settled yet. Hence, there might be some breaking changes on our way towards the first stable release.

Spotlight Features

Below we list a few selected Maze features.

Design and visualize your policy and value networks with the Perception Module. It is based on PyTorch and provides a large variety of neural network building blocks and model styles. Quickly compose powerful representation learners from building blocks such as dense, convolution, graph convolution and attention, recurrent architectures, action and observation masking, self-attention etc.
Create the conditions for efficient RL training without writing boilerplate code, e.g. by relying on built-in support for best practices like pre-processing and normalizing your observations.
Maze supports advanced environment structures reflecting the requirements of real-world industrial decision problems such as multi-step and multi-agent scenarios. You can of course work with existing Gym-compatible environments.
Use the provided Maze trainers (A2C, PPO, Impala, SAC, Evolution Strategies), which support dictionary action and observation spaces as well as multi-step training (auto-regressive policies). Or stick to your favorite tools and trainers by combining Maze with other RL frameworks.
Out-of-the-box support for advanced training workflows such as imitation learning from teacher policies and policy fine-tuning.
Keep even complex application and experiment configuration manageable with the Hydra Config System.
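
To make the observation normalization mentioned above more concrete, here is a minimal sketch in plain Python. It does not use the Maze API: the function names `estimate_statistics` and `normalize`, the observation keys and the sample values are all assumptions chosen for illustration. It only shows the general idea of estimating per-key statistics from sampled dictionary observations and then standardizing new observations with them.

```python
from typing import Dict, List

# A dictionary observation: one list of floats per observation key.
Observation = Dict[str, List[float]]


def estimate_statistics(observations: List[Observation]) -> Dict[str, Dict[str, float]]:
    """Estimate a mean and standard deviation per observation key."""
    stats = {}
    for key in observations[0]:
        values = [v for obs in observations for v in obs[key]]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        # Fall back to 1.0 for constant observations to avoid dividing by zero.
        std = variance ** 0.5 or 1.0
        stats[key] = {"mean": mean, "std": std}
    return stats


def normalize(obs: Observation, stats: Dict[str, Dict[str, float]]) -> Observation:
    """Standardize every value with the statistics of its observation key."""
    return {
        key: [(v - stats[key]["mean"]) / stats[key]["std"] for v in values]
        for key, values in obs.items()
    }


# Hypothetical observations sampled from an environment.
sampled = [
    {"position": [0.0, 10.0], "velocity": [-1.0, 1.0]},
    {"position": [20.0, 30.0], "velocity": [-3.0, 3.0]},
]
stats = estimate_statistics(sampled)
print(normalize(sampled[0], stats))
```

In a real project the statistics would typically be estimated once from a larger sample of observations and then reused during training and deployment.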

Get Started

Make sure PyTorch is installed and then get the latest released version of Maze as follows:
```
pip install -U maze-rl

# optionally install RLlib if you want to use it in combination with Maze
# (the quotes keep shells such as zsh from interpreting the square brackets)
pip install "ray[rllib]" tensorflow
```
Read more about other options like the installation of the latest development version.

⚡ We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D cannot easily be installed in newer Python environments. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually.
To see Maze in action, check out a first example.
For a more applied introduction, visit the step-by-step tutorial.

Documentation

Learn more about Maze

The documentation is the starting point to learn more about the underlying concepts, but most importantly also provides code snippets and minimum working examples to get you started quickly.

The Workflow section guides you through typical tasks in an RL project.
Policy and Value Networks introduces you to the Perception Module, shows how to customize action spaces and the underlying action probability distributions, and presents two styles of policy and value network construction:
- Template models are composed directly from an environment's observation and action space, allowing you to train with suitable agent networks on a new environment within minutes.
- Custom models give you the full flexibility of application-specific models, either with the provided Maze building blocks or directly with PyTorch.
Learn more about core concepts and structures such as the Maze environment hierarchy and the Maze event system, which provides a convenient way to collect statistics and KPIs, enables flexible reward formulation and supports offline analysis.
Structured Environments and Action Masking introduces you to a general concept that can greatly improve the performance of the trained agents in practical RL problems.
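
As a rough illustration of the action masking idea mentioned above, the following sketch computes action probabilities from policy logits while forcing invalid actions to probability zero. It is a generic, framework-independent example and not the way Maze implements masking; the function name `masked_action_probabilities`, the logits and the mask are assumptions made up for this example.

```python
import math
from typing import List


def masked_action_probabilities(logits: List[float], mask: List[bool]) -> List[float]:
    """Softmax over the logits of valid actions; invalid actions get probability 0."""
    if not any(mask):
        raise ValueError("At least one action must be valid.")
    # Subtract the largest valid logit for numerical stability.
    max_logit = max(logit for logit, valid in zip(logits, mask) if valid)
    exps = [
        math.exp(logit - max_logit) if valid else 0.0
        for logit, valid in zip(logits, mask)
    ]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical policy output for four actions; the second action is currently invalid.
logits = [2.0, 0.5, 1.0, -1.0]
mask = [True, False, True, True]
probs = masked_action_probabilities(logits, mask)
print(probs)
```

Because the agent can never sample a masked action, it does not have to learn from experience which actions are infeasible in a given state, which is one reason masking tends to help in practical problems.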

License

Maze is freely available for research and non-commercial use. A commercial license is available. If you are interested, please contact us via our company website or write us an email.

We believe in Open Source principles and aim at transitioning Maze to a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.

Comments

Configuration problems in the step-by-step tutorial
I've just been trying out maze and tried out the step-by-step tutorial.

In Step 5 (5. Training the MazeEnv) the instructions are incomplete or wrong.

I was able to get it running in the end, but it took (us) quite some time. I'm not sure if this is a bug in maze or hydra, or if some newer version of either library just changes the behavior a little bit. But you should update the documentation such that it works out of the box for new users of the library.

The setup (under Ubuntu 20.04):

>> mkdir maze5 && cd maze5
>> pyenv local 3.8.8
>> python -m venv .venv
>> source .venv/bin/activate
>> pip install maze-rl torch
>> pip list
Package                 Version
----------------------- -----------
hydra-core              1.1.0
hydra-nevergrad-sweeper 1.1.5
maze-rl                 0.1.7
torch                   1.9.0
...

Then I just copy-pasted the files from the https://github.com/enlite-ai/maze-examples/tree/main/tutorial_maze_env/part03_maze_env repo and adjusted the _target_ paths in the config yamls (e.g. from _target_: tutorial_maze_env.part03_maze_env.env.maze_env.maze_env_factory to _target_: env.maze_env.maze_env_factory).

Problem 1:

When you run the suggested training command, Hydra will just complain that it can't find the configuration files.

>> maze-run -cn conf_train env=tutorial_cutting_2d_basic wrappers=tutorial_cutting_2d_basic \
   model=tutorial_cutting_2d_basic algorithm=ppo
In 'conf_train': Could not find 'model/tutorial_cutting_2d_basic'
Available options in 'model':
    flatten_concat
    flatten_concat_shared_embedding
    pixel_obs
    pixel_obs_rnn
    rllib
    vector_obs
    vector_obs_rnn
Config search path:
    provider=hydra, path=pkg://hydra.conf
    provider=main, path=pkg://maze.conf
    provider=schema, path=structured://

Fix:

You can just define the config directory for hydra with maze-run -cd conf -cn conf_train .... Then Hydra will find the 3 config files and load them correctly.

Problem 2:

After loading the config files, hydra tries to load the modules defined in the _target_ fields. And that fails immediately with:

...
  File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 104, in _resolve_target
    return _locate(target)
  File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/utils.py", line 563, in _locate
    raise ImportError(f"Error loading module '{path}'") from e
ImportError: Error loading module 'env.maze_env.maze_env_factory'

Fix:

For some reason Hydra doesn't know the path to the directory from where we call maze-run, and therefore it doesn't find the env directory containing the maze_env file.

This is fixable by just setting the environment variable: export PYTHONPATH="$PYTHONPATH:$PWD/".
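
A likely explanation is that a `_target_` is resolved through Python's normal import machinery, so the project directory has to be on the module search path, and a console script such as maze-run does not necessarily add the current directory to it. The sketch below reproduces that effect with the standard library only. The package name `demo_env_pkg` and its contents are hypothetical stand-ins for the tutorial's `env` directory, and appending to `sys.path` plays the role of the `PYTHONPATH` export above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway project that mimics the tutorial layout:
# <project_dir>/demo_env_pkg/maze_env.py (hypothetical names).
project_dir = Path(tempfile.mkdtemp())
package_dir = project_dir / "demo_env_pkg"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "maze_env.py").write_text(
    "def maze_env_factory():\n    return 'environment created'\n"
)


def can_import(module_name: str) -> bool:
    """Return True if the module can be found on the current search path."""
    importlib.invalidate_caches()
    try:
        importlib.import_module(module_name)
        return True
    except ImportError:
        return False


# The project directory is not on sys.path yet, so this import is expected to fail.
before = can_import("demo_env_pkg.maze_env")

# Equivalent in spirit to extending PYTHONPATH with the project directory.
sys.path.append(str(project_dir))
after = can_import("demo_env_pkg.maze_env")

print("importable before:", before, "| importable after:", after)
```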
bug documentation
opened by jakobkogler 2
Hello from Hydra :)

Thanks for using Hydra! I see that you are using Hydra 1.1 already which is great. One thing that is really recent is the ability to configure the config searchpath from the primary config. You can learn about it here.

This can probably eliminate the need of your users to even know what a ConfigSearchpathPlugin is.

Feel free to jump into the Hydra chat if you have any questions.

opened by omry 2
Version 0.1.7
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API
opened by enliteai 0
Version 0.1.6
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub
opened by md-enlite 0
Version 0.1.5
Features:

Adds documentation for run_context

Changes of simulated environment interfaces step_without_observation -> fast_step

Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

added value transformations
opened by md-enlite 0
Towards Version 0.1.5
Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images
opened by md-enlite 0
Release Version 0.1.4
improved docs

switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

added testing dependencies to main package
opened by enliteai 0
Dev
adds PointNetFeatureBlock to perception module

adds Tensorboard hyperparameter visualization for hydra multiruns

merges parallel and sequential dataset into a single InMemoryDataset
opened by md-enlite 0
Version 0.1.3
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation
opened by enliteai 0
Version 0.1.2
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
opened by enliteai 0
Dev
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

Fixes:

cumulative stats logging
opened by md-enlite 0

Releases(v0.2.0)

v0.2.0(Nov 21, 2022)
New graph neural network building blocks (message passing based on torch-scatter in addition to existing graph convolutions)

Support for action recording, replay from pre-computed action records and feature collection.

Improved wrapper hierarchy semantics: Previously values were assigned to the outermost wrapper. Now values are assigned to existing attributes by traversing the wrapper hierarchy.

Removal of deprecated modules (APIContext and Maze models for RLlib)

Reflecting changes in upstream dependencies (Gym version pinned to <0.23)
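
The change in wrapper hierarchy semantics listed above can be pictured with a small sketch. This is not Maze's implementation: the classes `Env` and `Wrapper` and the attribute names are hypothetical, and the code only illustrates the described behavior, where an assignment walks inwards through the wrapper stack and updates the layer that already owns the attribute instead of always writing to the outermost wrapper.

```python
class Env:
    """Innermost environment that owns the actual state."""

    def __init__(self):
        self.difficulty = 1


class Wrapper:
    """Hypothetical wrapper that forwards attribute reads and assignments."""

    def __init__(self, env):
        # Bypass our own __setattr__ while storing the wrapped environment.
        object.__setattr__(self, "env", env)

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the wrapped env.
        return getattr(self.env, name)

    def __setattr__(self, name, value):
        # Walk inwards and assign to the first layer that already owns the attribute.
        layer = self
        while True:
            if name in vars(layer):
                object.__setattr__(layer, name, value)
                return
            inner = vars(layer).get("env")
            if inner is None:
                break
            layer = inner
        # No layer owns the attribute yet: fall back to the outermost wrapper.
        object.__setattr__(self, name, value)


env = Env()
wrapped = Wrapper(Wrapper(env))

wrapped.difficulty = 5    # existing attribute: updated on the innermost env
wrapped.new_flag = True   # unknown attribute: stored on the outermost wrapper

print(env.difficulty, "difficulty" in vars(wrapped), "new_flag" in vars(wrapped))
```

With outermost-only assignment, `wrapped.difficulty = 5` would shadow the environment's value rather than change it, which is presumably the kind of surprise the new semantics are meant to avoid.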

v0.1.8(Dec 13, 2021)
New Features

Agent Deployment Workflow

Soft Actor Critic from Demonstrations (SACfD)

Locally Distributed ES Runner

SpacesRecordingWrapper: Records and dumps processed trajectories to pickle files

Fixes event logging for environment resets and policy events

v0.1.7(Jun 24, 2021)
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API

Compatibility with PyTorch 1.9

v0.1.6(Jun 14, 2021)
Changes

made Maze compatible with RLlib 1.4

updated to the recently released hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub

v0.1.5(May 20, 2021)
Features:

adds RunContext (Maze Python API)

adds seeding to environments, models and trainers

changes of simulated environment interfaces step_without_observation -> fast_step

Improvements:

adds an ExportGifWrapper

adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

adds value transformations

v0.1.4(Apr 29, 2021)
switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

interfaces support collaborative multi-agent actor critic

improved docs

added testing dependencies to main package

v0.1.3(Apr 1, 2021)
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation

v0.1.2(Mar 25, 2021)
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.

v0.1.1(Mar 18, 2021)
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

adds MazeEnvMonitoringWrapper as a default to wrapper stacks

Fixes:

cumulative stats logging

v0.1.0(Mar 11, 2021)
Documentation updates:

Integrating existing Gym environments

Factory documentation

Experiments workflow, ...

Updated to Hydra 1.1.0:

Using Hydra.instantiate instead of custom registry implementation

Added Rollout evaluator

Owner

EnliteAI GmbH

enliteAI is a machine learning company, developing the Reinforcement Learning framework Maze.

GitHub Repository https://maze-rl.readthedocs.io/
