Simulation environments for the CrazyFlie quadrotor: Used for Reinforcement Learning and Sim-to-Real Transfer

Overview

Phoenix-Drone-Simulation

An OpenAI Gym environment based on PyBullet for learning to control the CrazyFlie quadrotor:

  • Can be used for Reinforcement Learning (check out the examples!) or Model Predictive Control
  • We used this repository for sim-to-real transfer experiments (see publication [1] below)
  • The implemented dynamics model is based on Bitcraze's Crazyflie 2.1 nano-quadrotor
(Animations: Circle task and Take-off)

The following tasks are currently available to fly the little drone:

  • Hover
  • Circle
  • Take-off (implemented but not yet working properly: reward function must be tuned!)
  • Reach (not yet implemented)

Overview of Environments

Environment              | Task     | Controller   | Physics  | Observation Frequency | Domain Randomization | Aerodynamic Effects | Motor Dynamics
DroneHoverSimpleEnv-v0   | Hover    | PWM (100 Hz) | Simple   | 100 Hz                | 10%                  | None                | Instant force
DroneHoverBulletEnv-v0   | Hover    | PWM (100 Hz) | PyBullet | 100 Hz                | 10%                  | None                | First-order
DroneCircleSimpleEnv-v0  | Circle   | PWM (100 Hz) | Simple   | 100 Hz                | 10%                  | None                | Instant force
DroneCircleBulletEnv-v0  | Circle   | PWM (100 Hz) | PyBullet | 100 Hz                | 10%                  | None                | First-order
DroneTakeOffSimpleEnv-v0 | Take-off | PWM (100 Hz) | Simple   | 100 Hz                | 10%                  | Ground effect       | Instant force
DroneTakeOffBulletEnv-v0 | Take-off | PWM (100 Hz) | PyBullet | 100 Hz                | 10%                  | Ground effect       | First-order
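
All of the environments listed above are registered when the package is imported, so each one can be instantiated by name. A minimal sketch (assuming the package has been installed as described below):

import gym
import phoenix_drone_simulation  # registers the Drone* environments with Gym

for env_id in ('DroneHoverSimpleEnv-v0', 'DroneHoverBulletEnv-v0',
               'DroneCircleBulletEnv-v0', 'DroneTakeOffBulletEnv-v0'):
    env = gym.make(env_id)
    print(env_id, env.observation_space.shape, env.action_space.shape)
    env.close()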

Installation and Requirements

Here are the (few) steps to follow to get our repository ready to run: clone the repository and install the phoenix-drone-simulation package via pip. Note that everything after a $ is entered on a terminal, while everything after >>> is passed to a Python interpreter. Please use the following three steps for installation:

$ git clone https://github.com/SvenGronauer/phoenix-drone-simulation
$ cd phoenix-drone-simulation/
$ pip install -e .
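
If the editable install succeeded, importing the package from Python should work. A quick, optional sanity check:

$ python -c "import phoenix_drone_simulation; print(phoenix_drone_simulation.__file__)"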

This package follows OpenAI's Gym Interface.

Note: if your default Python is 2.7, replace pip with pip3 and python with python3 in the commands below.

Supported Systems

We tested this package under Ubuntu 20.04 and Mac OS X 11.2 running Python 3.7 and 3.8. Other systems might work as well but have not been tested yet. Note that PyBullet supports Windows only experimentally.

Dependencies

The phoenix-drone-simulation package heavily depends on two packages:

  • OpenAI Gym
  • PyBullet

Getting Started

After the successful installation of the repository, the phoenix-drone-simulation environments can simply be instantiated via gym.make. See:

>>> import gym
>>> import phoenix_drone_simulation
>>> env = gym.make('DroneHoverBulletEnv-v0')

The functional interface follows the API of OpenAI Gym (Brockman et al., 2016), which consists of the following important functions:

>>> observation = env.reset()
>>> random_action = env.action_space.sample()  # usually the action is determined by a policy
>>> next_observation, reward, done, info = env.step(random_action)

A minimal code example for visualizing a uniformly random policy in a GUI can be seen below:

import gym
import time
import phoenix_drone_simulation

env = gym.make('DroneHoverBulletEnv-v0')

while True:
    done = False
    env.render()  # make GUI of PyBullet appear
    x = env.reset()
    while not done:
        random_action = env.action_space.sample()
        x, reward, done, info = env.step(random_action)
        time.sleep(0.05)

Note that the GUI only appears if the render function is called before the reset function.

Training Policies

To train an agent with the PPO algorithm call:

$ python -m phoenix_drone_simulation.train --alg ppo --env DroneHoverBulletEnv-v0

This works with basically every environment that is compatible with the OpenAI Gym interface:

$ python -m phoenix_drone_simulation.train --alg ppo --env CartPole-v0

After an RL model has been trained and its checkpoint has been saved to disk, you can visualize the trained policy from that checkpoint:

$ python -m phoenix_drone_simulation.play --ckpt PATH_TO_CKPT

where PATH_TO_CKPT is the path to the checkpoint, e.g. /var/tmp/sven/DroneHoverSimpleEnv-v0/trpo/2021-11-16__16-08-09/seed_51544

Examples

generate_trajectories.py

See the generate_trajectories.py script, which shows how to generate data batches of size N. Use generate_trajectories.py --play to visualize the policy in the PyBullet simulator.
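
The same kind of data can also be collected by hand through the Gym API. A minimal sketch in which a random policy stands in for a trained model:

import gym
import phoenix_drone_simulation

def collect_trajectories(env_id='DroneHoverBulletEnv-v0', num_trajectories=5):
    """Roll out a policy and return one list of transitions per episode."""
    env = gym.make(env_id)
    batches = []
    for _ in range(num_trajectories):
        obs, done, trajectory = env.reset(), False, []
        while not done:
            action = env.action_space.sample()  # replace with a trained policy
            next_obs, reward, done, info = env.step(action)
            trajectory.append((obs, action, reward, next_obs, done))
            obs = next_obs
        batches.append(trajectory)
    env.close()
    return batches

print([len(t) for t in collect_trajectories()])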

train_drone_hover.py

Use Reinforcement Learning (RL) to teach the drone to hold its position at (0, 0, 1). This canonical example relies on the RL-safety-Algorithms repository, which is a strong framework for parallel training of RL algorithms.

transfer_learning_drone_hover.py

Shows a transfer learning approach: we first train a PPO model in the source domain DroneHoverSimpleEnv-v0 and then re-train the model on the more complex target domain DroneHoverBulletEnv-v0. Note that the DroneHoverBulletEnv-v0 environment builds upon an accurate motor model of the CrazyFlie drone and includes motor dead time as well as motor lag.
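
The sketch below illustrates the two-stage idea with a small PyTorch policy: weights obtained in the source domain are loaded and then fine-tuned in the target domain. The checkpoint file name is hypothetical, and the toy REINFORCE-style update is only a stand-in for the PPO training used by the actual example:

import gym
import torch
import torch.nn as nn
import phoenix_drone_simulation

def make_policy(obs_dim, act_dim):
    return nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))

source_env = gym.make('DroneHoverSimpleEnv-v0')   # source domain
target_env = gym.make('DroneHoverBulletEnv-v0')   # more complex target domain
obs_dim = source_env.observation_space.shape[0]
act_dim = source_env.action_space.shape[0]

policy = make_policy(obs_dim, act_dim)
# Hypothetical checkpoint from a previous training run in the source domain:
# policy.load_state_dict(torch.load('source_domain_policy.pt'))

# Continue training in the target domain (toy policy-gradient update, illustrative only).
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)
for episode in range(10):
    obs, done = target_env.reset(), False
    log_probs, rewards = [], []
    while not done:
        mean = policy(torch.as_tensor(obs, dtype=torch.float32))
        dist = torch.distributions.Normal(mean, 0.1)
        action = dist.sample()
        log_probs.append(dist.log_prob(action).sum())
        obs, reward, done, info = target_env.step(action.numpy())
        rewards.append(reward)
    loss = -torch.stack(log_probs).sum() * sum(rewards)  # REINFORCE on the episode return
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()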

Tools

  • convert.py @ Sven Gronauer

A function used by Sven to extract the policy network from his trained Actor-Critic module and convert the model to a JSON file format.
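
A minimal sketch of such a conversion, assuming a PyTorch checkpoint in which the actor parameters are prefixed with 'pi.' (the file names and the prefix are hypothetical):

import json
import torch

state_dict = torch.load('actor_critic.pt', map_location='cpu')

# Keep only the actor (policy) parameters and store them as nested lists,
# which can be written to JSON and parsed outside of PyTorch.
policy_weights = {
    name: tensor.cpu().numpy().tolist()
    for name, tensor in state_dict.items()
    if name.startswith('pi.')
}

with open('policy.json', 'w') as f:
    json.dump(policy_weights, f)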

Version History and Changes

Version | Changes                                                                    | Date
v1.0    | Public release: simulation parameters as proposed in publication [1]      | 19.04.2022
v0.2    | Add: accurate motor dynamics model and first real-world transfer insights | 21.09.2021
v0.1    | Re-factor of the repository (only the Hover task implemented yet)         | 18.05.2021
v0.0    | Fork from the Gym-PyBullet-Drones repository                              | 01.12.2020

Publications

  1. Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors

    Sven Gronauer, Matthias Kissel, Luca Sacchetto, Mathias Korte, Klaus Diepold

    https://arxiv.org/abs/2201.01369


Lastly, we want to thank:

  • Jacopo Panerati and his team for contributing the Gym-PyBullet-Drones repository, which was the starting point for this repository.

  • Artem Molchanov and collaborators for their hints about the CrazyFlie Firmware and the motor dynamics in their paper "Sim-to-(Multi)-Real: Transfer of Low-Level Robust Control Policies to Multiple Quadrotors"

  • Jakob Foerster for his Bachelor's thesis and his insights about the CrazyFlie's parameter values


This repository has been developed at the

Chair of Data Processing
TUM School of Computation, Information and Technology
Technical University of Munich
