Model-based Reinforcement Learning Improves Autonomous Racing Performance

Overview

Racing Dreamer: Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars

In this work, we propose to learn a racing controller directly from raw Lidar observations.

The resulting policy has been evaluated on F1tenth-like tracks and then transfered to real cars.

Racing Dreamer

The free version is available on arXiv.

If you find this code useful, please reference in your paper:

@misc{brunnbauer2021modelbased,
      title={Model-based versus Model-free Deep Reinforcement Learning for Autonomous Racing Cars}, 
      author={Axel Brunnbauer and Luigi Berducci and Andreas Brandstätter and Mathias Lechner and Ramin Hasani and Daniela Rus and Radu Grosu},
      year={2021},
      eprint={2103.04909},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

This repository is organized as follows:

  • Folder dreamer contains the code related to the Dreamer agent.
  • Folder baselines contains the code related to the Model Free algorihtms (D4PG, MPO, PPO, LSTM-PPO, SAC).
  • Folder ros_agent contains the code related to the transfer on real racing cars.
  • Folder docs contains the track maps, mechanical and general documentation.

Dreamer

"Dreamer learns a world model that predicts ahead in a compact feature space. From imagined feature sequences, it learns a policy and state-value function. The value gradients are backpropagated through the multi-step predictions to efficiently learn a long-horizon policy."

This implementation extends the original implementation of Dreamer (Hafner et al. 2019).

We refer the reader to the Dreamer website for the details on the algorithm.

Dreamer

Instructions

This code has been tested on Ubuntu 18.04 with Python 3.7.

Get dependencies:

pip install --user -r requirements.txt

Training

We train Dreamer on LiDAR observations and propose two Reconstruction variants: LiDAR and Occupancy Map.

Reconstruction Variants

Train the agent with LiDAR reconstruction:

python dreamer/dream.py --track columbia --obs_type lidar

Train the agent with Occupancy Map reconstruction:

python dream.py --track columbia --obs_type lidar_occupancy

Please, refer to dream.py for the other command-line arguments.

Offline Evaluation

The evaluation module runs offline testing of a trained agent (Dreamer, D4PG, MPO, PPO, SAC).

To run evaluation, assuming to have the dreamer directory in the PYTHONPATH:

python evaluations/run_evaluation.py --agent dreamer \
                                     --trained_on austria \
                                     --obs_type lidar \
                                     --checkpoint_dir logs/checkpoints \
                                     --outdir logs/evaluations \
                                     --eval_episodes 10 \
                                     --tracks columbia barcelona 

The script will look for all the checkpoints with pattern logs/checkpoints/austria_dreamer_lidar_* The checkpoint format depends on the saving procedure (pkl, zip or directory).

The results are stored as tensorflow logs.

Plotting

The plotting module containes several scripts to visualize the results, usually aggregated over multiple experiments.

To plot the learning curves:

python plotting/plot_training_curves.py --indir logs/experiments \
                                                --outdir plots/learning_curves \
                                                --methods dreamer mpo \
                                                --tracks austria columbia treitlstrasse_v2 \
                                                --legend

It will produce the comparison between Dreamer and MPO on the tracks Austria, Columbia, Treitlstrasse_v2.

To plot the evaluation results:

python plotting/plot_test_evaluation.py --indir logs/evaluations \
                                                --outdir plots/evaluation_charts \
                                                --methods dreamer mpo \
                                                --vis_tracks austria columbia treitlstrasse_v2 \
                                                --legend

It will produce the bar charts comparing Dreamer and MPO evaluated in Austria, Columbia, Treitlstrasse_v2.

Instructions with Docker

We also provide an docker image based on tensorflow:2.3.1-gpu. You need nvidia-docker to run them, see here for more details.

To build the image:

docker build -t dreamer .

To train Dreamer within the container:

docker run -u $(id -u):$(id -g) -v $(pwd):/src --gpus all --rm dreamer python dream.py --track columbia --steps 1000000

Model Free

The organization of Model-Free codebase is similar and we invite the users to refer to the README for the detailed instructions.

Hardware

The codebase for the implementation on real cars is contained in ros_agent.

Additional material:

  • Folder docs/maps contains a collection of several tracks to be used in F1Tenth races.
  • Folder docs/mechanical contains support material for real world race-tracks.
Owner
Cyber Physical Systems - TU Wien
Cyber Physical Systems - TU Wien
Lama-cleaner: Image inpainting tool powered by LaMa

Lama-cleaner: Image inpainting tool powered by LaMa

Qing 5.8k Jan 05, 2023
CBKH: The Cornell Biomedical Knowledge Hub

Cornell Biomedical Knowledge Hub (CBKH) CBKG integrates data from 18 publicly available biomedical databases. The current version of CBKG contains a t

44 Dec 21, 2022
This repository contains the source codes for the paper AtlasNet V2 - Learning Elementary Structures.

AtlasNet V2 - Learning Elementary Structures This work was build upon Thibault Groueix's AtlasNet and 3D-CODED projects. (you might want to have a loo

Théo Deprelle 123 Nov 11, 2022
Using NumPy to solve the equations of fluid mechanics together with Finite Differences, explicit time stepping and Chorin's Projection methods

Computational Fluid Dynamics in Python Using NumPy to solve the equations of fluid mechanics 🌊 🌊 🌊 together with Finite Differences, explicit time

Felix Köhler 4 Nov 12, 2022
🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

🍀 Pytorch implementation of various Attention Mechanisms, MLP, Re-parameter, Convolution, which is helpful to further understand papers.⭐⭐⭐

xmu-xiaoma66 7.7k Jan 05, 2023
A PyTorch implementation of the continual learning experiments with deep neural networks

Brain-Inspired Replay A PyTorch implementation of the continual learning experiments with deep neural networks described in the following paper: Brain

182 Dec 27, 2022
The codebase for our paper "Generative Occupancy Fields for 3D Surface-Aware Image Synthesis" (NeurIPS 2021)

Generative Occupancy Fields for 3D Surface-Aware Image Synthesis (NeurIPS 2021) Project Page | Paper Xudong Xu, Xingang Pan, Dahua Lin and Bo Dai GOF

xuxudong 97 Nov 10, 2022
基于Pytorch实现优秀的自然图像分割框架!(包括FCN、U-Net和Deeplab)

语义分割学习实验-基于VOC数据集 usage: 下载VOC数据集,将JPEGImages SegmentationClass两个文件夹放入到data文件夹下。 终端切换到目标目录,运行python train.py -h查看训练 (torch) Li Xiang 28 Dec 21, 2022

SSD-based Object Detection in PyTorch

SSD-based Object Detection in PyTorch 서강대학교 현대모비스 SW 프로그램에서 진행한 인공지능 프로젝트입니다. Jetson nano를 이용해 pre-trained network를 fine tuning시켜 차량 및 신호등 인식을 구현하였습니다

Haneul Kim 1 Nov 16, 2021
A new framework, collaborative cascade prediction based on graph neural networks (CCasGNN) to jointly utilize the structural characteristics, sequence features, and user profiles.

CCasGNN A new framework, collaborative cascade prediction based on graph neural networks (CCasGNN) to jointly utilize the structural characteristics,

5 Apr 29, 2022
An unofficial styleguide and best practices summary for PyTorch

A PyTorch Tools, best practices & Styleguide This is not an official style guide for PyTorch. This document summarizes best practices from more than a

IgorSusmelj 1.5k Jan 05, 2023
Understanding Convolution for Semantic Segmentation

TuSimple-DUC by Panqu Wang, Pengfei Chen, Ye Yuan, Ding Liu, Zehua Huang, Xiaodi Hou, and Garrison Cottrell. Introduction This repository is for Under

TuSimple 585 Dec 31, 2022
Totally Versatile Miscellanea for Pytorch

Totally Versatile Miscellania for PyTorch Thomas Viehmann [email protected] Thi

Thomas Viehmann 428 Dec 28, 2022
How to use TensorLayer

How to use TensorLayer While research in Deep Learning continues to improve the world, we use a bunch of tricks to implement algorithms with TensorLay

zhangrui 349 Dec 07, 2022
LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021

LTR_CrossEncoder: Legal Text Retrieval Zalo AI Challenge 2021 We propose a cross encoder model (LTR_CrossEncoder) for information retrieval, re-retrie

Xuan Hieu Duong 7 Jan 12, 2022
a reimplementation of Optical Flow Estimation using a Spatial Pyramid Network in PyTorch

pytorch-spynet This is a personal reimplementation of SPyNet [1] using PyTorch. Should you be making use of this work, please cite the paper according

Simon Niklaus 269 Jan 02, 2023
Code for: https://berkeleyautomation.github.io/bags/

DeformableRavens Code for the paper Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks. Here is the

Daniel Seita 121 Dec 30, 2022
DyNet: The Dynamic Neural Network Toolkit

The Dynamic Neural Network Toolkit General Installation C++ Python Getting Started Citing Releases and Contributing General DyNet is a neural network

Chris Dyer's lab @ LTI/CMU 3.3k Jan 06, 2023
Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The original code is written in keras.

CasRel-pytorch-reimplement Pytorch reimplement of the paper "A Novel Cascade Binary Tagging Framework for Relational Triple Extraction" ACL2020. The o

longlongman 170 Dec 01, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022