Real-time Joint Semantic Reasoning for Autonomous Driving

Overview

MultiNet

MultiNet is able to jointly perform road segmentation, car detection and street classification. The model achieves real-time speed and state-of-the-art performance in segmentation. Check out our paper for a detailed model description.

MultiNet is optimized to perform well at a real-time speed. It has two components: KittiSeg, which sets a new state-of-the art in road segmentation; and KittiBox, which improves over the baseline Faster-RCNN in both inference speed and detection performance.

The model is designed as an encoder-decoder architecture. It utilizes one VGG encoder and several independent decoders for each task. This repository contains generic code that combines several tensorflow models in one network. The code for the individual tasks is provided by the KittiSeg, KittiBox, and KittiClass repositories. These repositories are utilized as submodules in this project. This project is built to be compatible with the TensorVision back end, which allows for organizing experiments in a very clean way.

Requirements

The code requires Python 2.7, Tensorflow 1.0, as well as the following python libraries:

  • matplotlib
  • numpy
  • Pillow
  • scipy
  • runcython
  • commentjson

Those modules can be installed using: pip install numpy scipy pillow matplotlib runcython commentjson or pip install -r requirements.txt.

Setup

  1. Clone this repository: https://github.com/MarvinTeichmann/MultiNet.git
  2. Initialize all submodules: git submodule update --init --recursive
  3. cd submodules/KittiBox/submodules/utils/ && make to build cython code
  4. [Optional] Download Kitti Road Data:
    1. Retrieve kitti data url here: http://www.cvlibs.net/download.php?file=data_road.zip
    2. Call python download_data.py --kitti_url URL_YOU_RETRIEVED
  5. [Optional] Run cd submodules/KittiBox/submodules/KittiObjective2/ && make to build the Kitti evaluation code (see submodules/KittiBox/submodules/KittiObjective2/README.md for more information)

Running the model using demo.py only requires you to perform step 1-3. Step 4 and 5 is only required if you want to train your own model using train.py. Note that I recommend using download_data.py instead of downloading the data yourself. The script will also extract and prepare the data. See Section Manage data storage if you like to control where the data is stored.

To update MultiNet do:
  1. Pull all patches: git pull
  2. Update all submodules: git submodule update --init --recursive

If you forget the second step you might end up with an inconstant repository state. You will already have the new code for MultiNet but run it old submodule versions code. This can work, but I do not run any tests to verify this.

Tutorial

Getting started

Run: python demo.py --gpus 0 --input data/demo/um_000005.png to obtain a prediction using demo.png as input.

Run: python evaluate.py to evaluate a trained model.

Run: python train.py --hypes hypes/multinet2.json to train a multinet2

If you like to understand the code, I would recommend looking at demo.py first. I have documented each step as thoroughly as possible in this file.

Only training of MultiNet3 (joint detection and segmentation) is supported out of the box. The data to train the classification model is not public an those cannot be used to train the full MultiNet3 (detection, segmentation and classification). The full code is given here, so you can still train MultiNet3 if you have your own data.

Manage Data Storage

MultiNet allows to separate data storage from code. This is very useful in many server environments. By default, the data is stored in the folder MultiNet/DATA and the output of runs in MultiNet/RUNS. This behaviour can be changed by setting the bash environment variables: $TV_DIR_DATA and $TV_DIR_RUNS.

Include export TV_DIR_DATA="/MY/LARGE/HDD/DATA" in your .profile and the all data will be downloaded to /MY/LARGE/HDD/DATA/. Include export TV_DIR_RUNS="/MY/LARGE/HDD/RUNS" in your .profile and all runs will be saved to /MY/LARGE/HDD/RUNS/MultiNet

Modifying Model & Train on your own data

The model is controlled by the file hypes/multinet3.json. This file points the code to the implementation of the submodels. The MultiNet code then loads all models provided and integrates the decoders into one neural network. To train on your own data, it should be enough to modify the hype files of the submodels. A good start will be the KittiSeg model, which is very well documented.

    "models": {
        "segmentation" : "../submodules/KittiSeg/hypes/KittiSeg.json",
        "detection" : "../submodules/KittiBox/hypes/kittiBox.json",
        "road" : "../submodules/KittiClass/hypes/KittiClass.json"
    },

RUNDIR and Experiment Organization

MultiNet helps you to organize a large number of experiments. To do so, the output of each run is stored in its own rundir. Each rundir contains:

  • output.log a copy of the training output which was printed to your screen
  • tensorflow events tensorboard can be run in rundir
  • tensorflow checkpoints the trained model can be loaded from rundir
  • [dir] images a folder containing example output images. image_iter controls how often the whole validation set is dumped
  • [dir] model_files A copy of all source code need to build the model. This can be very useful of you have many versions of the model.

To keep track of all the experiments, you can give each rundir a unique name with the --name flag. The --project flag will store the run in a separate subfolder allowing to run different series of experiments. As an example, python train.py --project batch_size_bench --name size_5 will use the following dir as rundir: $TV_DIR_RUNS/KittiSeg/batch_size_bench/size_5_KittiSeg_2017_02_08_13.12.

The flag --nosave is very useful to not spam your rundir.

Useful Flags & Variabels

Here are some Flags which will be useful when working with KittiSeg and TensorVision. All flags are available across all scripts.

--hypes : specify which hype-file to use
--logdir : specify which logdir to use
--gpus : specify on which GPUs to run the code
--name : assign a name to the run
--project : assign a project to the run
--nosave : debug run, logdir will be set to debug

In addition the following TensorVision environment Variables will be useful:

$TV_DIR_DATA: specify meta directory for data
$TV_DIR_RUNS: specify meta directory for output
$TV_USE_GPUS: specify default GPU behaviour.

On a cluster it is useful to set $TV_USE_GPUS=force. This will make the flag --gpus mandatory and ensure, that run will be executed on the right GPU.

Citation

If you benefit from this code, please cite our paper:

@article{teichmann2016multinet,
  title={MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving},
  author={Teichmann, Marvin and Weber, Michael and Zoellner, Marius and Cipolla, Roberto and Urtasun, Raquel},
  journal={arXiv preprint arXiv:1612.07695},
  year={2016}
}
Owner
Marvin Teichmann
Germany Phd student. Working on Deep Learning and Computer Vision projects.
Marvin Teichmann
Meli Data Challenge 2021 - First Place Solution

My solution for the Meli Data Challenge 2021

Matias Moreyra 23 Mar 09, 2022
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 29, 2022
Official PyTorch Implementation of Convolutional Hough Matching Networks, CVPR 2021 (oral)

Convolutional Hough Matching Networks This is the implementation of the paper "Convolutional Hough Matching Network" by J. Min and M. Cho. Implemented

Juhong Min 70 Nov 22, 2022
Implementation of the GBST block from the Charformer paper, in Pytorch

Charformer - Pytorch Implementation of the GBST (gradient-based subword tokenization) module from the Charformer paper, in Pytorch. The paper proposes

Phil Wang 105 Dec 26, 2022
Pytorch implementation of CoCon: A Self-Supervised Approach for Controlled Text Generation

COCON_ICLR2021 This is our Pytorch implementation of COCON. CoCon: A Self-Supervised Approach for Controlled Text Generation (ICLR 2021) Alvin Chan, Y

alvinchangw 79 Dec 18, 2022
CBKH: The Cornell Biomedical Knowledge Hub

Cornell Biomedical Knowledge Hub (CBKH) CBKG integrates data from 18 publicly available biomedical databases. The current version of CBKG contains a t

44 Dec 21, 2022
Release of the ConditionalQA dataset

ConditionalQA Datasets accompanying the paper ConditionalQA: A Complex Reading Comprehension Dataset with Conditional Answers. Disclaimer This dataset

14 Oct 17, 2022
This is the official source code of "BiCAT: Bi-Chronological Augmentation of Transformer for Sequential Recommendation".

BiCAT This is our TensorFlow implementation for the paper: "BiCAT: Sequential Recommendation with Bidirectional Chronological Augmentation of Transfor

John 15 Dec 06, 2022
Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO)

V-MPO Simple code to demonstrate Deep Reinforcement Learning by using an on-policy adaptation of Maximum a Posteriori Policy Optimization (MPO) in Pyt

Nugroho Dewantoro 9 Jun 06, 2022
Pytorch implementation for RelTransformer

RelTransformer Our Architecture This is a Pytorch implementation for RelTransformer The implementation for Evaluating on VG200 can be found here Requi

Vision CAIR Research Group, KAUST 21 Nov 22, 2022
ppo_pytorch_cpp - an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch

PPO Pytorch C++ This is an implementation of the proximal policy optimization algorithm for the C++ API of Pytorch. It uses a simple TestEnvironment t

Martin Huber 59 Dec 09, 2022
EMNLP'2021: Simple Entity-centric Questions Challenge Dense Retrievers

EntityQuestions This repository contains the EntityQuestions dataset as well as code to evaluate retrieval results from the the paper Simple Entity-ce

Princeton Natural Language Processing 119 Sep 28, 2022
CMT: Convolutional Neural Networks Meet Vision Transformers

CMT: Convolutional Neural Networks Meet Vision Transformers [arxiv] 1. Introduction This repo is the CMT model which impelement with pytorch, no refer

FlyEgle 83 Dec 30, 2022
Negative Sample Matters: A Renaissance of Metric Learning for Temporal Grounding

2D-TAN (Optimized) Introduction This is an optimized re-implementation repository for AAAI'2020 paper: Learning 2D Temporal Localization Networks for

Joya Chen 112 Dec 31, 2022
Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

Edward Beeching 326 Dec 30, 2022
Implementation of BI-RADS-BERT & The Advantages of Section Tokenization.

BI-RADS BERT Implementation of BI-RADS-BERT & The Advantages of Section Tokenization. This implementation could be used on other radiology in house co

1 May 17, 2022
The source code of CVPR 2019 paper "Deep Exemplar-based Video Colorization".

Deep Exemplar-based Video Colorization (Pytorch Implementation) Paper | Pretrained Model | Youtube video 🔥 | Colab demo Deep Exemplar-based Video Col

Bo Zhang 253 Dec 27, 2022
Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection PyTorch code release of the paper "Attentive Prototypes for Sour

Deepti Hegde 23 Oct 17, 2022
Signals-backend - A suite of card games written in Python

Card game A suite of card games written in the Python language. Features coming

1 Feb 15, 2022
Convolutional neural network web app trained to track our infant’s sleep schedule using our Google Nest camera.

Machine Learning Sleep Schedule Tracker What is it? Convolutional neural network web app trained to track our infant’s sleep schedule using our Google

g-parki 7 Jul 15, 2022