MultiTaskLearning - Multi Task Learning for 3D segmentation

Last update: Sep 22, 2022

Overview

Multi Task Learning for 3D segmentation

The perception stack of an autonomous driving system often contains multiple neural networks working together to predict bounding boxes, segmentation maps, depth maps, lane lines, and so on. Running a separate neural network for each task weighs heavily on the system's processing speed.

This repository contains an implementation of the multi-task learning network presented in [1]. The goal is an encoder-decoder architecture that takes an RGB image as input and predicts both a segmentation mask and a depth map in a single forward pass. A common backbone extracts the feature maps, and task-specific decoder structures are plugged onto this shared encoder to generate the predictions. Networks of this kind matter for autonomous driving, where several perception tasks have to share a limited compute budget.

Architecture

The model follows an encoder-decoder structure.

Encoder: A lightweight MobileNet V2 is used. Feature maps are extracted from multiple levels of the network. During upsampling, these feature maps are concatenated with the decoder's layer outputs at the corresponding levels.

Decoder: A lightweight RefineNet architecture is used, which contains CRP (chained residual pooling) blocks. The decoder progressively upsamples the feature maps coming from the encoder. Before the penultimate layer, the decoder splits into two heads: one predicts the segmentation mask of the input image and the other predicts its depth map.
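The shared-encoder, two-head layout described above can be sketched in PyTorch as follows. This is a minimal illustration, not the repository's code: the class names `CRPBlock` and `TwoHeadNet`, the channel widths, and `num_classes=6` are assumptions, and a two-stage convolutional stand-in replaces MobileNet V2 to keep the example short.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CRPBlock(nn.Module):
    """Chained residual pooling: repeated (max-pool -> 1x1 conv) stages summed onto the input."""

    def __init__(self, channels, n_stages=2):
        super().__init__()
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels, channels, kernel_size=1, bias=False) for _ in range(n_stages)]
        )
        # Stride 1 with padding 2 keeps the spatial size unchanged.
        self.pool = nn.MaxPool2d(kernel_size=5, stride=1, padding=2)

    def forward(self, x):
        out, path = x, x
        for conv in self.convs:
            path = conv(self.pool(path))
            out = out + path
        return out


class TwoHeadNet(nn.Module):
    """Hypothetical toy network: one shared encoder, one decoder, two output heads."""

    def __init__(self, num_classes=6):
        super().__init__()
        # Stand-in encoder (assumption): two strided conv stages instead of MobileNet V2.
        self.enc1 = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.crp = CRPBlock(32)
        # Upsampled deep features are concatenated with the shallower encoder features.
        self.fuse = nn.Sequential(nn.Conv2d(32 + 16, 32, 1), nn.ReLU())
        # The decoder splits into two task-specific heads.
        self.seg_head = nn.Conv2d(32, num_classes, 3, padding=1)
        self.depth_head = nn.Conv2d(32, 1, 3, padding=1)

    def forward(self, x):
        f1 = self.enc1(x)   # 1/2 resolution
        f2 = self.enc2(f1)  # 1/4 resolution
        d = self.crp(f2)
        d = F.interpolate(d, size=f1.shape[-2:], mode="bilinear", align_corners=False)
        d = self.fuse(torch.cat([d, f1], dim=1))
        size = x.shape[-2:]
        seg = F.interpolate(self.seg_head(d), size=size, mode="bilinear", align_corners=False)
        depth = F.interpolate(self.depth_head(d), size=size, mode="bilinear", align_corners=False)
        return seg, depth
```

Calling `TwoHeadNet()(torch.randn(1, 3, 64, 96))` returns per-class segmentation logits and a one-channel depth map, both at the input resolution, from a single forward pass.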

Dataset:

The model has been tested on the KITTI and NYU-D datasets. Both datasets provide an RGB image, a segmentation mask, and a depth map for each data point.

Results:

The model was tested on KITTI scenes from highway and residential drives. For each frame it predicts a segmentation map and a depth map in a single forward pass. The two outputs can be fused, using a library such as Open3D, into a point cloud representation of the 3D objects in each scene. This yields not only the class and pixel coordinates of each object in the image but also its real-world distance from the vehicle.

The results can also be viewed as a point cloud built from the depth and segmentation maps. Open3D can reconstruct a full point cloud from an RGB and depth image pair.
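The fusion step comes down to back-projecting each pixel through a pinhole camera model. The NumPy sketch below shows that geometry only and is not Open3D's API: the function name `backproject` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions for illustration, and depth is assumed to be in metres with 0 marking invalid pixels.

```python
import numpy as np


def backproject(depth, labels, fx, fy, cx, cy):
    """Turn an (H, W) depth map and an (H, W) class-id map into labelled 3D points.

    Returns an (N, 3) array of camera-frame XYZ points and an (N,) array of class ids,
    where N is the number of pixels with positive depth.
    """
    h, w = depth.shape
    # u holds the column index and v the row index of every pixel.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    points = np.stack([x, y, z], axis=1)
    return points, labels[valid]
```

The resulting `points` array can be assigned to an Open3D point cloud through `o3d.utility.Vector3dVector(points)`, with the class ids mapped to colours for display.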

Model Input/output:

3D Segmentation point cloud:

References:

[1] Vladimir Nekrasov, Thanuja Dharmasiri, Andrew Spek, Tom Drummond, Chunhua Shen, Ian Reid. "Real-Time Joint Semantic Segmentation and Depth Estimation Using Asymmetric Annotations." In ICRA 2019. https://arxiv.org/pdf/1809.04766.pdf
