[CVPR 2021] Forecasting the panoptic segmentation of future video frames

Overview

Panoptic Segmentation Forecasting

Colin Graber, Grace Tsai, Michael Firman, Gabriel Brostow, Alexander Schwing - CVPR 2021

[Link to paper]

Animated gif showing visual comparison of our model's results compared against the hybrid baseline

We propose to study the novel task of ‘panoptic segmentation forecasting’: given a set of observed frames, the goal is to forecast the panoptic segmentation for a set of unobserved frames. We also propose a first approach to forecasting future panoptic segmentations. In contrast to typical semantic forecasting, we model the motion of individual object instances and the background separately. This makes instance information persistent during forecasting, and allows us to understand the motion of each moving object.

Image presenting the model diagram

⚙️ Setup

Dependencies

Install the code using the following command: pip install -e ./

Data

  • To run this code, the gtFine_trainvaltest dataset will need to be downloaded from the Cityscapes website into the data/ directory.
  • The remainder of the required data can be downloaded using the script download_data.sh. By default, everything is downloaded into the data/ directory.
  • Training the background model requires generating a version of the semantic segmentation annotations where foreground regions have been removed. This can be done by running the script scripts/preprocessing/remove_fg_from_gt.sh.
  • Training the foreground model requires additionally downloading a pretrained MaskRCNN model. This can be found at this link. This should be saved as pretrained_models/fg/mask_rcnn_pretrain.pkl.
  • Training the background model requires additionally downloading a pretrained HarDNet model. This can be found at this link. This should be saved as pretrained_models/bg/hardnet70_cityscapes_model.pkl.

Running our code

The scripts directory contains scripts which can be used to train and evaluate the foreground, background, and egomotion models. Specifically:

  • scripts/odom/run_odom_train.sh trains the egomotion prediction model.
  • scripts/odom/export_odom.sh exports the odometry predictions, which can then be used during evaluation by other models
  • scripts/bg/run_bg_train.sh trains the background prediction model.
  • scripts/bg/run_export_bg_val.sh exports predictions make by the background using input reprojected point clouds which come from using predicted egomotion.
  • scripts/fg/run_fg_train.sh trains the foreground prediction model.
  • scripts/fg/run_fg_eval_panoptic.sh produces final panoptic semgnetation predictions based on the trained foreground model and exported background predictions. This also uses predicted egomotion as input.

We provide our pretrained foreground, background, and egomotion prediction models. The data downloading script additionally downloads these models into the directory pretrained_models/

✏️ 📄 Citation

If you found our work relevant to yours, please consider citing our paper:

@inproceedings{graber-2021-panopticforecasting,
 title   = {Panoptic Segmentation Forecasting},
 author  = {Colin Graber and
            Grace Tsai and
            Michael Firman and
            Gabriel Brostow and
            Alexander Schwing},
 booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
 year = {2021}
}

👩‍⚖️ License

Copyright © Niantic, Inc. 2021. Patent Pending. All rights reserved. Please see the license file for terms.

Owner
Niantic Labs
Building technologies and ideas that move us
Niantic Labs
Compute execution plan: A DAG representation of work that you want to get done. Individual nodes of the DAG could be simple python or shell tasks or complex deeply nested parallel branches or embedded DAGs themselves.

Hello from magnus Magnus provides four capabilities for data teams: Compute execution plan: A DAG representation of work that you want to get done. In

12 Feb 08, 2022
cl;asification problem using classification models in supervised learning

wine-quality-predition---classification cl;asification problem using classification models in supervised learning Wine Quality Prediction Analysis - C

Vineeth Reddy Gangula 1 Jan 18, 2022
Code repository for the paper "Doubly-Trained Adversarial Data Augmentation for Neural Machine Translation" with instructions to reproduce the results.

Doubly Trained Neural Machine Translation System for Adversarial Attack and Data Augmentation Languages Experimented: Data Overview: Source Target Tra

Steven Tan 1 Aug 18, 2022
Single-Stage 6D Object Pose Estimation, CVPR 2020

Overview This repository contains the code for the paper Single-Stage 6D Object Pose Estimation. Yinlin Hu, Pascal Fua, Wei Wang and Mathieu Salzmann.

CVLAB @ EPFL 89 Dec 26, 2022
Unsupervised Image to Image Translation with Generative Adversarial Networks

Unsupervised Image to Image Translation with Generative Adversarial Networks Paper: Unsupervised Image to Image Translation with Generative Adversaria

Hao 71 Oct 30, 2022
Official implementation of the paper DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows

DeFlow: Learning Complex Image Degradations from Unpaired Data with Conditional Flows Official implementation of the paper DeFlow: Learning Complex Im

Valentin Wolf 86 Nov 16, 2022
A tensorflow implementation of GCN-LPA

GCN-LPA This repository is the implementation of GCN-LPA (arXiv): Unifying Graph Convolutional Neural Networks and Label Propagation Hongwei Wang, Jur

Hongwei Wang 83 Nov 28, 2022
Torch-based tool for quantizing high-dimensional vectors using additive codebooks

Trainable multi-codebook quantization This repository implements a utility for use with PyTorch, and ideally GPUs, for training an efficient quantizer

Daniel Povey 41 Jan 07, 2023
This is the code related to "Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation" (ICCV 2021).

Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation This is the code relat

39 Sep 23, 2022
Leaf: Multiple-Choice Question Generation

Leaf: Multiple-Choice Question Generation Easy to use and understand multiple-choice question generation algorithm using T5 Transformers. The applicat

Kristiyan Vachev 62 Dec 20, 2022
Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

SUO-SLAM This repository hosts the code for our CVPR 2022 paper "Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation". ArXiv li

Robot Perception & Navigation Group (RPNG) 97 Jan 03, 2023
Code for "Single-view robot pose and joint angle estimation via render & compare", CVPR 2021 (Oral).

Single-view robot pose and joint angle estimation via render & compare Yann Labbé, Justin Carpentier, Mathieu Aubry, Josef Sivic CVPR: Conference on C

Yann Labbé 51 Oct 14, 2022
PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

PyTorch/TorchScript compiler for NVIDIA GPUs using TensorRT

NVIDIA Corporation 1.8k Dec 30, 2022
On Effective Scheduling of Model-based Reinforcement Learning

On Effective Scheduling of Model-based Reinforcement Learning Code to reproduce the experiments in On Effective Scheduling of Model-based Reinforcemen

laihang 8 Oct 07, 2022
Revealing and Protecting Labels in Distributed Training

Revealing and Protecting Labels in Distributed Training

Google Interns 0 Nov 09, 2022
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intenti

NVIDIA Corporation 6.9k Jan 03, 2023
RAMA: Rapid algorithm for multicut problem

RAMA: Rapid algorithm for multicut problem Solves multicut (correlation clustering) problems orders of magnitude faster than CPU based solvers without

Paul Swoboda 60 Dec 13, 2022
Codebase for INVASE: Instance-wise Variable Selection - 2019 ICLR

Codebase for "INVASE: Instance-wise Variable Selection" Authors: Jinsung Yoon, James Jordon, Mihaela van der Schaar Paper: Jinsung Yoon, James Jordon,

Jinsung Yoon 50 Nov 11, 2022
Official Pytorch Implementation of Relational Self-Attention: What's Missing in Attention for Video Understanding

Relational Self-Attention: What's Missing in Attention for Video Understanding This repository is the official implementation of "Relational Self-Atte

mandos 43 Dec 07, 2022
Convolutional neural network web app trained to track our infant’s sleep schedule using our Google Nest camera.

Machine Learning Sleep Schedule Tracker What is it? Convolutional neural network web app trained to track our infant’s sleep schedule using our Google

g-parki 7 Jul 15, 2022