Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation, NeurIPS 2021 Spotlight

Last update: Dec 29, 2022

PCAN for Multiple Object Tracking and Segmentation

This is the official implementation of the paper PCAN for MOTS.

We also present a trailer that consists of method illustrations and tracking & segmentation visualizations. Our project website contains more information: vis.xyz/pub/pcan. Code is under organization.

Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation
NeurIPS 2021, Spotlight
Lei Ke, Xia Li, Martin Danelljan, Yu-Wing Tai, Chi-Keung Tang, Fisher Yu

Abstract

Multiple object tracking and segmentation requires detecting, tracking, and segmenting objects belonging to a set of given classes. Most approaches only exploit the temporal dimension to address the association problem, while relying on single-frame predictions for the segmentation mask itself. We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation. PCAN first distills a space-time memory into a set of prototypes and then employs cross-attention to retrieve rich information from the past frames. To segment each object, PCAN adopts a prototypical appearance module to learn a set of contrastive foreground and background prototypes, which are then propagated over time. Extensive experiments demonstrate that PCAN outperforms current video instance tracking and segmentation competition winners on both the YouTube-VIS and BDD100K datasets, and is effective with both one-stage and two-stage segmentation frameworks.
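The two steps named in the abstract (distilling a space-time memory into prototypes, then reading from them with cross-attention) can be illustrated with a small NumPy sketch. This is not the code in this repository: the function names, the soft clustering scheme, the scaled dot-product similarity, and all shapes are assumptions chosen only to show the idea.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def distill_prototypes(memory, num_prototypes, num_iters=5, seed=0):
    """Condense memory features of shape (N, C) into (K, C) prototypes.

    Assumption: an EM-style soft clustering stands in for the distillation
    step; the actual PCAN procedure may differ.
    """
    rng = np.random.default_rng(seed)
    scale = np.sqrt(memory.shape[1])
    # Initialise prototypes from randomly chosen memory features.
    init_idx = rng.choice(memory.shape[0], size=num_prototypes, replace=False)
    prototypes = memory[init_idx].copy()
    for _ in range(num_iters):
        # E-step: soft assignment of every memory feature to every prototype.
        resp = softmax(memory @ prototypes.T / scale, axis=1)  # (N, K)
        # M-step: each prototype becomes the weighted mean of its features.
        weights = resp / (resp.sum(axis=0, keepdims=True) + 1e-8)
        prototypes = weights.T @ memory  # (K, C)
    return prototypes


def cross_attention_read(query, prototypes):
    """Read from the prototypes with current-frame features of shape (M, C)."""
    scale = np.sqrt(query.shape[1])
    attn = softmax(query @ prototypes.T / scale, axis=1)  # (M, K)
    return attn @ prototypes, attn  # read-out has shape (M, C)


# Hypothetical sizes: 3 past frames of 16x16 features with 32 channels.
rng = np.random.default_rng(0)
memory = rng.normal(size=(3 * 16 * 16, 32))
query = rng.normal(size=(16 * 16, 32))

prototypes = distill_prototypes(memory, num_prototypes=8)
readout, attn = cross_attention_read(query, prototypes)
print(prototypes.shape, readout.shape)
```

In this sketch each query position attends to 8 prototypes rather than to all 768 memory positions, which is the efficiency argument for condensing the memory before attention.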

Prototypical Cross-Attention Networks (PCAN)

Main results

Results on BDD100K

Detector	mMOTSA-val	mIDF1-val	ID Sw.-val	Scores-val	mMOTSA-test	mIDF1-test	ID Sw.-test	Scores-test	Config	Weights	Preds	Visuals
ResNet-50	28.1	45.4	874	scores	31.9	50.4	845	scores	config	model | MD5	preds	visuals

Installation

Please refer to INSTALL.md for installation instructions.

Usage

Please refer to GET_STARTED.md for dataset preparation and running instructions.

Citation

If you find PCAN useful in your research or refer to the provided baseline results, please star ⭐ this repository and consider citing 📝 :

@inproceedings{pcan,
    author    = {Ke, Lei and Li, Xia and Danelljan, Martin and Tai, Yu-Wing and Tang, Chi-Keung and Yu, Fisher},
    booktitle = {Advances in Neural Information Processing Systems},
    title     = {Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation},
    year      = {2021}
}

Owner

ETH VIS Group
