[AAAI22] Reliable Propagation-Correction Modulation for Video Object Segmentation

Last update: Dec 04, 2022

Overview

Reliable Propagation-Correction Modulation for Video Object Segmentation (AAAI22)

A preprint of this work is available at: https://arxiv.org/abs/2112.02853

Qualitative results and comparisons with previous state-of-the-art methods are available at: https://youtu.be/X6BsS3t3wnc

This repo is a preview version. More details will be added later.

Abstract

Error propagation is a general but crucial problem in online semi-supervised video object segmentation. We aim to suppress error propagation through a correction mechanism with high reliability.

The key insight is to disentangle the correction from the conventional mask propagation process with reliable cues.

We introduce two modulators, propagation and correction modulators, to separately perform channel-wise re-calibration on the target frame embeddings according to local temporal correlations and reliable references respectively. Specifically, we assemble the modulators with a cascaded propagation-correction scheme. This avoids overriding the effects of the reliable correction modulator by the propagation modulator.
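As a rough illustration of the cascaded scheme described above, the following Python sketch applies channel-wise re-calibration twice: first with a propagation cue, then with a correction cue, so the correction step has the final say on each channel. This is not the released implementation. The function names, the sigmoid-based gating, and the per-channel cue vectors are all assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gates(cue):
    """Map a per-channel cue vector to gates in (0, 2).

    Hypothetical gating choice: a cue of 0 gives a gate of exactly 1,
    which leaves the channel unchanged.
    """
    return [2.0 * sigmoid(c) for c in cue]


def modulate(embedding, cue):
    """Scale every channel of `embedding` by a gate derived from `cue`.

    embedding: list of C channels, each a flat list of spatial values.
    cue:       list of C floats (assumed to summarise temporal or
               reference information for each channel).
    """
    gates = channel_gates(cue)
    return [[g * v for v in channel] for g, channel in zip(gates, embedding)]


def cascade(embedding, propagation_cue, correction_cue):
    """Apply propagation modulation first, then correction modulation."""
    propagated = modulate(embedding, propagation_cue)
    return modulate(propagated, correction_cue)


if __name__ == "__main__":
    target = [[1.0, 2.0], [3.0, 4.0]]  # 2 channels, 2 spatial positions
    # Propagation boosts both channels; correction then suppresses channel 1.
    print(cascade(target, [5.0, 5.0], [0.0, -50.0]))
```

Because the correction gates are applied last in this sketch, a channel that the correction cue suppresses stays suppressed regardless of what the propagation step did to it.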

Although the reference frame with the ground truth label provides reliable cues, it could be very different from the target frame and introduce uncertain or incomplete correlations. We augment the reference cues by supplementing reliable feature patches to a maintained pool, thus offering more comprehensive and expressive object representations to the modulators. In addition, a reliability filter is designed to retrieve reliable patches and pass them in subsequent frames.
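The pool-and-filter idea can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the class name `ReliablePatchPool`, the scalar reliability score per patch, the fixed threshold, and the bounded first-in-first-out pool are hypothetical choices, not details taken from the paper or the code.

```python
from collections import deque


class ReliablePatchPool:
    """Bounded pool of feature patches that passed a reliability filter.

    Assumption: each candidate patch arrives with a score in [0, 1],
    and a patch counts as reliable when its score reaches `threshold`.
    """

    def __init__(self, threshold=0.9, capacity=4):
        self.threshold = threshold
        # Oldest patches are dropped first once the pool is full.
        self.patches = deque(maxlen=capacity)

    def update(self, patches, scores):
        """Add the reliable patches of one frame; return how many were kept."""
        kept = [p for p, s in zip(patches, scores) if s >= self.threshold]
        self.patches.extend(kept)
        return len(kept)


if __name__ == "__main__":
    pool = ReliablePatchPool(threshold=0.8, capacity=3)
    pool.update(["a", "b", "c"], [0.9, 0.5, 0.85])  # "b" is filtered out
    pool.update(["d", "e"], [0.95, 0.99])           # pool is full, "a" is evicted
    print(list(pool.patches))
```

In this sketch the pool contents would be what is handed to the modulators on subsequent frames, alongside the cues from the labelled reference frame.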

Our model achieves state-of-the-art performance on YouTube-VOS18/19 and DAVIS17-Val/Test benchmarks. Extensive experiments demonstrate that the correction mechanism provides considerable performance gain by fully utilizing reliable guidance.

Requirements

This Docker image may contain some redundant packages. A more lightweight one will be provided later.

docker image: xxiaoh/vos:10.1-cudnn7-torch1.4_v3

Citation

If you find this work useful for your research, please consider citing:

@misc{xu2021reliable,
  title={Reliable Propagation-Correction Modulation for Video Object Segmentation}, 
  author={Xiaohao Xu and Jinglu Wang and Xiao Li and Yan Lu},
  year={2021},
  eprint={2112.02853},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

Credit

CFBI: https://github.com/z-x-yang/CFBI

Deeplab: https://github.com/VainF/DeepLabV3Plus-Pytorch

GCT: https://github.com/z-x-yang/GCT

Acknowledgement

First, the author would like to thank Rex for his insightful views on VOS during e-mail discussions. This work is also largely built upon the CFBI codebase. Thanks to the author of CFBI for releasing such a wonderful code repository for further work to build upon!

Related impressive works in VOS

AOT [NeurIPS 2021]: https://github.com/z-x-yang/AOT

STCN [NeurIPS 2021]: https://github.com/hkchengrex/STCN

MiVOS [CVPR 2021]: https://github.com/hkchengrex/MiVOS

SSTVOS [CVPR 2021]: https://github.com/dukebw/SSTVOS

GraphMemVOS [ECCV 2020]: https://github.com/carrierlxk/GraphMemVOS

CFBI [ECCV 2020]: https://github.com/z-x-yang/CFBI

STM [ICCV 2019]: https://github.com/seoungwugoh/STM

FEELVOS [CVPR 2019]: https://github.com/kim-younghan/FEELVOS

Useful websites for VOS

The 1st Large-scale Video Object Segmentation Challenge: https://competitions.codalab.org/competitions/19544#learn_the_details

The 2nd Large-scale Video Object Segmentation Challenge - Track 1: Video Object Segmentation: https://competitions.codalab.org/competitions/20127#learn_the_details

The Semi-Supervised DAVIS Challenge on Video Object Segmentation @ CVPR 2020: https://competitions.codalab.org/competitions/20516#participate-submit_results

DAVIS: https://davischallenge.org/

YouTube-VOS: https://youtube-vos.org/

Papers with code for Semi-VOS: https://paperswithcode.com/task/semi-supervised-video-object-segmentation

Comments and discussions are welcome!

Xiaohao Xu: [email protected]
