A Survey on Deep Learning Technique for Video Segmentation
Wenguan Wang, Tianfei Zhou, Fatih Porikli, David Crandall, and Luc Van Gool.

Contributing

Please feel free to create issues or pull requests to add papers.

Welcome any discussions on video segmentation at

1. Introduction

Video segmentation, i.e., partitioning video frames into multiple segments or objects, plays a critical role in a broad range of practical applications, from enhancing visual effects in movies, to understanding scenes in autonomous driving, to creating virtual backgrounds in video conferencing. In this survey, we comprehensively review two basic lines of research, video object segmentation and video semantic segmentation, by introducing their respective task settings, background concepts, perceived need, development history, and main challenges. In particular, we review eight sub-fields as given in the following figure:

2. Deep Learning-based Video Object Segmentation

3. Deep Learning-based Video Semantic Segmentation

4. Datasets

Popular Datasets in VOS and VSS

Citation

If you find our survey and repository useful for your research, please consider citing our paper:

@article{wang2021survey,
  title={A survey on deep learning technique for video segmentation},
  author={Wang, Wenguan and Zhou, Tianfei and Porikli, Fatih and Crandall, David and Van Gool, Luc},
  journal={arXiv preprint arXiv:2107.01153},
  year={2021}
}


