[CVPR'21] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Overview

IVOS-W

Paper

Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

Zhaoyuan Yin, Jia Zheng, Weixin Luo, Shenhan Qian, Hanling Zhang, Shenghua Gao.

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021.

[Preprint] [Supplementary Material] [Poster]

Getting Started

Create the environment

# create conda env
conda create -n ivosw python=3.7
# activate conda env
conda activate ivosw
# install pytorch
conda install pytorch=1.3 torchvision
# install other dependencies
pip install -r requirements.txt

We adopt MANet, IPN, and ATNet as the VOS algorithms. Clone each repository into VOS/ and follow its own instructions to install its dependencies.

git clone https://github.com/yuk6heo/IVOS-ATNet.git VOS/ATNet
git clone https://github.com/lightas/CVPR2020_MANet.git VOS/MANet
git clone https://github.com/zyy-cn/IPN.git VOS/IPN
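
Before moving on, it can help to confirm that all three clones landed where the later scripts presumably expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_vos_checkouts` is hypothetical, and the only thing taken from this README is the three target directories of the clone commands above.

```python
from pathlib import Path

# Target directories copied from the git clone commands above.
EXPECTED_VOS_DIRS = ("VOS/ATNet", "VOS/MANet", "VOS/IPN")


def missing_vos_checkouts(root="."):
    """Return the expected VOS directories that are absent or empty under root."""
    missing = []
    for rel in EXPECTED_VOS_DIRS:
        path = Path(root) / rel
        # An interrupted clone can leave no directory at all, or an empty one.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    missing = missing_vos_checkouts()
    if missing:
        print("Missing VOS checkouts:", ", ".join(missing))
    else:
        print("All VOS checkouts found.")
```

Run it from the repository root; an empty result means every clone target exists and contains files. It does not check that each model's own dependencies are installed.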

Dataset Preparation

  • DAVIS 2017 Dataset
    • Download the data and human-annotated scribbles here.
    • Place the DAVIS folder in root/data.
  • YouTube-VOS Dataset
    • Download the YouTube-VOS 2018 version here.
    • Clean up the annotations following here.
    • Download our annotated scribbles here.

Create a DAVIS-like structure of YouTube-VOS by running the following command:

python datasets/prepare_ytbvos.py --src path/to/youtube_vos --scb path/to/scribble_dir

Evaluation

For evaluation, please download the pretrained agent model and quality assessment model, place them into root/weights, and run the following command, choosing one value from each pair of braces:

python eval_agent_{atnet/manet/ipn}.py with setting={oracle/wild} dataset={davis/ytbvos} method={random/linspace/worst/ours}
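
The braces in the template above stand for alternatives, so one concrete invocation picks one VOS model, one setting, one dataset, and one method. The sketch below expands the template into concrete command strings. The function names `eval_command` and `all_eval_commands` are hypothetical helpers, not part of the repository; the option values are copied from the template.

```python
from itertools import product

# Option values copied from the command template above.
VOS_MODELS = ("atnet", "manet", "ipn")
SETTINGS = ("oracle", "wild")
DATASETS = ("davis", "ytbvos")
METHODS = ("random", "linspace", "worst", "ours")


def eval_command(vos, setting, dataset, method):
    """Build one concrete evaluation command from the template."""
    checks = ((vos, VOS_MODELS), (setting, SETTINGS),
              (dataset, DATASETS), (method, METHODS))
    for value, allowed in checks:
        if value not in allowed:
            raise ValueError(f"{value!r} is not one of {allowed}")
    return (f"python eval_agent_{vos}.py with "
            f"setting={setting} dataset={dataset} method={method}")


def all_eval_commands():
    """List every combination of model, setting, dataset, and method."""
    return [eval_command(*combo)
            for combo in product(VOS_MODELS, SETTINGS, DATASETS, METHODS)]
```

For example, `eval_command("atnet", "wild", "davis", "ours")` yields `python eval_agent_atnet.py with setting=wild dataset=davis method=ours`, and `all_eval_commands()` lists all 48 combinations for a full sweep.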

The results will be stored in results/{VOS}/{setting}/{dataset}/{method}/summary.json
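
To read a result back, the path pattern above can be filled in programmatically. This is a minimal sketch under two assumptions: that the `{VOS}` component uses the same lowercase names as the script suffixes (for example `atnet`), and that `summary.json` is ordinary JSON. The README does not describe the file's keys, so the sketch returns whatever the file contains. `summary_path` and `load_summary` are hypothetical helper names.

```python
import json
from pathlib import Path


def summary_path(vos, setting, dataset, method, root="results"):
    """Fill in results/{VOS}/{setting}/{dataset}/{method}/summary.json."""
    return Path(root) / vos / setting / dataset / method / "summary.json"


def load_summary(vos, setting, dataset, method, root="results"):
    """Load one evaluation summary; its structure is not assumed here."""
    path = summary_path(vos, setting, dataset, method, root=root)
    with path.open() as f:
        return json.load(f)


if __name__ == "__main__":
    # Assumption: the {VOS} directory is named like the script suffix.
    path = summary_path("atnet", "wild", "davis", "ours")
    if path.is_file():
        print(load_summary("atnet", "wild", "davis", "ours"))
    else:
        print("No summary found at", path)
```

If the directory names on disk differ in case or spelling, adjust the `vos` argument to match.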

Note: The results may fluctuate slightly with different versions of networkx, which is used by davisinteractive to generate simulated scribbles.

Training

First, prepare the data used to train the agent. Either download the reward records and the pretrained experience buffer and place them into root/train, or generate them from scratch:

python produce_reward.py
python pretrain_agent.py

To train the agent:

python train_agent.py

To train the segmentation quality assessment model:

python generate_data.py
python quality_assessment.py
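
The training steps above form two sequences: one for the agent and one for the quality assessment model. The sketch below runs each sequence in order and stops at the first failing script. It is a hypothetical convenience wrapper, not part of the repository: only the script names and their order come from this README, and it assumes the scripts need no extra arguments and are run from the repository root.

```python
import subprocess
import sys

# Script names and their order are taken from the commands above.
AGENT_PIPELINE = ("produce_reward.py", "pretrain_agent.py", "train_agent.py")
QUALITY_PIPELINE = ("generate_data.py", "quality_assessment.py")


def run_pipeline(scripts, dry_run=False):
    """Run scripts in order with the current interpreter; stop on failure."""
    commands = [[sys.executable, script] for script in scripts]
    for command in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(command))
        else:
            # check=True raises CalledProcessError on a non-zero exit code.
            subprocess.run(command, check=True)
    return commands


if __name__ == "__main__":
    run_pipeline(AGENT_PIPELINE, dry_run=True)
    run_pipeline(QUALITY_PIPELINE, dry_run=True)
```

If the reward records and experience buffer were downloaded rather than generated, pass only `("train_agent.py",)` for the agent sequence.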

Citation

@inproceedings{IVOSW,
  title     = {Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild},
  author    = {Zhaoyuan Yin and
               Jia Zheng and
               Weixin Luo and
               Shenhan Qian and
               Hanling Zhang and
               Shenghua Gao},
  booktitle = {CVPR},
  year      = {2021}
}

LICENSE

The code is released under the MIT license.

Owner
SVIP Lab
ShanghaiTech Vision and Intelligent Perception Lab