DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Overview

NVIDIA Source Code License Python 3.8

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

Paper | Project page | Demo (Youtube) | Demo (Bilibili)

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.
Shiyi Lan, Zhiding Yu, Chris Choy, Subhashree Radhakrishnan, Guilin Liu, Yuke Zhu, Larry Davis, Anima Anandkumar
International Conference on Computer Vision (ICCV) 2021

This repository contains the official Pytorch implementation of training & evaluation code and pretrained models for DiscoBox. DiscoBox is a state of the art framework that can jointly predict high quality instance segmentation and semantic correspondence from box annotations.

We use MMDetection v2.10.0 as the codebase.

All of our models are trained and tested using automatic mixed precision, which leverages float16 for speedup and less GPU memory consumption.

Installation

This implementation is based on PyTorch==1.9.0, mmcv==2.13.0, and mmdetection==2.10.0

Please refer to get_started.md for installation.

Or you can download the docker image from our dockerhub repository.

Models

Results on COCO val 2017

Backbone Weights AP [email protected] [email protected] [email protected] [email protected] [email protected]
ResNet-50 download 30.7 52.6 30.6 13.3 34.1 45.6
ResNet-101-DCN download 35.3 59.1 35.4 16.9 39.2 53.0
ResNeXt-101-DCN download 37.3 60.4 39.1 17.8 41.1 55.4

Results on COCO test-dev

We also evaluate the models in the section Results on COCO val 2017 with the same weights on COCO test-dev.

Backbone Weights AP [email protected] [email protected] [email protected] [email protected] [email protected]
ResNet-50 download 32.0 53.6 32.6 11.7 33.7 48.4
ResNet-101-DCN download 35.8 59.8 36.4 16.9 38.7 52.1
ResNeXt-101-DCN download 37.9 61.4 40.0 18.0 41.1 53.9

Training

COCO

ResNet-50 (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py 8

ResNet-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py 8

ResNeXt-101-DCN (8 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x.py 8

Pascal VOC 2012

ResNet-50 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_6x.py 4

ResNet-101 (4 GPUs):

bash tools/dist_train.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_6x.py 4

Testing

COCO

ResNet-50 (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r50_fpn_3x.py \
     work_dirs/coco_r50_fpn_3x.pth 8 --eval segm

ResNet-101-DCN (8 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_r101_dcn_fpn_3x.py \
     work_dirs/coco_r101_dcn_fpn_3x.pth 8 --eval segm

ResNeXt-101-DCN (GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_x101_dcn_fpn_3x_fp16.py \
     work_dirs/coco_x101_dcn_fpn_3x.pth 8 --eval segm

Pascal VOC 2012 (COCO API)

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 --eval segm

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 --eval segm

Pascal VOC 2012 (Matlab)

Step 1: generate results

ResNet-50 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r50_fpn_3x_fp16.py \
     work_dirs/voc_r50_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r50_results.json"

ResNet-101 (4 GPUs):

bash tools/dist_test.sh \
     configs/discobox/discobox_solov2_voc_r101_fpn_3x_fp16.py \
     work_dirs/voc_r101_6x.pth 4 \
     --format-only \
     --options "jsonfile_prefix=work_dirs/voc_r101_results.json"

Step 2: format conversion

ResNet-50:

python tools/json2mat.pywork_dirs/voc_r50_results.json work_dirs/voc_r50_results.mat

ResNet-101:

python tools/json2mat.pywork_dirs/voc_r101_results.json work_dirs/voc_r101_results.mat

Step 3: evaluation

Please visit BBTP for the evaluation code written in Matlab.

PF-Pascal

Please visit this repository.

LICENSE

Please check the LICENSE file. DiscoBox may be used non-commercially, meaning for research or evaluation purposes only. For business inquiries, please contact [email protected].

Citation

@article{lan2021discobox,
  title={DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision},
  author={Lan, Shiyi and Yu, Zhiding and Choy, Christopher and Radhakrishnan, Subhashree and Liu, Guilin and Zhu, Yuke and Davis, Larry S and Anandkumar, Anima},
  journal={arXiv preprint arXiv:2105.06464},
  year={2021}
}
Owner
Shiyi Lan
PhD Candidate. Research Interests: Object Detection, Instance segmentation, 3D Object Detection, 3D vehicle trajectory, Weakly/Semi-supervised learning
Shiyi Lan
Official code for the paper: Deep Graph Matching under Quadratic Constraint (CVPR 2021)

QC-DGM This is the official PyTorch implementation and models for our CVPR 2021 paper: Deep Graph Matching under Quadratic Constraint. It also contain

Quankai Gao 55 Nov 14, 2022
[ICLR 2022 Oral] F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization

F8Net Fixed-Point 8-bit Only Multiplication for Network Quantization (ICLR 2022 Oral) OpenReview | arXiv | PDF | Model Zoo | BibTex PyTorch implementa

Snap Research 76 Dec 13, 2022
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection This repository contains implementation of the

Visual Understanding Lab @ Samsung AI Center Moscow 190 Dec 30, 2022
Temporally Efficient Vision Transformer for Video Instance Segmentation, CVPR 2022, Oral

Temporally Efficient Vision Transformer for Video Instance Segmentation Temporally Efficient Vision Transformer for Video Instance Segmentation (CVPR

Hust Visual Learning Team 203 Dec 31, 2022
[KDD 2021, Research Track] DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neural Networks

DiffMG This repository contains the code for our KDD 2021 Research Track paper: DiffMG: Differentiable Meta Graph Search for Heterogeneous Graph Neura

AutoML Research 24 Nov 29, 2022
Code for the Paper "Diffusion Models for Handwriting Generation"

Code for the Paper "Diffusion Models for Handwriting Generation"

62 Dec 21, 2022
Kaggle Ultrasound Nerve Segmentation competition [Keras]

Ultrasound nerve segmentation using Keras (1.0.7) Kaggle Ultrasound Nerve Segmentation competition [Keras] #Install (Ubuntu {14,16}, GPU) cuDNN requir

179 Dec 28, 2022
Code for the paper "Benchmarking and Analyzing Point Cloud Classification under Corruptions"

ModelNet-C Code for the paper "Benchmarking and Analyzing Point Cloud Classification under Corruptions". For the latest updates, see: sites.google.com

Jiawei Ren 45 Dec 28, 2022
VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning

VisualGPT Our Paper VisualGPT: Data-efficient Adaptation of Pretrained Language Models for Image Captioning Main Architecture of Our VisualGPT Downloa

Vision CAIR Research Group, KAUST 140 Dec 28, 2022
This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language Models"

GreaseLM: Graph REASoning Enhanced Language Models This repo provides the source code & data of our paper "GreaseLM: Graph REASoning Enhanced Language

137 Jan 02, 2023
Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Training and Effective Adaptation", Haoxiang Wang, Han Zhao, Bo Li.

Bridging Multi-Task Learning and Meta-Learning Code for the ICML 2021 paper "Bridging Multi-Task Learning and Meta-Learning: Towards Efficient Trainin

AI Secure 57 Dec 15, 2022
This repository contains the reference implementation for our proposed Convolutional CRFs.

ConvCRF This repository contains the reference implementation for our proposed Convolutional CRFs in PyTorch (Tensorflow planned). The two main entry-

Marvin Teichmann 553 Dec 07, 2022
Prometheus Exporter for data scraped from datenplattform.darmstadt.de

darmstadt-opendata-exporter Scrapes data from https://datenplattform.darmstadt.de and presents it in the Prometheus Exposition format. Pull requests w

Martin Weinelt 2 Apr 12, 2022
tmm_fast is a lightweight package to speed up optical planar multilayer thin-film device computation.

tmm_fast tmm_fast or transfer-matrix-method_fast is a lightweight package to speed up optical planar multilayer thin-film device computation. It is es

26 Dec 11, 2022
AdaDM: Enabling Normalization for Image Super-Resolution

AdaDM AdaDM: Enabling Normalization for Image Super-Resolution. You can apply BN, LN or GN in SR networks with our AdaDM. Pretrained models (EDSR*/RDN

58 Jan 08, 2023
Using pretrained language models for biomedical knowledge graph completion.

LMs for biomedical KG completion This repository contains code to run the experiments described in: Scientific Language Models for Biomedical Knowledg

Rahul Nadkarni 41 Nov 30, 2022
The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021].

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The offical PyTorch implementation of Neural View Sy

Angtian Wang 20 Oct 09, 2022
TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning

TaCL: Improving BERT Pre-training with Token-aware Contrastive Learning Authors: Yixuan Su, Fangyu Liu, Zaiqiao Meng, Lei Shu, Ehsan Shareghi, and Nig

Yixuan Su 79 Nov 04, 2022
An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available actions

Agar.io_Q-Learning_AI An experiment on the performance of homemade Q-learning AIs in Agar.io depending on their state representation and available act

1 Jun 09, 2022
Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks

Adversarially-Robust-Periphery Code + Data from the paper "Finding Biological Plausibility for Adversarially Robust Features via Metameric Tasks" by A

Anne Harrington 2 Feb 07, 2022