[NeurIPS2021] Code Release of K-Net: Towards Unified Image Segmentation

Last update: Jan 02, 2023

Overview

K-Net: Towards Unified Image Segmentation

Introduction

This is the official code release for the paper K-Net: Towards Unified Image Segmentation. K-Net will also be integrated into future releases of MMDetection and MMSegmentation.

K-Net: Towards Unified Image Segmentation,
Wenwei Zhang, Jiangmiao Pang, Kai Chen, Chen Change Loy
In: Proc. Advances in Neural Information Processing Systems (NeurIPS), 2021
[arXiv] [project page] [BibTeX]

Results

The results of K-Net and the corresponding configs for each segmentation task are shown below. We have released the full model zoo for panoptic segmentation. The complete model checkpoints and logs for instance and semantic segmentation will be released soon.

Semantic Segmentation on ADE20K

Backbone	Method	Crop Size	Lr Schd	mIoU	Config	Download
R-50	K-Net + FCN	512x512	80K	43.3	config	model | log
R-50	K-Net + PSPNet	512x512	80K	43.9	config	model | log
R-50	K-Net + DeepLabv3	512x512	80K	44.6	config	model | log
R-50	K-Net + UPerNet	512x512	80K	43.6	config	model | log
Swin-T	K-Net + UPerNet	512x512	80K	45.4	config	model | log
Swin-L	K-Net + UPerNet	512x512	80K	52.0	config	model | log
Swin-L	K-Net + UPerNet	640x640	80K	52.7	config	model | log

Instance Segmentation on COCO

Backbone	Method	Lr Schd	Mask mAP	Config	Download
R-50	K-Net	1x	34.0	config	model | log
R-50	K-Net	ms-3x	37.8	config	model | log
R-101	K-Net	ms-3x	39.2	config	model | log
R-101-DCN	K-Net	ms-3x	40.5	config	model | log

Panoptic Segmentation on COCO

Backbone	Method	Lr Schd	PQ	Config	Download
R-50	K-Net	1x	44.3	config	model | log
R-50	K-Net	ms-3x	47.1	config	model | log
R-101	K-Net	ms-3x	48.4	config	model | log
R-101-DCN	K-Net	ms-3x	49.6	config	model | log
Swin-L (window size 7)	K-Net	ms-3x	54.6	config	model | log
Above on test-dev			55.2

Installation

K-Net requires the following packages; the first four are OpenMMLab packages:

MIM >= 0.1.5
MMCV-full >= v1.3.14
MMDetection >= v2.17.0
MMSegmentation >= v0.18.0
scipy
panopticapi

pip install openmim scipy mmdet mmsegmentation
pip install git+https://github.com/cocodataset/panopticapi.git
mim install mmcv-full
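
Before training, it can help to confirm that the installed packages meet the minimum versions listed above. The sketch below is not part of the K-Net release. It assumes the PyPI distribution names used in the install commands (openmim, mmcv-full, mmdet, mmsegmentation), and the helper names parse_version and check_requirements are illustrative. It compares only the leading numeric components of each version and checks lower bounds only, since no upper bounds are stated here.

```python
from importlib import metadata

# Minimum versions copied from the requirement list above.
# Keys are assumed to be the PyPI distribution names used by pip/mim.
MIN_VERSIONS = {
    "openmim": "0.1.5",
    "mmcv-full": "1.3.14",
    "mmdet": "2.17.0",
    "mmsegmentation": "0.18.0",
}


def parse_version(text):
    """Turn '1.3.14' or 'v2.17.0' into a tuple of ints, padded to 3 parts.

    Anything after the leading digits of a component (such as 'rc1')
    is ignored, so this is only a rough comparison.
    """
    parts = []
    for piece in text.lstrip("v").split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def check_requirements(min_versions=MIN_VERSIONS, get_version=metadata.version):
    """Return {package: problem} for packages that are missing or too old."""
    problems = {}
    for name, minimum in min_versions.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if parse_version(installed) < parse_version(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems


if __name__ == "__main__":
    for name, issue in check_requirements().items():
        print(f"{name}: {issue}")
```

An empty result means every listed package is installed at or above its minimum version. scipy and panopticapi have no minimum version in the list, so they are not checked here.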

License

This project is released under the Apache 2.0 license.

Usage

Data preparation

Prepare the data following the instructions of MMDetection and MMSegmentation. The data structure should look like this:

data/
├── ade
│   ├── ADEChallengeData2016
│   │   ├── annotations
│   │   ├── images
├── coco
│   ├── annotations
│   │   ├── panoptic_{train,val}2017.json
│   │   ├── instances_{train,val}2017.json
│   │   ├── panoptic_{train,val}2017/  # panoptic png annotations
│   │   ├── image_info_test-dev2017.json  # for test-dev submissions
│   ├── train2017
│   ├── val2017
│   ├── test2017
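
A quick way to catch a misplaced dataset before launching a job is to check the tree above programmatically. The following is a minimal sketch, not a tool shipped with K-Net. The function name find_missing and the default root "data" are assumptions for illustration. The instance annotation entries use a glob pattern so that the file-name prefix is matched loosely.

```python
from pathlib import Path

# Entries mirror the directory tree above, relative to the data root.
# The brace notation {train,val} is expanded by hand.
EXPECTED_PATTERNS = [
    "ade/ADEChallengeData2016/annotations",
    "ade/ADEChallengeData2016/images",
    "coco/annotations/panoptic_train2017.json",
    "coco/annotations/panoptic_val2017.json",
    # Glob pattern: matches any file name starting with 'instance'.
    "coco/annotations/instance*_train2017.json",
    "coco/annotations/instance*_val2017.json",
    "coco/annotations/panoptic_train2017",  # panoptic png annotations
    "coco/annotations/panoptic_val2017",
    "coco/annotations/image_info_test-dev2017.json",  # test-dev only
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
]


def find_missing(data_root="data"):
    """Return the expected entries that do not exist under data_root."""
    root = Path(data_root)
    return [p for p in EXPECTED_PATTERNS if not any(root.glob(p))]


if __name__ == "__main__":
    for entry in find_missing():
        print(f"missing: {entry}")
```

Entries such as the test-dev image info file and test2017 matter only for test-dev submissions, and the ade entries only for semantic segmentation, so a reported miss is not always a problem.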

Training and testing

For training and testing, you can use mim directly. The scripts below launch the mim jobs through slurm:

# train instance/panoptic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmdet $CONFIG $WORK_DIR

# test instance segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval segm

# test panoptic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT --eval pq

# train semantic segmentation models
sh ./tools/mim_slurm_train.sh $PARTITION mmseg $CONFIG $WORK_DIR

# test semantic segmentation models
sh ./tools/mim_slurm_test.sh $PARTITION mmseg $CONFIG $CHECKPOINT --eval mIoU

To generate a test-dev submission for panoptic segmentation, use the commands below:

# we should update the category information in the original image test-dev pkl file
# for panoptic segmentation
python -u tools/gen_panoptic_test_info.py
# run test-dev submission
sh ./tools/mim_slurm_test.sh $PARTITION mmdet $CONFIG $CHECKPOINT  --format-only --cfg-options data.test.ann_file=data/coco/annotations/panoptic_image_info_test-dev2017.json data.test.img_prefix=data/coco/test2017 --eval-options jsonfile_prefix=$WORK_DIR

You can also run training and testing without slurm by using mim directly for instance, semantic, or panoptic segmentation, as shown below:

PYTHONPATH='.':$PYTHONPATH mim train mmdet $CONFIG $WORK_DIR
PYTHONPATH='.':$PYTHONPATH mim train mmseg $CONFIG $WORK_DIR

PARTITION: the slurm partition you are using
CHECKPOINT: the path of the checkpoint downloaded from our model zoo or trained by yourself
WORK_DIR: the working directory to save configs, logs, and checkpoints
CONFIG: the path of a config file under the directory configs/
JOB_NAME: the name of the job, which is required by slurm
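
The slurm commands above differ only in the codebase (mmdet or mmseg) and the metric passed to --eval. The sketch below assembles them from a task name, which can be handy when scripting several runs. It is an illustration rather than part of the release: the function names, the TASKS table, and the example partition, config, and checkpoint paths are all hypothetical placeholders. The script paths and flags are the ones shown above.

```python
import shlex

# Task -> (codebase, --eval metric), taken from the commands above.
TASKS = {
    "instance": ("mmdet", "segm"),
    "panoptic": ("mmdet", "pq"),
    "semantic": ("mmseg", "mIoU"),
}


def build_train_command(task, partition, config, work_dir):
    """Argument list for the slurm training script of the given task."""
    codebase, _ = TASKS[task]
    return ["sh", "./tools/mim_slurm_train.sh", partition, codebase, config, work_dir]


def build_test_command(task, partition, config, checkpoint):
    """Argument list for the slurm testing script of the given task."""
    codebase, metric = TASKS[task]
    return [
        "sh", "./tools/mim_slurm_test.sh",
        partition, codebase, config, checkpoint,
        "--eval", metric,
    ]


# Placeholder values; replace them with your own partition and paths.
command = build_test_command(
    "panoptic", "my_partition", "configs/example.py", "checkpoints/example.pth"
)
print(shlex.join(command))
# prints: sh ./tools/mim_slurm_test.sh my_partition mmdet configs/example.py checkpoints/example.pth --eval pq
```

The returned list can be passed to subprocess.run from the repository root, or printed with shlex.join as shown to copy into a shell.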

Citation

@inproceedings{zhang2021knet,
    title={{K-Net: Towards} Unified Image Segmentation},
    author={Wenwei Zhang and Jiangmiao Pang and Kai Chen and Chen Change Loy},
    year={2021},
    booktitle={NeurIPS},
}

