Zsseg.baseline - Zero-Shot Semantic Segmentation

Last update: Dec 20, 2022

Overview

This repo is for our paper A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model. It is based on the official repo of MaskFormer.

@article{xu2021ss,
  title={End-to-End Semi-Supervised Object Detection with Soft Teacher},
  author={Xu, Mengde and Zhang, Zheng and Hu, Han and Wang, Jianfeng and Wang, Lijuan and Wei, Fangyun and Bai, Xiang and Liu, Zicheng},
  journal={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021}
}

Guideline

Environment

torch==1.8.0
torchvision==0.9.0
detectron2==0.5  # Follow https://detectron2.readthedocs.io/en/latest/tutorials/install.html to install it and the required packages
mmcv==1.3.14
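
To confirm that an existing environment matches the pins above, a small check like the following may help. This is only a sketch: the function names are hypothetical, and the distribution names are assumptions (mmcv, for instance, is sometimes installed as "mmcv-full").

```python
from importlib import metadata

# Pins copied from the list above. The distribution names are assumptions;
# mmcv may be installed under the name "mmcv-full" on some setups.
PINS = {
    "torch": "1.8.0",
    "torchvision": "0.9.0",
    "detectron2": "0.5",
    "mmcv": "1.3.14",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Map each mismatching package name to (expected, found)."""
    problems = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore local build tags such as "1.8.0+cu111".
        base = found.split("+")[0] if found else None
        if base != expected:
            problems[name] = (expected, found)
    return problems
```

Calling check_pins(PINS) returns an empty dict when every pinned package is present at the listed version.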

Furthermore, install the modified CLIP package.

cd third_party/CLIP
python -m pip install -Ue .

Data Preparation

In our experiments, four datasets are used. For Cityscapes and ADE20k, follow the tutorial in MaskFormer.

For COCO Stuff 164k:

Download the data from the official dataset website and extract it as shown below.

datasets/
     coco/
          #http://images.cocodataset.org/zips/train2017.zip
          train2017/ 
          #http://images.cocodataset.org/zips/val2017.zip
          val2017/   
          #http://images.cocodataset.org/annotations/annotations_trainval2017.zip
          annotations/ 
          #http://images.cocodataset.org/annotations/stuff_annotations_trainval2017.zip
          stuffthingmaps/
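
Before running the preparation scripts, it can be worth checking that the extracted folders match the layout above. The helper below is a minimal sketch with a hypothetical name (missing_dirs); it only tests that the sub-directories exist, not that their contents are complete.

```python
from pathlib import Path

# Sub-directory names taken from the layout above.
COCO_DIRS = ["train2017", "val2017", "annotations", "stuffthingmaps"]


def missing_dirs(root, expected):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).is_dir()]
```

For example, missing_dirs("datasets/coco", COCO_DIRS) returns an empty list when all four folders are in place.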

Format the data in detectron2 style and split it into Seen (Base) and Unseen (Novel) subsets.

python datasets/prepare_coco_stuff_164k_sem_seg.py datasets/coco

python tools/mask_cls_collect.py datasets/coco/stuffthingmaps_detectron2/train2017_base datasets/coco/stuffthingmaps_detectron2/train2017_base_label_count.pkl

python tools/mask_cls_collect.py datasets/coco/stuffthingmaps_detectron2/val2017 datasets/coco/stuffthingmaps_detectron2/val2017_label_count.pkl
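
The internals of the preparation scripts are not described here. As a rough picture of what a Seen/Unseen split of label maps can involve, the sketch below replaces every unseen-class pixel with an ignore value so that a base training set only carries seen-class labels. It is not the repo's implementation: the ignore value 255, the nested-list mask format, and both function names are assumptions made for illustration.

```python
IGNORE_LABEL = 255  # assumed "ignore" value, common in segmentation code


def keep_seen_only(mask, unseen_ids, ignore_label=IGNORE_LABEL):
    """Return a copy of mask with unseen-class pixels set to ignore_label."""
    unseen = set(unseen_ids)
    return [
        [ignore_label if label in unseen else label for label in row]
        for row in mask
    ]


def has_seen_pixels(mask, ignore_label=IGNORE_LABEL):
    """True if at least one pixel still carries a seen-class label."""
    return any(label != ignore_label for row in mask for label in row)
```

With a toy mask [[0, 1, 2], [2, 2, 1]] and class 2 treated as unseen, keep_seen_only returns [[0, 1, 255], [255, 255, 1]]; an image whose mask ends up entirely ignored could then be left out of the base subset.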

For Pascal VOC 11k:

Download the data from the official dataset website and extract it as shown below.

datasets/
   VOC2012/
        #http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
        JPEGImages/
        val.txt
        #http://home.bharathh.info/pubs/codes/SBD/download.html
        SegmentationClassAug/
        #https://gist.githubusercontent.com/sun11/2dbda6b31acc7c6292d14a872d0c90b7/raw/5f5a5270089239ef2f6b65b1cc55208355b5acca/trainaug.txt
        train.txt
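
Because the VOC images, augmented annotations, and split lists come from three different sources, a quick consistency check can catch missing files early. The sketch below assumes each split file lists one image per line and that only the first whitespace-separated token matters; the function names and the file suffixes passed in are assumptions as well.

```python
from pathlib import Path


def read_ids(split_file):
    """Read image IDs from a split file.

    Assumes one entry per line. Only the first whitespace-separated token
    is used, reduced to its file stem, so "JPEGImages/2007_000032.jpg"
    becomes "2007_000032".
    """
    ids = []
    for line in Path(split_file).read_text().splitlines():
        tokens = line.split()
        if tokens:
            ids.append(Path(tokens[0]).stem)
    return ids


def missing_files(root, split_name, folder, suffix):
    """List IDs from root/split_name with no matching file in root/folder."""
    root = Path(root)
    return [
        image_id
        for image_id in read_ids(root / split_name)
        if not (root / folder / f"{image_id}{suffix}").is_file()
    ]
```

For instance, missing_files("datasets/VOC2012", "train.txt", "SegmentationClassAug", ".png") would list the training IDs that have no annotation file, assuming the annotations are stored as PNG files.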

Format the data in detectron2 style and split it into Seen (Base) and Unseen (Novel) subsets.

python datasets/prepare_voc_sem_seg.py datasets/VOC2012

python tools/mask_cls_collect.py datasets/VOC2012/annotations_detectron2/train datasets/VOC2012/annotations_detectron2/train_base_label_count.json

python tools/mask_cls_collect.py datasets/VOC2012/annotations_detectron2/val datasets/VOC2012/annotations_detectron2/val_label_count.json

Training and Evaluation

Before training and evaluation, see the detectron2 tutorial. For example, to train a zero-shot semantic segmentation model on COCO Stuff:

Training with manually designed prompts:

python train_net.py --config-file configs/coco-stuff-164k-156/zero_shot_maskformer_R101c_single_prompt_bs32_60k.yaml

Training with learned prompts:

# Training prompts
python train_net.py --config-file configs/coco-stuff-164k-156/zero_shot_proposal_classification_learn_prompt_bs32_10k.yaml --num-gpus 8 
# Training seg model
python train_net.py --config-file configs/coco-stuff-164k-156/zero_shot_maskformer_R101c_bs32_60k.yaml --num-gpus 8 MODEL.CLIP_ADAPTER.PROMPT_CHECKPOINT ${TRAINED_PROMPTS}

Note: prompt training is sensitive to the random seed, so it is better to run it multiple times.
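
One way to organize several prompt-training runs is to generate one command per seed. The sketch below only builds the argument lists. SEED and OUTPUT_DIR are standard detectron2 config keys, but whether this repo's configs honor them is an assumption, as are the helper name prompt_runs and the output directory pattern.

```python
PROMPT_CONFIG = (
    "configs/coco-stuff-164k-156/"
    "zero_shot_proposal_classification_learn_prompt_bs32_10k.yaml"
)


def prompt_runs(seeds, num_gpus=8, out_root="output/prompt_seed"):
    """Build one prompt-training command (as an argument list) per seed.

    SEED and OUTPUT_DIR are passed as trailing KEY VALUE overrides, the
    same way PROMPT_CHECKPOINT is passed in the command above. The
    out_root path is a hypothetical choice.
    """
    commands = []
    for seed in seeds:
        commands.append([
            "python", "train_net.py",
            "--config-file", PROMPT_CONFIG,
            "--num-gpus", str(num_gpus),
            "SEED", str(seed),
            "OUTPUT_DIR", f"{out_root}_{seed}",
        ])
    return commands
```

Each list returned by prompt_runs([1, 2, 3]) can be passed to subprocess.run(cmd, check=True), after which the resulting prompt checkpoints can be compared.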

For evaluation, add the --eval-only flag to the training command.
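
When the training command ends with KEY VALUE config overrides, the position of the flag can matter: detectron2's default argument parser collects everything after the first override as config options. Assuming train_net.py uses that parser, the sketch below inserts --eval-only directly after the script name; the helper name with_eval_only is hypothetical.

```python
def with_eval_only(args):
    """Return a copy of an argument list with --eval-only added.

    The flag is placed right after the script name so that it stays ahead
    of any trailing KEY VALUE config overrides.
    """
    args = list(args)
    if "--eval-only" in args:
        return args
    # Locate the training script (first argument ending in ".py").
    script_index = next(i for i, arg in enumerate(args) if arg.endswith(".py"))
    args.insert(script_index + 1, "--eval-only")
    return args
```

For example, with_eval_only(["python", "train_net.py", "--config-file", "cfg.yaml"]) yields ["python", "train_net.py", "--eval-only", "--config-file", "cfg.yaml"], where cfg.yaml stands in for one of the configs above.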

Trained Model

😄 Coming soon.
