Semi-supervised semantic segmentation needs strong, varied perturbations

Overview

Semi-supervised semantic segmentation using CutMix and Colour Augmentation

Implementations of our papers:

Licensed under MIT license.

Colour augmentation

Please see our new paper for a full discussion, but a summary of our findings can be found in our [colour augmentation](Colour augmentation.ipynb) Jupyter notebook.

Requirements

We provide an environment.yml file that can be used to re-create a conda environment that provides the required packages:

conda env create -f environment.yml

Then activate with:

conda activate cutmix_semisup_seg

(note: this will not install the library needed to use the PSPNet architecture; see below)

In general we need:

  • Python >= 3.6
  • PyTorch >= 1.4
  • torchvision 0.5
  • OpenCV
  • Pillow
  • Scikit-image
  • Scikit-learn
  • click
  • tqdm
  • Jupyter notebook for the notebooks
  • numpy 1.18

Requirements for PSPNet

To use the PSPNet architecture (see Pyramid Scene Parsing Network by Zhao et al.), you will need to install the logits-from_models branch of https://github.com/Britefury/semantic-segmentation-pytorch:

pip install git+https://github.com/Britefury/[email protected]

Datasets

You need to:

  1. Download/acquire the datsets
  2. Write the config file semantic_segmentation.cfg giving their paths
  3. Convert them if necessary; the CamVid, Cityscapes and ISIC 2017 datasets must be converted to a ZIP-based format prior to use. You must run the provided conversion utilities to create these ZIP files.

Dataset preparation instructions can be found here.

Running the experiments

We provide four programs for running experiments:

  • train_seg_semisup_mask_mt.py: mask driven consistency loss (the main experiment)
  • train_seg_semisup_aug_mt.py: augmentation driven consistency loss; used to attempt to replicate the ISIC 2017 baselines of Li et al.
  • train_seg_semisup_ict.py: Interpolation Consistency Training; a baseline for contrast with our main approach
  • train_seg_semisup_vat_mt.py: Virtual Adversarial Training adapted for semantic segmentation

They can be configured via command line arguments that are described here.

Shell scripts

To replicate our results, we provide shell scripts to run our experiments.

Cityscapes
> sh run_cityscapes_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_cityscapes_experiments.sh for further explanation.

To re-create the 5 runs we used for our experiments:

> sh run_cityscapes_experiments.sh 01 12345
> sh run_cityscapes_experiments.sh 02 23456
> sh run_cityscapes_experiments.sh 03 34567
> sh run_cityscapes_experiments.sh 04 45678
> sh run_cityscapes_experiments.sh 05 56789
Pascal VOC 2012 (augmented)
> sh run_pascal_aug_experiments.sh <n_supervised> <n_supervised_txt>

where <n_supervised> is the number of supervised samples and <n_supervised_txt> is that number as text. Please see the comments at the top of run_pascal_aug_experiments.sh for further explanation.

We use the same data split as Mittal et al. It is stored in data/splits/pascal_aug/split_0.pkl that is included in the repo.

Pascal VOC 2012 (augmented) with DeepLab v3+
> sh run_pascal_aug_deeplab3plus_experiments.sh <n_supervised> <n_supervised_txt>
ISIC 2017 Segmentation
> sh run_isic2017_experiments.sh <run> <split_rng_seed>

where <run> is the name of the run and <split_rng_seed> is an integer RNG seed used to select the supervised samples. Please see the comments at the top of run_isic2017_experiments.sh for further explanation.

To re-create the 5 runs we used for our experiments:

> sh run_isic2017_experiments.sh 01 12345
> sh run_isic2017_experiments.sh 02 23456
> sh run_isic2017_experiments.sh 07 78901
> sh run_isic2017_experiments.sh 08 89012
> sh run_isic2017_experiments.sh 09 90123

In early experiments, we test 10 seeds and selected the middle 5 when ranked in terms of performance, hence the specific seed choice.

Exploring the input data distribution present in semantic segmentation problems

Cluster assumption

First we examine the input data distribution presented by semantic segmentation problems with a view to determining if the low density separation assumption holds, in the notebook Semantic segmentation input data distribution.ipynb This notebook also contains the code used to generate the images from Figure 1 in the paper.

Inter-class and intra-class variance

Secondly we examine the inter-class and intra-class distance (as a proxy for inter-class and intra-class variance) in the notebook Plot inter-class and intra-class distances from files.ipynb

Note that running the second notebook requires that you generate some data files using the intra_inter_class_patch_dist.py program.

Toy 2D experiments

The toy 2D experiments used to produce Figure 3 in the paper can be run using the toy2d_train.py program, which is documented here.

You can re-create the toy 2D experiments by running the run_toy2d_experiments.sh shell script:

> sh run_toy2d_experiments.sh <run>
Deep Learning and Reinforcement Learning Library for Scientists and Engineers 🔥

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extens

TensorLayer Community 7.1k Dec 27, 2022
A Python library for working with arbitrary-dimension hypercomplex numbers following the Cayley-Dickson construction of algebras.

Hypercomplex A Python library for working with quaternions, octonions, sedenions, and beyond following the Cayley-Dickson construction of hypercomplex

7 Nov 04, 2022
Visual Memorability for Robotic Interestingness via Unsupervised Online Learning (ECCV 2020 Oral and TRO)

Visual Interestingness Refer to the project description for more details. This code based on the following paper. Chen Wang, Yuheng Qiu, Wenshan Wang,

Chen Wang 36 Sep 08, 2022
Pytorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Value Retrieval with Arbitrary Queries for Form-like Documents Introduction Pytorch Implementation of Value Retrieval with Arbitrary Queries for Form-

Salesforce 13 Sep 15, 2022
DrQ-v2: Improved Data-Augmented Reinforcement Learning

DrQ-v2: Improved Data-Augmented RL Agent Method DrQ-v2 is a model-free off-policy algorithm for image-based continuous control. DrQ-v2 builds on DrQ,

Facebook Research 234 Jan 01, 2023
PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision.

PyTorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{CV2018, author = {Donny You ( Donny You 40 Sep 14, 2022

School of Artificial Intelligence at the Nanjing University (NJU)School of Artificial Intelligence at the Nanjing University (NJU)

F-Principle This is an exercise problem of the digital signal processing (DSP) course at School of Artificial Intelligence at the Nanjing University (

Thyrix 5 Nov 23, 2022
Supplementary code for TISMIR paper "Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form"

Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form This is supplementary code for the TISMIR paper Sliding-Window Pitch-Class H

1 Nov 27, 2021
This repo provides function call to track multi-objects in videos

Custom Object Tracking Introduction This repo provides function call to track multi-objects in videos with a given trained object detection model and

Jeff Lo 51 Nov 22, 2022
DC540 hacking challenge 0x00005a.

dc540-0x00005a DC540 hacking challenge 0x00005a. PROMOTIONAL VIDEO - WATCH NOW HERE ON YOUTUBE CRITICAL PART 5A VIDEO - WATCH NOW HERE ON YOUTUBE Prio

Kevin Thomas 3 May 09, 2022
A PyTorch implementation of Radio Transformer Networks from the paper "An Introduction to Deep Learning for the Physical Layer".

An Introduction to Deep Learning for the Physical Layer An usable PyTorch implementation of the noisy autoencoder infrastructure in the paper "An Intr

Gram.AI 120 Nov 21, 2022
Video Representation Learning by Recognizing Temporal Transformations. In ECCV, 2020.

Video Representation Learning by Recognizing Temporal Transformations [Project Page] Simon Jenni, Givi Meishvili, and Paolo Favaro. In ECCV, 2020. Thi

Simon Jenni 46 Nov 14, 2022
Project for tracking occupancy in Tel-Aviv parking lots.

Ahuzat Dibuk - Tracking occupancy in Tel-Aviv parking lots main.py This module was set-up to be executed on Google Cloud Platform. I run it every 15 m

Geva Kipper 35 Nov 22, 2022
An experimental technique for efficiently exploring neural architectures.

SMASH: One-Shot Model Architecture Search through HyperNetworks An experimental technique for efficiently exploring neural architectures. This reposit

Andy Brock 478 Aug 04, 2022
Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Advanced Deep Learning with TensorFlow 2 and Keras (Updated for 2nd Edition)

Packt 1.5k Jan 03, 2023
Retrieval.pytorch - The code we used in [2020 DIGIX]

Retrieval.pytorch - The code we used in [2020 DIGIX]

Guo-Hua Wang 2 Feb 07, 2022
[CVPR 2022] Deep Equilibrium Optical Flow Estimation

Deep Equilibrium Optical Flow Estimation This is the official repo for the paper Deep Equilibrium Optical Flow Estimation (CVPR 2022), by Shaojie Bai*

CMU Locus Lab 136 Dec 18, 2022
Generic ecosystem for feature extraction from aerial and satellite imagery

Note: Robosat is neither maintained not actively developed any longer by Mapbox. See this issue. The main developers (@daniel-j-h, @bkowshik) are no l

Mapbox 1.9k Jan 06, 2023
Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION.

LiMuSE Overview Pytorch implementation of our paper LIMUSE: LIGHTWEIGHT MULTI-MODAL SPEAKER EXTRACTION. LiMuSE explores group communication on a multi

Auditory Model and Cognitive Computing Lab 17 Oct 26, 2022
Multi-modal co-attention for drug-target interaction annotation and Its Application to SARS-CoV-2

CoaDTI Multi-modal co-attention for drug-target interaction annotation and Its Application to SARS-CoV-2 Abstract Environment The test was conducted i

Layne_Huang 7 Nov 14, 2022