Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, CVPR 2018

Last update: Dec 15, 2022

Overview

Learning Pixel-level Semantic Affinity with Image-level Supervision

This code is deprecated. Please see https://github.com/jiwoon-ahn/irn instead.

Introduction

The code and trained models of:

Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation, Jiwoon Ahn and Suha Kwak, CVPR 2018 [Paper]

We have developed a framework based on AffinityNet that generates accurate segmentation labels for training images given only their image-level class labels. A segmentation network trained with our synthesized labels outperforms previous state-of-the-art methods by large margins on PASCAL VOC 2012.

*Our code was first implemented in TensorFlow at the time of the CVPR 2018 submission, and we later migrated it to PyTorch. Some minor details (optimizer, channel size, etc.) have been changed.

Citation

If you find the code useful, please consider citing our paper using the following BibTeX entry.

@InProceedings{Ahn_2018_CVPR,
author = {Ahn, Jiwoon and Kwak, Suha},
title = {Learning Pixel-Level Semantic Affinity With Image-Level Supervision for Weakly Supervised Semantic Segmentation},
booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2018}
}

Prerequisite

Tested on Ubuntu 16.04, with Python 3.5, PyTorch 0.4, Torchvision 0.2.1, CUDA 9.0, and 1x NVIDIA TITAN X (Pascal).
The PASCAL VOC 2012 development kit: you also need to specify the path ('voc12_root') of your downloaded dev kit.
(Optional) To try the VGG-16 based network, you need PyCaffe and the VGG-16 ImageNet pretrained weights [vgg16_20M.caffemodel].
(Optional) To try the ResNet-38 based network, you need MXNet and the ResNet-38 pretrained weights [ilsvrc-cls_rna-a1_cls1000_ep-0001.params].
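
Before launching the scripts, it can help to confirm that the 'voc12_root' path points at the expected folder. The snippet below is a minimal sketch, not part of this repository: the helper name `missing_voc_dirs` is hypothetical, and it assumes 'voc12_root' refers to the VOC2012 directory of the dev kit, which contains sub-folders such as JPEGImages, SegmentationClass, and ImageSets.

```python
from pathlib import Path

# Sub-folders found in the VOC2012 directory of the standard dev kit.
# Assumption: voc12_root points at that directory (e.g. .../VOCdevkit/VOC2012).
EXPECTED_DIRS = ("JPEGImages", "SegmentationClass", "ImageSets")


def missing_voc_dirs(voc12_root):
    """Return the expected sub-folders that are absent under voc12_root."""
    root = Path(voc12_root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Usage: python3 check_voc.py [your_voc12_root_folder]
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_voc_dirs(root)
    print("OK" if not missing else "Missing: " + ", ".join(missing))
```

An empty list means all three folders were found; anything else suggests the path is one level too high or too low.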

Usage

1. Train a classification network to get CAMs.

python3 train_cls.py --lr 0.1 --batch_size 16 --max_epoches 15 --crop_size 448 --network [network.vgg16_cls | network.resnet38_cls] --voc12_root [your_voc12_root_folder] --weights [your_weights_file] --wt_dec 5e-4

2. Generate labels for AffinityNet by applying dCRF on CAMs.

python3 infer_cls.py --infer_list voc12/train_aug.txt --voc12_root [your_voc12_root_folder] --network [network.vgg16_cls | network.resnet38_cls] --weights [your_weights_file] --out_cam [desired_folder] --out_la_crf [desired_folder] --out_ha_crf [desired_folder]
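
The two outputs `--out_la_crf` and `--out_ha_crf` appear to correspond to a low and a high background exponent alpha. The sketch below illustrates the idea under the assumption, taken from the paper rather than from this code, that the background score is (1 - max_c CAM_c)^alpha. The function name and array layout are illustrative, CAMs are assumed to be normalized to [0, 1], and the dCRF refinement itself is omitted.

```python
import numpy as np


def labels_with_background(cams, alpha):
    """Assign each pixel to background (0) or to class c (label c + 1).

    cams: array of shape (C, H, W) with per-class CAM scores in [0, 1].
    alpha: exponent applied to the background score (1 - max_c CAM_c).
    """
    background = np.power(1.0 - cams.max(axis=0, keepdims=True), alpha)
    scores = np.concatenate([background, cams], axis=0)  # (C + 1, H, W)
    return scores.argmax(axis=0)


# One class, three pixels with strong, weak, and very weak activation.
cams = np.array([[[0.9, 0.1, 0.02]]])
low_alpha_labels = labels_with_background(cams, alpha=4)
high_alpha_labels = labels_with_background(cams, alpha=32)
print(low_alpha_labels, high_alpha_labels)
```

With a small alpha the background score is amplified, so only strongly activated pixels stay foreground; with a large alpha the background score is suppressed, so only pixels with almost no activation stay background.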

(Optional) Check the accuracy of CAMs.

python3 infer_cls.py --infer_list voc12/val.txt --voc12_root [your_voc12_root_folder] --network network.resnet38_cls --weights res38_cls.pth --out_cam_pred [desired_folder]
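
For reference, the mIoU figures reported below can be computed from predicted and ground-truth label maps with a confusion matrix. This is a generic sketch and not the evaluation script of this repository; it assumes the usual PASCAL VOC convention of 21 classes (including background) and an ignore label of 255, and the function name `mean_iou` is illustrative.

```python
import numpy as np


def mean_iou(preds, gts, num_classes=21, ignore_index=255):
    """Mean intersection-over-union over classes that occur in preds or gts.

    preds, gts: iterables of integer label maps with matching shapes.
    Pixels whose ground-truth label equals ignore_index are skipped.
    """
    conf = np.zeros((num_classes, num_classes), dtype=np.int64)
    for pred, gt in zip(preds, gts):
        valid = gt != ignore_index
        # Encode each (gt, pred) pair as a single index into the matrix.
        idx = gt[valid].astype(np.int64) * num_classes + pred[valid].astype(np.int64)
        conf += np.bincount(idx, minlength=num_classes ** 2).reshape(
            num_classes, num_classes
        )
    true_pos = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - true_pos
    present = union > 0
    return float((true_pos[present] / union[present]).mean())


pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 255]])
print(mean_iou([pred], [gt], num_classes=2))
```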

3. Train AffinityNet with the labels.

python3 train_aff.py --lr 0.1 --batch_size 8 --max_epoches 8 --crop_size 448 --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --wt_dec 5e-4 --la_crf_dir [your_output_folder] --ha_crf_dir [your_output_folder]

4. Perform Random Walks on CAMs.

python3 infer_aff.py --infer_list [voc12/val.txt | voc12/train.txt] --voc12_root [your_voc12_root_folder] --network [network.vgg16_aff | network.resnet38_aff] --weights [your_weights_file] --cam_dir [your_output_folder] --out_rw [desired_folder]
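
The random walk in this step can be pictured with a small NumPy sketch. It assumes the formulation described in the paper: the affinity matrix is raised element-wise to the power beta, row-normalized into a transition matrix, and applied t times to the flattened CAMs. The function name, the dense (N, N) matrix, and the toy inputs are illustrative only; `infer_aff.py` may organize the computation differently.

```python
import numpy as np


def random_walk(cams, affinity, beta=8, t=256):
    """Propagate CAM scores along pixel affinities.

    cams: array of shape (C, N), per-class scores over N flattened pixels.
    affinity: array of shape (N, N), non-negative, with a non-zero diagonal.
    beta: element-wise exponent that sharpens the affinities.
    t: number of random-walk iterations.
    """
    weights = np.power(affinity, beta)
    transition = weights / weights.sum(axis=1, keepdims=True)  # rows sum to 1
    transition_t = np.linalg.matrix_power(transition, t)
    # Each output score is an affinity-weighted average of the input scores.
    return cams @ transition_t.T


# Pixels 0 and 1 are mutually affine; pixel 2 is isolated.
affinity = np.array([[1.0, 1.0, 0.0],
                     [1.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])
cams = np.array([[1.0, 0.0, 0.0]])
print(random_walk(cams, affinity))
```

In the toy example the activation at pixel 0 spreads to pixel 1 but does not leak into the isolated pixel 2. For real images N is the number of feature-map locations, so a dense (N, N) matrix is only practical at reduced resolution.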

Results and Trained Models

Class Activation Map

Model	Train (mIoU)	Val (mIoU)	Weights / Note
VGG-16	48.9	46.6	[Weights]
ResNet-38	47.7	47.2	[Weights]
ResNet-38	48.0	46.8	CVPR submission

Random Walk with AffinityNet

Model	alpha	Train (mIoU)	Val (mIoU)	Weights / Note
VGG-16	4/16/32	59.6	54.0	[Weights]
ResNet-38	4/16/32	61.0	60.2	[Weights]
ResNet-38	4/16/24	58.1	57.0	CVPR submission

*beta=8, gamma=5, t=256 for all settings
