Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation (CVPR 2022)

Last update: Dec 27, 2022

Overview

CCAM (Unsupervised)

Code repository for our paper "CCAM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation" in CVPR 2022.

The repository includes the full training, evaluation, and visualization code for the CUB-200-2011, ILSVRC2012, and PASCAL VOC2012 datasets.

We provide the extracted class-agnostic bounding boxes (on CUB-200-2011 and ILSVRC2012) and background cues (on PASCAL VOC2012) here.

Dependencies

Python 3
PyTorch 1.7.1
OpenCV-Python
Numpy
Scipy
MatplotLib
Yaml
Easydict
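
As a quick sanity check before training, the following minimal sketch reports which of the dependencies listed above cannot be found in the current environment. The import names (for example `cv2` for OpenCV-Python and `yaml` for Yaml) and the helper `find_missing` are assumptions made for illustration and are not part of this repository.

```python
import importlib.util

# Display name -> import name. The import names are assumptions mapped from
# the dependency list above (e.g. OpenCV-Python is imported as cv2).
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenCV-Python": "cv2",
    "Numpy": "numpy",
    "Scipy": "scipy",
    "MatplotLib": "matplotlib",
    "Yaml": "yaml",
    "Easydict": "easydict",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of packages whose import name is not installed."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only confirms that each package can be located; it does not verify versions such as PyTorch 1.7.1.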

Dataset

CUB-200-2011

Download the CUB-200-2011 images (JPEG format) here. Make sure your data/CUB_200_2011 folder is structured as follows:

├── CUB_200_2011/
|   ├── images
|   ├── images.txt
|   ├── bounding_boxes.txt
|   ...
|   └── train_test_split.txt
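
To confirm the folder is laid out as shown, the index files can be joined by image id. The sketch below assumes the standard CUB-200-2011 text formats (`<image_id> <image_name>` in images.txt, `<image_id> <is_training_image>` in train_test_split.txt, and `<image_id> <x> <y> <width> <height>` in bounding_boxes.txt). The function name `read_cub_index` is hypothetical and is not the loader used in this repository.

```python
import os


def read_cub_index(root):
    """Join images.txt, train_test_split.txt and bounding_boxes.txt by image id."""

    def read_rows(name):
        # Each non-empty line is "<image_id> <rest of the line>".
        with open(os.path.join(root, name)) as f:
            return [line.split(maxsplit=1) for line in f if line.strip()]

    paths = {i: rest.strip() for i, rest in read_rows("images.txt")}
    is_train = {i: rest.strip() == "1" for i, rest in read_rows("train_test_split.txt")}
    boxes = {
        i: tuple(float(v) for v in rest.split())
        for i, rest in read_rows("bounding_boxes.txt")
    }

    # One record per image, in the order of images.txt.
    return [
        {
            "path": os.path.join(root, "images", paths[i]),
            "train": is_train[i],
            "box_xywh": boxes[i],
        }
        for i in paths
    ]
```

For example, `read_cub_index("data/CUB_200_2011")` raises a `FileNotFoundError` or `KeyError` if a file is missing or the ids do not line up, which usually points to an incomplete download.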

ILSVRC2012

Download the ILSVRC2012 images (JPEG format) here. Make sure your data/ILSVRC2012 folder is structured as follows:

├── ILSVRC2012/
|   ├── train
|   ├── val
|   ├── val_boxes
|   |   ├── val
|   |   |   ├── ILSVRC2012_val_00050000.xml
|   |   |   ├── ...
|   ├── train.txt
|   └── val.txt
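
The files under val_boxes/val are XML annotations. The sketch below reads the ground-truth boxes from one of them, assuming the PASCAL VOC-style schema (`object/bndbox` with `xmin`, `ymin`, `xmax`, `ymax`) that the ILSVRC2012 validation annotations are distributed in. The function name `read_boxes` is hypothetical and is not taken from this repository.

```python
import xml.etree.ElementTree as ET


def read_boxes(xml_path):
    """Return a list of (xmin, ymin, xmax, ymax) tuples from one annotation file."""
    root = ET.parse(xml_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        if bndbox is None:
            continue  # skip objects without a bounding box
        boxes.append(
            tuple(int(bndbox.findtext(k)) for k in ("xmin", "ymin", "xmax", "ymax"))
        )
    return boxes
```

A call such as `read_boxes("data/ILSVRC2012/val_boxes/val/ILSVRC2012_val_00050000.xml")` returns one tuple per annotated object in that image.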

PASCAL VOC2012

Download the PASCAL VOC2012 images (JPEG format) here. Make sure your data/VOC2012 folder is structured as follows:

├── VOC2012/
|   ├── Annotations
|   ├── ImageSets
|   ├── SegmentationClass
|   ├── SegmentationClassAug
|   └── SegmentationObject
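
A small helper can flag missing folders before training starts. This is a minimal sketch that checks only the sub-directories listed in the tree above. The names `VOC_SUBDIRS` and `missing_subdirs` are hypothetical and are not part of this repository.

```python
import os

# Sub-directories taken from the folder tree above.
VOC_SUBDIRS = (
    "Annotations",
    "ImageSets",
    "SegmentationClass",
    "SegmentationClassAug",
    "SegmentationObject",
)


def missing_subdirs(root, expected=VOC_SUBDIRS):
    """Return the expected sub-directories that do not exist under root."""
    return [d for d in expected if not os.path.isdir(os.path.join(root, d))]


if __name__ == "__main__":
    missing = missing_subdirs("data/VOC2012")
    print("Missing folders:", ", ".join(missing) if missing else "none")
```

The same function works for the other datasets if you pass a different `expected` tuple, for example `missing_subdirs("data/ILSVRC2012", ("train", "val", "val_boxes"))`.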

For WSOL task

Please refer to the './WSOL' directory.

cd WSOL

For WSSS task

Please refer to the './WSSS' directory.

cd WSSS

Comparison with CAM

CUSTOM DATASET

Because CCAM is an unsupervised method, it can be applied to a range of scenarios, such as person re-identification (ReID), saliency detection, and skin lesion detection. We provide an example of applying CCAM to a custom dataset, using 'Market-1501'.

cd CUSTOM

Reference

If you use our code, please consider citing our paper.

@article{xie2022contrastive,
  title={Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation},
  author={Xie, Jinheng and Xiang, Jianfeng and Chen, Junliang and Hou, Xianxu and Zhao, Xiaodong and Shen, Linlin},
  journal={arXiv preprint arXiv:2203.13505},
  year={2022}
}

Owner

Computer Vision Institute, SZU
