This is the code for CVPR 2021 oral paper: Jigsaw Clustering for Unsupervised Visual Representation Learning

Overview

JigsawClustering

Jigsaw Clustering for Unsupervised Visual Representation Learning

Pengguang Chen, Shu Liu, Jiaya Jia

Introduction

This project provides an implementation for the CVPR 2021 paper "Jigsaw Clustering for Unsupervised Visual Representation Learning"

Installation

Environment

We verify our code on

  • 4x2080Ti GPUs
  • CUDA 10.1
  • python 3.7
  • torch 1.6.0
  • torchvision 0.7.0

Other similar envirouments should also work properly.

Install

We use the SyncBN from apex, please install apex refer to https://github.com/NVIDIA/apex (SyncBN from pytorch should also work properly, we will verify it later.)

We use detectron2 for the training of detection tasks. If you are willing to finetune our pretrained model on the detection task, please install detectron2 refer to https://github.com/facebookresearch/detectron2

git clone https://github.com/Jia-Research-Lab/JigsawClustering.git
cd JigsawClustering/
pip install diffdist

Dataset

Please put the data under ./datasets. The directory looks like:

datasets
│
│───ImageNet/
│   │───class1/
│   │───class2/
│   │   ...
│   └───class1000/
│   
│───coco/
│   │───annotations/
│   │───train2017/
│   └───val2017/
│
│───VOC2012/
│   
└───VOC2007/

Results and pretrained model

The pretrained model is available at here.

Task Dataset Results
Linear Evaluation ImageNet 66.4
Semi-Supervised 1% ImageNet 40.7
Semi-Supervised 10% ImageNet 63.0
Detection COCO 39.3

Training

Pre-training on ImageNet

python main.py --dist-url 'tcp://localhost:10107' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --lr 0.03 --batch-size 256 --epoch 200 \
    --save-dir outputs/jigclu_pretrain/ \
    --resume outputs/jigclu_pretrain/model_best.pth.tar \
    --loss-t 0.3 \
    --cross-ratio 0.3 \
    datasets/ImageNet/

Linear evaluation on ImageNet

python main_lincls.py --dist-url 'tcp://localhost:10007' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --lr 10.0 --batch-size 256 \
    --prefix module.encoder. \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_linear/ \
    datasets/ImageNet/

Semi-Supervised finetune on ImageNet

10% label

python main_semi.py --dist-url 'tcp://localhost:10102' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --batch-size 256 \
    --wd 0.0 --lr 0.01 --lr-last-layer 0.2 \
    --syncbn \
    --prefix module.encoder. \
    --labels-perc 10 \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_semi_10p/ \
    datasets/ImageNet/

1% label

python main_semi.py --dist-url 'tcp://localhost:10101' --multiprocessing-distributed --world-size 1 --rank 0 \
    -a resnet50 \
    --batch-size 256 \
    --wd 0.0 --lr 0.02 --lr-last-layer 5.0 \
    --syncbn \
    --prefix module.encoder. \
    --labels-perc 1 \
    --pretrained outputs/jigclu_pretrain/model_best.pth.tar \
    --save-dir outputs/jigclu_semi_1p/ \
    datasets/ImageNet/

Transfer to COCO detection

Please convert the pretrained weight first

python detection/convert.py

Then start training using

python detection/train_net.py --config-file detection/configs/R50-JigClu.yaml --num-gpus 4

VOC detection

python detection/train_net.py --config-file detection/configs/voc-R50-JigClu.yaml --num-gpus 4

Citation

Please consider citing JigsawClustering in your publications if it helps your research.

@inproceedings{chen2021jigclu,
    title={Jigsaw Clustering for Unsupervised Visual Representation Learning},
    author={Pengguang Chen, Shu Liu, and Jiaya Jia},
    booktitle={IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021},
}
Comments
  • Some question about trainning

    Some question about trainning

    Hi~Thanks for your excellent work! I have a machine with 2 1080Ti,and I want to train your model on CIFAR10 with resnet18.

    I use the parmeters like this ,but it seems don't work. 1632405015(1)

    The program is stuck in this situation.

    1632405115(1)

    opened by zbw0329 10
  • Some details about the training

    Some details about the training

    Hi, I have recently read your paper and find it very interesting. There are still some confusions about the experiments.

    The experiments require 4 2080ti for training. Does it mean we must have 4 2080ti on one single machine? What if I have 4 2080ti on different machines? Is there any suggestion for this situation? BTW, how long does it take when you train on ImageNet1k?

    Much appreciation for your reply.

    Best wishes!

    opened by Hanzy1996 3
  • Some questions about the results of ImageNet100

    Some questions about the results of ImageNet100

    Thank you for your wonderful work, I want to do some more works based on your code. But I meet some questions about the results. I use the JigsawClustering and the dataset ImageNet100 to train the model. I only changed one line in the model to fit this dataset(I added model.fc = nn.Linear(2048, 100) in line 162 of main_lincls.py). However, despite using 4 GPUs, and did not change the configuration file. I only got an accuracy of 79.24. There is still a certain gap between this and the 80.9 reported in the paper. How can I achieve the accuracy reported in the paper now? Once again, thank you for your excellent work and code. I am looking forward to your reply.

    opened by WilyZhao8 1
  • Results of Faster-RCNN R50-FPN with model pretrained on ImageNet with standard cross-entropy loss

    Results of Faster-RCNN R50-FPN with model pretrained on ImageNet with standard cross-entropy loss

    Hi, thanks for your work! In Objection Detection, do you apply ResNet-50 model pretrained on ImageNet with standard cross-entropy loss to Faster-RCNN R50-FPN?

    opened by fzfs 1
  • Training the model on a single GPU

    Training the model on a single GPU

    Hi! I'm aware that the question has been asked previously, but could you guide how to modify jigclu to remove the distributeddataparallel depedency?

    Thanks!

    opened by shuvam-creditmate 2
  • It seems that the model has not learned anything,What should I do?

    It seems that the model has not learned anything,What should I do?

    Thanks for your excellent work! I change the dataloader to use JigClu in CIFAR-10,and train the model on it by 1000epoch. But the prediction of my model is all the same. It seem that model always cluster into the same cluster

    opened by zbw0329 10
Releases(1.0)
Owner
DV Lab
Deep Vision Lab
DV Lab
A small library of 3D related utilities used in my research.

utils3D A small library of 3D related utilities used in my research. Installation Install via GitHub pip install git+https://github.com/Steve-Tod/util

Zhenyu Jiang 8 May 20, 2022
Structured Data Gradient Pruning (SDGP)

Structured Data Gradient Pruning (SDGP) Weight pruning is a technique to make Deep Neural Network (DNN) inference more computationally efficient by re

Bradley McDanel 10 Nov 11, 2022
Code repo for "Transformer on a Diet" paper

Transformer on a Diet Reference: C Wang, Z Ye, A Zhang, Z Zhang, A Smola. "Transformer on a Diet". arXiv preprint arXiv (2020). Installation pip insta

cgraywang 31 Sep 26, 2021
Imaginaire - NVIDIA's Deep Imagination Team's PyTorch Library

Imaginaire Docs | License | Installation | Model Zoo Imaginaire is a pytorch library that contains optimized implementation of several image and video

NVIDIA Research Projects 3.6k Dec 29, 2022
Video Matting via Consistency-Regularized Graph Neural Networks

Video Matting via Consistency-Regularized Graph Neural Networks Project Page | Real Data | Paper Installation Our code has been tested on Python 3.7,

41 Dec 26, 2022
Reference code for the paper CAMS: Color-Aware Multi-Style Transfer.

CAMS: Color-Aware Multi-Style Transfer Mahmoud Afifi1, Abdullah Abuolaim*1, Mostafa Hussien*2, Marcus A. Brubaker1, Michael S. Brown1 1York University

Mahmoud Afifi 36 Dec 04, 2022
Collection of NLP model explanations and accompanying analysis tools

Thermostat is a large collection of NLP model explanations and accompanying analysis tools. Combines explainability methods from the captum library wi

126 Nov 22, 2022
MicRank is a Learning to Rank neural channel selection framework where a DNN is trained to rank microphone channels.

MicRank: Learning to Rank Microphones for Distant Speech Recognition Application Scenario Many applications nowadays envision the presence of multiple

Samuele Cornell 20 Nov 10, 2022
IAST: Instance Adaptive Self-training for Unsupervised Domain Adaptation (ECCV 2020)

This repo is the official implementation of our paper "Instance Adaptive Self-training for Unsupervised Domain Adaptation". The purpose of this repo is to better communicate with you and respond to y

CVSM Group - email: <a href=[email protected]"> 84 Dec 12, 2022
Keras Model Implementation Walkthrough

Keras Model Implementation Walkthrough

Luke Wood 17 Sep 27, 2022
A demonstration of using a live Tensorflow session to create an interactive face-GAN explorer.

Streamlit Demo: The Controllable GAN Face Generator This project highlights Streamlit's new hash_func feature with an app that calls on TensorFlow to

Streamlit 257 Dec 31, 2022
Pytorch implementation for Patient Knowledge Distillation for BERT Model Compression

Patient Knowledge Distillation for BERT Model Compression Knowledge distillation for BERT model Installation Run command below to install the environm

Siqi 180 Dec 19, 2022
Automatic Number Plate Recognition using Contours and Convolution Neural Networks (CNN)

Cite our paper if you find this project useful https://www.ijariit.com/manuscripts/v7i4/V7I4-1139.pdf Abstract Image processing technology is used in

Adithya M 2 Jun 28, 2022
Efficient Training of Audio Transformers with Patchout

PaSST: Efficient Training of Audio Transformers with Patchout This is the implementation for Efficient Training of Audio Transformers with Patchout Pa

165 Dec 26, 2022
Deep Learning pipeline for motor-imagery classification.

BCI-ToolBox 1. Introduction BCI-ToolBox is deep learning pipeline for motor-imagery classification. This repo contains five models: ShallowConvNet, De

DongHee 18 Oct 31, 2022
Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification tasks

Uniformer - Pytorch Implementation of Uniformer, a simple attention and 3d convolutional net that achieved SOTA in a number of video classification ta

Phil Wang 90 Nov 24, 2022
Image process framework based on plugin like imagej, it is esay to glue with scipy.ndimage, scikit-image, opencv, simpleitk, mayavi...and any libraries based on numpy

Introduction ImagePy is an open source image processing framework written in Python. Its UI interface, image data structure and table data structure a

ImagePy 1.2k Dec 29, 2022
Efficient Deep Learning Systems course

Efficient Deep Learning Systems This repository contains materials for the Efficient Deep Learning Systems course taught at the Faculty of Computer Sc

Max Ryabinin 173 Dec 29, 2022
Code for "On Memorization in Probabilistic Deep Generative Models"

On Memorization in Probabilistic Deep Generative Models This repository contains the code necessary to reproduce the experiments in On Memorization in

The Alan Turing Institute 3 Jun 09, 2022
Visualizing Yolov5's layers using GradCam

YOLO-V5 GRADCAM I constantly desired to know to which part of an object the object-detection models pay more attention. So I searched for it, but I di

Pooya Mohammadi Kazaj 200 Jan 01, 2023