Self-labelling via simultaneous clustering and representation learning. (ICLR 2020)

Overview

🆗 🆗 🎉 New models (20th August 2020): added SeLa-pretrained standard torchvision ResNet models to make loading much easier, added baselines using the better MoCo-v2 augmentations (~69% LP performance), and added evaluation with K=1000 for ImageNet "unsupervised clustering".

🆕 ✅ 🎉 Updated code (23rd April 2020): bug fixes, CIFAR code, and evaluation for ResNet and AlexNet.

Check out our blog post for a quick non-technical overview and an interactive visualization of our clusters.

Self-Label

This code is the official implementation of the ICLR 2020 paper Self-labelling via simultaneous clustering and representation learning.

Abstract

Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill-posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard cross-entropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Our method achieves state-of-the-art representation learning performance for AlexNet and ResNet-50 on SVHN, CIFAR-10, CIFAR-100 and ImageNet.
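
The balanced label assignment mentioned in the abstract can be illustrated with a small NumPy sketch of Sinkhorn-Knopp scaling. This is not the repository's implementation: the function name `sinkhorn_labels`, the fixed iteration count and the uniform marginals are assumptions made for illustration. Only the default `lamb=25` is borrowed from the `--lamb` option of main.py documented in this README.

```python
import numpy as np


def sinkhorn_labels(probs, lamb=25.0, n_iter=100):
    """Balanced pseudolabel assignment via Sinkhorn-Knopp scaling (illustrative sketch).

    probs: (N, K) array of per-image class probabilities (rows sum to 1).
    Returns (labels, plan): hard labels of shape (N,) and the scaled (N, K) matrix.
    """
    n, k = probs.shape
    # Sharpen the predictions; float64 keeps the small powers from underflowing.
    plan = np.power(probs.astype(np.float64), lamb)
    plan /= plan.sum()
    row_mass = 1.0 / n  # assumption: every image carries the same mass
    col_mass = 1.0 / k  # assumption: every cluster receives the same mass
    for _ in range(n_iter):
        # Rescale columns so that all clusters receive equal total mass.
        plan *= col_mass / plan.sum(axis=0, keepdims=True)
        # Rescale rows so that every image distributes exactly 1/N.
        plan *= row_mass / plan.sum(axis=1, keepdims=True)
    return plan.argmax(axis=1), plan
```

Because the column marginals are forced to be equal, the trivial solution that assigns every image to one cluster is ruled out, which is the degenerate case the abstract refers to.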

Results at a glance

Model          NMI (%)   aNMI (%)   ARI (%)   LP Acc (%)
AlexNet 1k     50.5      12.2       2.7       42.1
AlexNet 10k    66.4      4.7        4.7       43.8
R50 10x3k      54.2      34.4       7.2       61.5

With better augmentations (all single crop)

Model                          Label-Acc   NMI (%)   aNMI (%)   ARI (%)   LP Acc (%)   model_weights
Aug++ R18 1k (new)             26.9        62.7      36.4       12.5      53.3         here
Aug++ R50 1k (new)             30.5        65.7      42.0       16.2      63.5         here
Aug++ R50 10x3k (new)          38.1        75.7      52.8       27.6      68.8         here
(MoCo-v2 + k-means**, K=3k)    -           71.4      39.6       15.8      71.1         -
  • "Aug++" refers to the better augmentations used in SimCLR, taken from the MoCo-v2 repo. I still trained for only 280 epochs, with three lr-drops as in CMC.
  • Further improvements are likely with an MLP head or with 800 epochs of training (I train for 280), as done in SimCLR, MoCo-v2 and SwAV.
  • **MoCo-v2 uses 800 epochs, an MLP head and a cosine lr schedule. For MoCo-v2, I run k-means (K=3000) on the avg-pooled features to obtain the NMI, aNMI and ARI numbers (using the features after the MLP head gives pretty much the same performance).
  • The models above use standard torchvision ResNet backbones, so loading is now straightforward:
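import torch, torchvision
# Standard torchvision ResNet-50 with num_classes set to 3000
model = torchvision.models.resnet50(pretrained=False, num_classes=3000)
ckpt = torch.load('resnet50-10x3k_pp.pth')
model.load_state_dict(ckpt['state_dict'])
# The checkpoint also stores the pseudolabels under the key 'L'
pseudolabels = ckpt['L']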
  • Note on improvement potential: just by using "aug+", I get an LP accuracy of 67.2% after 200 epochs, whereas MoCo-v2 with "aug+" reaches only 63.4% after 200 epochs.
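
For readers unfamiliar with the NMI column, the following standard-library sketch shows how normalized mutual information between cluster assignments and ground-truth classes can be computed. The function names are hypothetical, and normalizing by the arithmetic mean of the two entropies is an assumption; the numbers in the tables above may have been computed with a different normalization or library.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (in nats) of a distribution given as raw counts."""
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same items")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    # I(A;B) = sum_ab p(a,b) * log(p(a,b) / (p(a) * p(b)))
    mi = sum(
        (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    # Assumption: normalize by the arithmetic mean of the two entropies.
    denom = 0.5 * (entropy(count_a.values(), n) + entropy(count_b.values(), n))
    return mi / denom if denom > 0 else 1.0
```

The score is invariant to renaming clusters, so it can compare k-means or self-labelling assignments against ImageNet classes without matching cluster IDs to class IDs.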

Clusters that were discovered by our method

Sorted

ImageNet validation images with clusters sorted by ImageNet purity

Random

ImageNet validation images with random clusters

The edge colors encode the true ImageNet classes (which are not used for training). You can view all clusters here.

Requirements

  • Python > 3.6
  • PyTorch > 1.0
  • CUDA
  • NumPy, SciPy
  • See also requirements.txt
  • TensorboardX (optional)

Running our code

Run the self-supervised training of an AlexNet with the command

$ ./scripts/alexnet.sh

or train a ResNet-50 with

$ ./scripts/resnet.sh

Note: you need to specify your dataset directory via --imagenet-path (the code expects the ImageNet layout with "train" and "val" folders). You also need to give the code enough GPUs to store the activations on the GPU; otherwise, use the CPU variant (--cpu), which is significantly slower.
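
Before launching a long run, it can help to check that the dataset directory has the expected layout. The helper below is a hypothetical sketch and not part of this repository. It assumes one sub-folder per class inside `train` and `val` (the usual ImageNet folder convention); the README itself only states that both folders must exist.

```python
from pathlib import Path


def check_imagenet_layout(root):
    """Return the number of class sub-folders found in `train` and `val`.

    Raises FileNotFoundError if either split folder is missing.
    """
    root = Path(root)
    counts = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"expected folder: {split_dir}")
        # Assumption: each class lives in its own sub-folder.
        counts[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return counts
```

For a full ImageNet copy, one would expect the same number of class folders in both splits.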

Full documentation of the unsupervised training code main.py:

usage: main.py [-h] [--epochs EPOCHS] [--batch-size BATCH_SIZE] [--lr LR]
               [--lrdrop LRDROP] [--wd WD] [--dtype {f64,f32}] [--nopts NOPTS]
               [--augs AUGS] [--paugs PAUGS] [--lamb LAMB] [--cpu]
               [--arch ARCH] [--archspec {big,small}] [--ncl NCL] [--hc HC]
               [--device DEVICE] [--modeldevice MODELDEVICE] [--exp EXP]
               [--workers WORKERS] [--imagenet-path IMAGENET_PATH]
               [--comment COMMENT] [--log-intv LOG_INTV] [--log-iter LOG_ITER]

PyTorch Implementation of Self-Label

optional arguments:
  -h, --help            show this help message and exit
  --epochs EPOCHS       number of epochs
  --batch-size BATCH_SIZE
                        batch size (default: 256)
  --lr LR               initial learning rate (default: 0.05)
  --lrdrop LRDROP       multiply LR by 0.1 every (default: 150 epochs)
  --wd WD               weight decay pow (default: -5)
  --dtype {f64,f32}     SK-algo dtype (default: f64)
  --nopts NOPTS         number of pseudo-opts (default: 100)
  --augs AUGS           augmentation level (default: 3)
  --paugs PAUGS         for pseudoopt: augmentation level (default: 3)
  --lamb LAMB           for pseudoopt: lambda (default:25)
  --cpu                 use CPU variant (slow) (default: off)
  --arch ARCH           alexnet or resnet (default: alexnet)
  --archspec {big,small}
                        alexnet variant (default:big)
  --ncl NCL             number of clusters per head (default: 3000)
  --hc HC               number of heads (default: 1)
  --device DEVICE       GPU devices to use for storage and model
  --modeldevice MODELDEVICE
                        GPU numbers on which the CNN runs
  --exp EXP             path to experiment directory
  --workers WORKERS     number workers (default: 6)
  --imagenet-path IMAGENET_PATH
                        path to folder that contains `train` and `val`
  --comment COMMENT     name for tensorboardX
  --log-intv LOG_INTV   save stuff every x epochs (default: 1)
  --log-iter LOG_ITER   log every x-th batch (default: 200)
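
The `--lr`, `--lrdrop` and `--wd` options above can be read as the following step schedule. This is a sketch based only on the help text: the function names are hypothetical, the exact epoch at which main.py applies each drop may differ, and interpreting "weight decay pow" as a power of ten is an assumption.

```python
def learning_rate(epoch, lr=0.05, lrdrop=150):
    """Step schedule: multiply the initial LR by 0.1 every `lrdrop` epochs."""
    return lr * 0.1 ** (epoch // lrdrop)


def weight_decay(wd=-5):
    """Assumption: `--wd` is the exponent of a power of ten, so -5 means 1e-5."""
    return 10.0 ** wd
```

With the defaults, the learning rate is 0.05 for epochs 0 to 149, 0.005 for epochs 150 to 299, and so on.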

Evaluation

Linear Evaluation

We provide the linear evaluation methods in this repo. Simply download the models via . ./scripts/download_models.sh and then run either scripts/eval-alexnet.sh or scripts/eval-resnet.sh.

Pascal VOC

We follow the standard evaluation protocols for self-supervised visual representation learning.

Our extracted pseudolabels

As we show in the paper, the pseudolabels generated by our training can be used to quickly train a neural network with regular cross-entropy. Moreover, they seem to correctly group similar images together. We therefore provide the labels for everyone to use.
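
Training with "regular cross-entropy" on pseudolabels amounts to treating them exactly like ground-truth class indices. The NumPy sketch below illustrates the loss for a single head; the function name and the assumption that the pseudolabels form an integer array of shape (N,) are hypothetical and not taken from the repository.

```python
import numpy as np


def pseudolabel_cross_entropy(logits, pseudolabels):
    """Mean cross-entropy between (N, K) logits and (N,) integer pseudolabels."""
    logits = np.asarray(logits, dtype=np.float64)
    pseudolabels = np.asarray(pseudolabels)
    # Subtract the row maximum for a numerically stable log-softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick the log-probability of each sample's pseudolabel.
    picked = log_probs[np.arange(len(pseudolabels)), pseudolabels]
    return -picked.mean()
```

In a real training loop this corresponds to a standard classification loss from a deep learning framework, with the pseudolabels supplied as targets.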

AlexNet

You can download the pseudolabels from our best (raw) AlexNet model with 10x3000 clusters here.

ResNet

You can download the pseudolabels from our best ResNet model with 10x3000 clusters here.

Trained models

You can also download our trained models by running

$ ./scripts/download_models.sh

Use them like this:

import torch
import models  # model definitions from this repository

# ResNet with 10 heads of 3000 clusters each
d = torch.load('self-label_models/resnet-10x3k.pth')
m = models.resnet(num_classes = [3000]*10)
m.load_state_dict(d)

# AlexNet with 10 heads of 3000 clusters each
d = torch.load('self-label_models/alexnet-10x3k-wRot.pth')
m = models.alexnet(num_classes = [3000]*10)
m.load_state_dict(d)

Reference

If you use this code, please cite the following paper:

Yuki M. Asano, Christian Rupprecht and Andrea Vedaldi. "Self-labelling via simultaneous clustering and representation learning." Proc. ICLR (2020)

@inproceedings{asano2020self,
  title={Self-labelling via simultaneous clustering and representation learning},
  author={Asano, Yuki M. and Rupprecht, Christian and Vedaldi, Andrea},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2020},
}
Owner
Yuki M. Asano
I'm a computer vision researcher at the University of Amsterdam. I did my PhD at the Visual Geometry Group in Oxford.