PyTorch implementation of MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

Related tags

Deep Learningmoco
Overview

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

This is a PyTorch implementation of the MoCo paper:

@Article{he2019moco,
  author  = {Kaiming He and Haoqi Fan and Yuxin Wu and Saining Xie and Ross Girshick},
  title   = {Momentum Contrast for Unsupervised Visual Representation Learning},
  journal = {arXiv preprint arXiv:1911.05722},
  year    = {2019},
}

It also includes the implementation of the MoCo v2 paper:

@Article{chen2020mocov2,
  author  = {Xinlei Chen and Haoqi Fan and Ross Girshick and Kaiming He},
  title   = {Improved Baselines with Momentum Contrastive Learning},
  journal = {arXiv preprint arXiv:2003.04297},
  year    = {2020},
}

Preparation

Install PyTorch and ImageNet dataset following the official PyTorch ImageNet training code.

This repo aims to be minimal modifications on that code. Check the modifications by:

diff main_moco.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)
diff main_lincls.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)

Unsupervised Training

This implementation only supports multi-gpu, DistributedDataParallel training, which is faster and simpler; single-gpu or DataParallel training is not supported.

To do unsupervised pre-training of a ResNet-50 model on ImageNet in an 8-gpu machine, run:

python main_moco.py \
  -a resnet50 \
  --lr 0.03 \
  --batch-size 256 \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]

This script uses all the default hyper-parameters as described in the MoCo v1 paper. To run MoCo v2, set --mlp --moco-t 0.2 --aug-plus --cos.

Note: for 4-gpu training, we recommend following the linear lr scaling recipe: --lr 0.015 --batch-size 128 with 4 gpus. We got similar results using this setting.

Linear Classification

With a pre-trained model, to train a supervised linear classifier on frozen features/weights in an 8-gpu machine, run:

python main_lincls.py \
  -a resnet50 \
  --lr 30.0 \
  --batch-size 256 \
  --pretrained [your checkpoint path]/checkpoint_0199.pth.tar \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]

Linear classification results on ImageNet using this repo with 8 NVIDIA V100 GPUs :

pre-train
epochs
pre-train
time
MoCo v1
top-1 acc.
MoCo v2
top-1 acc.
ResNet-50 200 53 hours 60.8±0.2 67.5±0.1

Here we run 5 trials (of pre-training and linear classification) and report mean±std: the 5 results of MoCo v1 are {60.6, 60.6, 60.7, 60.9, 61.1}, and of MoCo v2 are {67.7, 67.6, 67.4, 67.6, 67.3}.

Models

Our pre-trained ResNet-50 models can be downloaded as following:

epochs mlp aug+ cos top-1 acc. model md5
MoCo v1 200 60.6 download b251726a
MoCo v2 200 67.7 download 59fd9945
MoCo v2 800 71.1 download a04e12f8

Transferring to Object Detection

See ./detection.

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

See Also

Owner
Meta Research
Meta Research
Entity-Based Knowledge Conflicts in Question Answering.

Entity-Based Knowledge Conflicts in Question Answering Run Instructions | Paper | Citation | License This repository provides the Substitution Framewo

Apple 35 Oct 19, 2022
Official implementation of Protected Attribute Suppression System, ICCV 2021

Official implementation of Protected Attribute Suppression System, ICCV 2021

Prithviraj Dhar 6 Jan 01, 2023
User-friendly bulk RNAseq deconvolution using simulated annealing

Welcome to cellanneal - The user-friendly application for deconvolving omics data sets. cellanneal is an application for deconvolving biological mixtu

11 Dec 16, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

The official code for the NeurIPS 2021 paper Generalized Jensen-Shannon Divergence Loss for Learning with Noisy Labels

13 Dec 22, 2022
PyTorch evaluation code for Delving Deep into the Generalization of Vision Transformers under Distribution Shifts.

Out-of-distribution Generalization Investigation on Vision Transformers This repository contains PyTorch evaluation code for Delving Deep into the Gen

Chongzhi Zhang 72 Dec 13, 2022
Official repository of my book: "Deep Learning with PyTorch Step-by-Step: A Beginner's Guide"

This is the official repository of my book "Deep Learning with PyTorch Step-by-Step". Here you will find one Jupyter notebook for every chapter in the book.

Daniel Voigt Godoy 340 Jan 01, 2023
We present a regularized self-labeling approach to improve the generalization and robustness properties of fine-tuning.

Overview This repository provides the implementation for the paper "Improved Regularization and Robustness for Fine-tuning in Neural Networks", which

NEU-StatsML-Research 21 Sep 08, 2022
Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

Official repository for GCR rerank, a GCN-based reranking method for both image and video re-ID

53 Nov 22, 2022
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

Mario Sessa 8 Dec 12, 2022
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

This is a fork of Fairseq(-py) with implementations of the following models: Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Se

Maha 490 Dec 15, 2022
CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning

CRLT: A Unified Contrastive Learning Toolkit for Unsupervised Text Representation Learning This repository contains the code and relevant instructions

XiaoMing 5 Aug 19, 2022
Use .csv files to record, play and evaluate motion capture data.

Purpose These scripts allow you to record mocap data to, and play from .csv files. This approach facilitates parsing of body movement data in statisti

21 Dec 12, 2022
A Temporal Extension Library for PyTorch Geometric

Documentation | External Resources | Datasets PyTorch Geometric Temporal is a temporal (dynamic) extension library for PyTorch Geometric. The library

Benedek Rozemberczki 1.9k Jan 07, 2023
Best Practices on Recommendation Systems

Recommenders What's New (February 4, 2021) We have a new relase Recommenders 2021.2! It comes with lots of bug fixes, optimizations and 3 new algorith

Microsoft 14.8k Jan 03, 2023
Algorithm to texture 3D reconstructions from multi-view stereo images

MVS-Texturing Welcome to our project that textures 3D reconstructions from images. This project focuses on 3D reconstructions generated using structur

Nils Moehrle 766 Jan 04, 2023
[CVPR 2022] Official code for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved Neural Network Calibration"

MDCA Calibration This is the official PyTorch implementation for the paper: "A Stitch in Time Saves Nine: A Train-Time Regularizing Loss for Improved

MDCA Calibration 21 Dec 22, 2022
Dense Passage Retriever - is a set of tools and models for open domain Q&A task.

Dense Passage Retrieval Dense Passage Retrieval (DPR) - is a set of tools and models for state-of-the-art open-domain Q&A research. It is based on the

Meta Research 1.1k Jan 03, 2023
TorchOk - The toolkit for fast Deep Learning experiments in Computer Vision

TorchOk - The toolkit for fast Deep Learning experiments in Computer Vision

52 Dec 23, 2022