PyTorch/GPU re-implementation of the paper Masked Autoencoders Are Scalable Vision Learners


Masked Autoencoders: A PyTorch Implementation

This is a PyTorch/GPU re-implementation of the paper Masked Autoencoders Are Scalable Vision Learners:

@Article{MaskedAutoencoders2021,
  author  = {Kaiming He and Xinlei Chen and Saining Xie and Yanghao Li and Piotr Doll{\'a}r and Ross Girshick},
  journal = {arXiv:2111.06377},
  title   = {Masked Autoencoders Are Scalable Vision Learners},
  year    = {2021},
}
  • The original implementation was in TensorFlow+TPU. This re-implementation is in PyTorch+GPU.

  • This repo is a modification on the DeiT repo. Installation and preparation follow that repo.

  • This repo is based on timm==0.3.2, for which a fix is needed to work with PyTorch 1.8.1+.
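
    The most commonly reported breakage is timm 0.3.2's import of container_abcs from torch._six, which newer PyTorch releases no longer provide. The snippet below is a sketch of the usual local workaround applied to timm/models/layers/helpers.py in your own environment; it is not part of this repo, so treat the exact file and symptom as assumptions and prefer the fix referenced by this README if it differs.

    # Sketch of a commonly applied local patch for timm==0.3.2 on newer PyTorch
    # (assumes the failure is "cannot import name 'container_abcs' from 'torch._six'").
    # In timm/models/layers/helpers.py, replace
    #     from torch._six import container_abcs
    # with the standard-library module:
    import collections.abc as container_abcs
    from itertools import repeat


    def _ntuple(n):
        # timm helper that broadcasts a scalar to an n-tuple; unchanged apart from the import above
        def parse(x):
            if isinstance(x, container_abcs.Iterable):
                return x
            return tuple(repeat(x, n))
        return parse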

Catalog

  • Visualization demo
  • Pre-trained checkpoints + fine-tuning code
  • Pre-training code

Visualization demo

Run our interactive visualization demo in a Colab notebook (no GPU needed).

Fine-tuning with pre-trained checkpoints

The following table provides the pre-trained checkpoints used in the paper, converted from TF/TPU to PT/GPU:

                          ViT-Base    ViT-Large   ViT-Huge
pre-trained checkpoint    download    download    download
md5                       8cad7c      b8b06e      9bdbb0
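
To confirm a download is intact, compare its MD5 digest against the prefixes above. A minimal sketch (the file name mae_pretrain_vit_base.pth is an assumption used for illustration):

import hashlib

# Compute a file's MD5 and compare it with the checksum prefix listed above.
def md5_of(path, chunk_size=1 << 20):
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()

# Example (file name is hypothetical): the digest for the ViT-Base checkpoint
# should start with "8cad7c".
print(md5_of("mae_pretrain_vit_base.pth")[:6])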

Fine-tuning instructions are in FINETUNE.md.
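
As a quick sanity check before running the full recipe in FINETUNE.md, a pre-trained encoder can be loaded into a ViT classifier with strict=False, letting newly initialized or pre-training-only parameters be skipped. This is a minimal sketch, assuming a timm ViT-Base model and a checkpoint file named mae_pretrain_vit_base.pth that stores weights under a 'model' key; the authoritative commands are in FINETUNE.md.

import torch
import timm

# Build a ViT-Base classifier and initialize its encoder from the MAE checkpoint.
model = timm.create_model("vit_base_patch16_224", num_classes=1000)
ckpt = torch.load("mae_pretrain_vit_base.pth", map_location="cpu")
state_dict = ckpt.get("model", ckpt)  # weights may sit under a 'model' key

# strict=False: the classification head is newly initialized, and any
# pre-training-only parameters in the checkpoint are simply skipped.
msg = model.load_state_dict(state_dict, strict=False)
print("missing:", msg.missing_keys)
print("unexpected:", msg.unexpected_keys)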

By fine-tuning these pre-trained models, we rank #1 in these classification tasks (detailed in the paper):

                                  ViT-B    ViT-L    ViT-H    ViT-H448    prev best
ImageNet-1K (no external data)    83.6     85.9     86.9     87.8        87.1

The following are evaluations of the same model weights (fine-tuned on the original ImageNet-1K):

ImageNet-Corruption (error rate)  51.7     41.8     33.8     36.8        42.5
ImageNet-Adversarial              35.9     57.1     68.2     76.7        35.8
ImageNet-Rendition                48.3     59.9     64.4     66.5        48.7
ImageNet-Sketch                   34.5     45.3     49.6     50.9        36.0

The following are transfer-learning results from fine-tuning the pre-trained MAE on each target dataset:

iNaturalist 2017                  70.5     75.7     79.3     83.4        75.4
iNaturalist 2018                  75.4     80.1     83.0     86.8        81.2
iNaturalist 2019                  80.5     83.4     85.7     88.3        84.1
Places205                         63.9     65.8     65.9     66.8        66.0
Places365                         57.9     59.4     59.8     60.3        58.0

Pre-training

Pre-training instructions are in PRETRAIN.md.
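
For orientation before reading PRETRAIN.md: the core of MAE pre-training is to mask a large random subset of patches (75% in the paper), encode only the visible patches, and reconstruct the masked ones. The snippet below is a simplified sketch of that per-sample random masking, written from the paper's description rather than copied from this repo's model code.

import torch

def random_masking(x, mask_ratio=0.75):
    """Randomly mask patches per sample. x: (N, L, D) patch embeddings."""
    N, L, D = x.shape
    len_keep = int(L * (1 - mask_ratio))

    noise = torch.rand(N, L, device=x.device)        # one random score per patch
    ids_shuffle = torch.argsort(noise, dim=1)        # patches with low scores are kept
    ids_restore = torch.argsort(ids_shuffle, dim=1)  # undoes the shuffle later

    ids_keep = ids_shuffle[:, :len_keep]
    x_visible = torch.gather(x, 1, ids_keep.unsqueeze(-1).expand(-1, -1, D))

    mask = torch.ones(N, L, device=x.device)         # 1 = masked, 0 = visible
    mask[:, :len_keep] = 0
    mask = torch.gather(mask, 1, ids_restore)        # align mask with the original patch order
    return x_visible, mask, ids_restore

# The encoder runs only on x_visible; the decoder inserts mask tokens, restores the
# original order with ids_restore, and the loss is computed on masked patches only.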

License

This project is under the CC-BY-NC 4.0 license. See LICENSE for details.

Owner
Meta Research