MAE Segmentation: Reproduction of semantic segmentation using a Masked Autoencoder (MAE)

Last update: Dec 17, 2022

Overview

ADE20K Semantic Segmentation with MAE

Getting started

Install the mmsegmentation library and some required packages.

pip install mmcv-full==1.3.0 mmsegmentation==0.11.0
pip install scipy timm==0.3.2
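Because the commands above pin exact versions, it can help to confirm what is actually installed before training. The following is a minimal sketch, not part of the repository: the helper name `check_pins` is hypothetical, and it only compares the version strings reported by the installed package metadata against the pins listed above.

```python
from importlib import metadata

# Versions pinned by the pip commands above.
PINNED = {"mmcv-full": "1.3.0", "mmsegmentation": "0.11.0", "timm": "0.3.2"}


def check_pins(pins=PINNED):
    """Return {package: (expected, found)} for each missing or mismatched package.

    `found` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins().items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

An empty result means every pinned package is present at the expected version.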

Install apex for mixed-precision training:

git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

Follow the dataset preparation guide in mmsegmentation to prepare the ADE20K dataset.
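Once the dataset is in place, a quick layout check can catch path mistakes before a long training run. The sketch below assumes the directory layout used by mmsegmentation's stock ADE20K configs (`images/` and `annotations/`, each with `training` and `validation` subfolders, under a root such as `data/ade/ADEChallengeData2016`). The helper name `missing_ade20k_dirs` is hypothetical; if your config sets a different `data_root`, pass that path instead.

```python
from pathlib import Path

# Subdirectories assumed by the stock mmsegmentation ADE20K dataset config.
EXPECTED_SUBDIRS = [
    "images/training",
    "images/validation",
    "annotations/training",
    "annotations/validation",
]


def missing_ade20k_dirs(data_root):
    """Return the expected subdirectories that do not exist under data_root."""
    root = Path(data_root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == "__main__":
    # Assumed default location; adjust to match your config's data_root.
    missing = missing_ade20k_dirs("data/ade/ADEChallengeData2016")
    print("Layout looks complete." if not missing else f"Missing: {missing}")
```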

Fine-tuning for Reproducing Results of MAE ViT-Base

Command:

tools/dist_train.sh configs/mae/upernet_mae_base_12_512_slide_160k_ade20k.py 8 --seed 0 --options model.pretrained=https://dl.fbaipublicfiles.com/mae/pretrain/mae_pretrain_vit_base.pth

Expected results log (paper result: 48.1 mIoU):

+--------+-------+-------+-------+
| Scope  | mIoU  | mAcc  | aAcc  |
+--------+-------+-------+-------+
| global | 48.15 | 58.99 | 83.05 |
+--------+-------+-------+-------+
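For readers unfamiliar with the three columns, the sketch below shows how mIoU, mAcc, and aAcc are commonly derived from a per-class confusion matrix. It is an illustration of the standard definitions in plain Python, not the evaluation code used by mmsegmentation; the function name `segmentation_metrics` is hypothetical, and classes that never appear are skipped when averaging.

```python
def segmentation_metrics(conf):
    """Compute mIoU, mAcc and aAcc (in percent) from a confusion matrix.

    conf[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[i][i] for i in range(n))

    ious, accs = [], []
    for i in range(n):
        tp = conf[i][i]
        gt = sum(conf[i])                         # TP + FN
        pred = sum(conf[r][i] for r in range(n))  # TP + FP
        union = gt + pred - tp                    # TP + FP + FN
        if union > 0:
            ious.append(tp / union)               # per-class IoU
        if gt > 0:
            accs.append(tp / gt)                  # per-class accuracy

    nan = float("nan")
    return {
        "mIoU": 100 * sum(ious) / len(ious) if ious else nan,
        "mAcc": 100 * sum(accs) / len(accs) if accs else nan,
        "aAcc": 100 * correct / total if total else nan,
    }


if __name__ == "__main__":
    # Toy two-class example: 8 of 10 pixels are classified correctly.
    print(segmentation_metrics([[3, 1], [1, 5]]))
```

mIoU averages the per-class intersection-over-union, mAcc averages the per-class pixel accuracy, and aAcc is the overall fraction of correctly labeled pixels. In the toy example above, aAcc is 80.0.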

Evaluation

Command format:

tools/dist_test.sh <CONFIG_PATH> <CHECKPOINT_PATH> <NUM_GPUS> --eval mIoU

Acknowledgment

This code is built on the mmsegmentation library, the timm library, the Swin repository, XCiT, SETR, BEiT, and the MAE repository.
