Conformer: Local Features Coupling Global Representations for Visual Recognition (arxiv)

This repository is built upon DeiT and timm.

Usage

First, install PyTorch 1.7.0+, torchvision 0.8.1+, and pytorch-image-models (timm) 0.3.2:

conda install -c pytorch pytorch torchvision
pip install timm==0.3.2

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The directory structure follows the standard layout for torchvision's datasets.ImageFolder: the training data is expected in the train/ folder and the validation data in the val/ folder, with one subfolder per class:

/path/to/imagenet/
  train/
    class1/
      img1.jpeg
    class2/
      img2.jpeg
  val/
    class1/
      img3.jpeg
    class2/
      img4.jpeg
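
Before starting a long run, it can help to confirm that the dataset root really has this shape. The following is a minimal sketch using only the Python standard library; the helper name count_classes is hypothetical and not part of this repository. It only counts class folders per split, and the demo runs on a throwaway tree that mirrors the layout above rather than on real ImageNet data.

```python
import tempfile
from pathlib import Path


def count_classes(root):
    """Return {split: number of class folders} for an ImageFolder-style root.

    Hypothetical helper for illustration; not part of the Conformer code base.
    """
    root = Path(root)
    counts = {}
    for split in ("train", "val"):
        split_dir = root / split
        if not split_dir.is_dir():
            raise FileNotFoundError(f"missing split folder: {split_dir}")
        # ImageFolder treats each immediate subdirectory as one class.
        counts[split] = sum(1 for p in split_dir.iterdir() if p.is_dir())
    return counts


# Demo on a temporary tree with the same shape as the layout above.
with tempfile.TemporaryDirectory() as tmp:
    files = [
        ("train", "class1", "img1.jpeg"),
        ("train", "class2", "img2.jpeg"),
        ("val", "class1", "img3.jpeg"),
        ("val", "class2", "img4.jpeg"),
    ]
    for split, cls, img in files:
        folder = Path(tmp, split, cls)
        folder.mkdir(parents=True, exist_ok=True)
        (folder / img).touch()
    demo_counts = count_classes(tmp)

print(demo_counts)
```

For a real dataset, call count_classes on the same path that is passed to --data-path; for the full ImageNet-1k download, both splits should report 1000 class folders.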

Training

To train Conformer-S on ImageNet on a single node with 8 GPUs for 300 epochs, run:

Conformer-S

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
OUTPUT='./output/Conformer_small_patch16_batch_1024_lr1e-3_300epochs'

python -m torch.distributed.launch --master_port 50130 --nproc_per_node=8 --use_env main.py \
                                   --model Conformer_small_patch16 \
                                   --data-set IMNET \
                                   --batch-size 128 \
                                   --lr 0.001 \
                                   --num_workers 4 \
                                   --data-path /data/user/Dataset/ImageNet_ILSVRC2012/ \
                                   --output_dir ${OUTPUT} \
                                   --epochs 300
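
The output directory name mentions a batch of 1024 while the command passes --batch-size 128. A small sketch of the arithmetic that reconciles the two, under the assumption (carried over from DeiT, on which this repository is built) that --batch-size is the per-process batch and each of the 8 processes drives one GPU. The function name effective_batch_size is illustrative only.

```python
def effective_batch_size(per_process_batch, procs_per_node, nodes=1):
    """Total number of samples per optimizer step across all processes.

    Assumes one process per GPU and that the batch-size flag is per process.
    """
    return per_process_batch * procs_per_node * nodes


# Values taken from the command above: --batch-size 128, --nproc_per_node=8.
total = effective_batch_size(128, 8)
print(total)  # 1024
```

If fewer GPUs are available, the same arithmetic shows how far the effective batch drops, which is worth keeping in mind when reusing the learning rate from the command above.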

Model Zoo

Model	Parameters	MACs	Top-1 Acc	Link
Conformer-Ti	23.5 M	5.2 G	81.3 %	baidu(code: hzhm) google
Conformer-S	37.7 M	10.6 G	83.4 %	baidu(code: qvu8) google
Conformer-B	83.3 M	23.3 G	84.1 %	baidu(code: b4z9) google

Citation

@article{peng2021conformer,
      title={Conformer: Local Features Coupling Global Representations for Visual Recognition}, 
      author={Zhiliang Peng and Wei Huang and Shanzhi Gu and Lingxi Xie and Yaowei Wang and Jianbin Jiao and Qixiang Ye},
      journal={arXiv preprint arXiv:2105.03889},
      year={2021},
}

Owner

Zhiliang Peng
