This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Last update: Jan 03, 2023


Overview

TransFG: A Transformer Architecture for Fine-grained Recognition

Official PyTorch code for the paper: TransFG: A Transformer Architecture for Fine-grained Recognition

An implementation based on DeiT, pretrained on ImageNet-1K with distillation fine-tuning, will be released soon.

Framework

Dependencies:

Python 3.7.3
PyTorch 1.5.1
torchvision 0.6.1
ml_collections
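As a quick sanity check before training, the pinned versions above can be compared against what is installed. The sketch below is only an illustration: `REQUIRED` and `version_mismatches` are hypothetical names that are not part of the repository, and the local build suffix handling (for example `1.5.1+cu101`) is an assumption about how PyTorch reports its version.

```python
from typing import Dict, List, Optional

# Versions copied from the dependency list; ml_collections is listed without a version.
REQUIRED: Dict[str, Optional[str]] = {
    "torch": "1.5.1",
    "torchvision": "0.6.1",
    "ml_collections": None,
}


def version_mismatches(
    installed: Dict[str, str],
    required: Dict[str, Optional[str]] = REQUIRED,
) -> List[str]:
    """Return one message per package that is missing or differs from the listed version."""
    problems = []
    for package, wanted in required.items():
        found = installed.get(package)
        if found is None:
            problems.append(f"{package}: not installed")
        # Ignore a local build suffix such as "+cu101" when comparing.
        elif wanted is not None and found.split("+")[0] != wanted:
            problems.append(f"{package}: found {found}, listed {wanted}")
    return problems
```

The `installed` mapping could be filled from `torch.__version__` and `torchvision.__version__` in the target environment. An empty result means the environment matches the list.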

Usage

1. Download Google pre-trained ViT models

Download the models from this link: ViT-B_16, ViT-B_32...

wget https://storage.googleapis.com/vit_models/imagenet21k/{MODEL_NAME}.npz
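For scripted setups, the same download can be done from Python. This is a minimal sketch using only the standard library. The base URL comes from the wget command above, while the helper names and the `pretrained` target folder are assumptions for illustration, not paths the training script is known to expect.

```python
import urllib.request
from pathlib import Path

# Base URL taken from the wget command above.
BASE_URL = "https://storage.googleapis.com/vit_models/imagenet21k"


def checkpoint_url(model_name: str) -> str:
    """Return the download URL for a checkpoint name such as 'ViT-B_16'."""
    return f"{BASE_URL}/{model_name}.npz"


def download_checkpoint(model_name: str, target_dir: str = "pretrained") -> Path:
    """Download the checkpoint unless the file already exists (hypothetical helper)."""
    target = Path(target_dir) / f"{model_name}.npz"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(checkpoint_url(model_name), str(target))
    return target


# Example (performs a network download on first use):
# path = download_checkpoint("ViT-B_16")
```

Skipping the download when the file is already present keeps repeated setup runs cheap, since the checkpoints are large.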

2. Prepare data

In the paper, we use data from 5 publicly available datasets:

CUB-200-2011
Stanford Cars
Stanford Dogs
NABirds
iNat2017

Please download them from the official websites and put them in the corresponding folders.
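Before launching a run, it can help to confirm that the dataset folders are where you expect them. The README does not spell out the folder layout, so the sketch below takes the root and folder names as arguments. `missing_dataset_dirs` is a hypothetical helper, and the example name `CUB_200_2011` is borrowed from the `--dataset` value used in the training command, on the assumption that the folder is named the same way.

```python
from pathlib import Path
from typing import List


def missing_dataset_dirs(data_root: str, names: List[str]) -> List[str]:
    """Return the dataset folder names that do not exist under data_root."""
    root = Path(data_root)
    return [name for name in names if not (root / name).is_dir()]


# Example: report which folders still need to be downloaded.
# print(missing_dataset_dirs("./data", ["CUB_200_2011"]))
```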

3. Install required packages

Install dependencies with the following command:

pip3 install -r requirements.txt

4. Train

To train TransFG on the CUB-200-2011 dataset with 4 GPUs in FP16 mode for 10,000 steps, run:

CUDA_VISIBLE_DEVICES=0,1,2,3 python3 -m torch.distributed.launch --nproc_per_node=4 train.py --dataset CUB_200_2011 --split overlap --num_steps 10000 --fp16 --name sample_run
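When sweeping over GPU counts or run names, the command above can be assembled programmatically. The following sketch only reproduces the flags that appear in that command. `build_train_command` and `run_training` are hypothetical helpers, not part of the repository, and the sketch assumes that `--nproc_per_node` should equal the number of visible GPUs, as it does in the command above.

```python
import os
import subprocess
from typing import Dict, List, Sequence, Tuple


def build_train_command(
    gpu_ids: Sequence[int],
    dataset: str,
    split: str,
    num_steps: int,
    name: str,
    fp16: bool = True,
) -> Tuple[Dict[str, str], List[str]]:
    """Return (environment, argv) equivalent to the shell command above."""
    # Restrict the visible GPUs the same way the shell prefix does.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    argv = [
        "python3", "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one process per visible GPU
        "train.py",
        "--dataset", dataset,
        "--split", split,
        "--num_steps", str(num_steps),
    ]
    if fp16:
        argv.append("--fp16")
    argv += ["--name", name]
    return env, argv


def run_training(env: Dict[str, str], argv: List[str]) -> None:
    """Launch training and raise if the process exits with an error."""
    subprocess.run(argv, env=env, check=True)
```

Calling `build_train_command([0, 1, 2, 3], "CUB_200_2011", "overlap", 10000, "sample_run")` yields the same arguments as the command above. Passing the result to `run_training` starts the job.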

Citation

If you find our work helpful in your research, please cite it as:

@article{he2021transfg,
  title={TransFG: A Transformer Architecture for Fine-grained Recognition},
  author={He, Ju and Chen, Jieneng and Liu, Shuai and Kortylewski, Adam and Yang, Cheng and Bai, Yutong and Wang, Changhu and Yuille, Alan},
  journal={arXiv preprint arXiv:2103.07976},
  year={2021}
}

Acknowledgement

Many thanks to ViT-pytorch for the PyTorch reimplementation of "An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale".

Owner

Ju He
