RVT: Robust Vision Transformers

This repository contains PyTorch code for Robust Vision Transformers.

For details, see "Rethinking the Design Principles of Robust Vision Transformer" by Xiaofeng Mao, Gege Qi, Yuefeng Chen, Yuan He and Hui Xue.

Usage

First, clone the repository locally:

git clone https://github.com/vtddggg/Robust-Vision-Transformer.git

Then install PyTorch 1.7.0+, torchvision 0.8.1+, and pytorch-image-models (timm) 0.3.2:

conda install -c pytorch pytorch torchvision
pip install timm==0.3.2

We use 4 nodes with 8 GPUs each to train RVT-Ti, RVT-S and RVT-B:

Training RVT-Ti

python -m torch.distributed.launch --nproc_per_node=8 --nnodes=4 main.py --model rvt_tiny --data-path /path/to/imagenet --output_dir output --dist-eval

Training RVT-S

python -m torch.distributed.launch --nproc_per_node=8 --nnodes=4 main.py --model rvt_small --data-path /path/to/imagenet --output_dir output --dist-eval

Training RVT-B

python -m torch.distributed.launch --nproc_per_node=8 --nnodes=4 main.py --model rvt_base --data-path /path/to/imagenet --output_dir output --batch-size 32 --dist-eval
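The commands above set --nnodes=4 but no per-node options. When torch.distributed.launch is started by hand on each machine (that is, assuming no cluster scheduler fills these in for you), it also needs --node_rank, --master_addr and --master_port so the four nodes can find each other. The sketch below shows this for RVT-B. The address 10.0.0.1 and the environment variable names are placeholders chosen for illustration, not values from this repository. It prints the command by default and only launches training when RUN=1 is set.

```shell
# Hypothetical per-node settings: adjust for your own cluster.
NODE_RANK="${NODE_RANK:-0}"              # unique per node: 0, 1, 2 or 3
MASTER_ADDR="${MASTER_ADDR:-10.0.0.1}"   # placeholder address of the rank-0 node
MASTER_PORT="${MASTER_PORT:-29500}"      # any free port, identical on all nodes

# Build the launch command as positional parameters so quoting is preserved.
set -- python -m torch.distributed.launch \
  --nproc_per_node=8 --nnodes=4 \
  --node_rank="$NODE_RANK" \
  --master_addr="$MASTER_ADDR" --master_port="$MASTER_PORT" \
  main.py --model rvt_base --data-path /path/to/imagenet \
  --output_dir output --batch-size 32 --dist-eval

# Dry run by default; set RUN=1 to actually start training.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  echo "$@"
fi
```

Run the same script on every node, changing only NODE_RANK, for example NODE_RANK=1 RUN=1 sh launch.sh on the second machine (launch.sh is a hypothetical file name).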

To train RVT-Ti*, RVT-S* or RVT-B*, add --use_mask and --use_patch_aug to the corresponding command to enable position-aware attention scaling and patch-wise augmentation.
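As a concrete case, the RVT-Ti command with both flags appended would look roughly like the sketch below. DATA_PATH and OUTPUT_DIR are illustrative variable names that default to the same placeholders used in the commands above. The script prints the command by default and only launches training when RUN=1 is set.

```shell
# Placeholders, as in the training commands above; override via the environment.
DATA_PATH="${DATA_PATH:-/path/to/imagenet}"
OUTPUT_DIR="${OUTPUT_DIR:-output}"

# RVT-Ti command plus the two flags that select the starred variant.
set -- python -m torch.distributed.launch \
  --nproc_per_node=8 --nnodes=4 \
  main.py --model rvt_tiny \
  --data-path "$DATA_PATH" --output_dir "$OUTPUT_DIR" --dist-eval \
  --use_mask --use_patch_aug

# Dry run by default; set RUN=1 to actually start training.
if [ "${RUN:-0}" = "1" ]; then
  "$@"
else
  echo "$@"
fi
```

For RVT-S* or RVT-B*, swap rvt_tiny for rvt_small or rvt_base (keeping --batch-size 32 for the base model, as in its command above).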
