Implementation of Convolutional enhanced image Transformer

Last update: Dec 13, 2022

Overview

CeiT : Convolutional enhanced image Transformer

This is an unofficial PyTorch implementation of "Incorporating Convolution Designs into Visual Transformers" (arXiv:2103.11816).

Training :

python train.py -c configs/default.yaml --name "name_of_exp"

Usage :

import torch
from ceit import CeiT

# Dummy batch: one RGB image of size 224 x 224
img = torch.ones([1, 3, 224, 224])

# Default model, without LCA
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]

# Same model with LCA enabled
model = CeiT(image_size = 224, patch_size = 4, num_classes = 100, with_lca = True)
out = model(img)

print("Shape of out :", out.shape)      # [B, num_classes]
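
The value patch_size = 4 may look small for a 224 x 224 input. The sketch below shows the token arithmetic under one assumption: that the model follows the paper's Image-to-Tokens stem, which downsamples the image by a total stride of 4 (a stride-2 convolution followed by a stride-2 max-pool) before patches are extracted. The helper num_patch_tokens and its stem_stride parameter are hypothetical names used for illustration and are not part of this repository; the actual stem in ceit may differ.

```python
def num_patch_tokens(image_size: int, patch_size: int, stem_stride: int = 4) -> int:
    """Count patch tokens for a square input image.

    stem_stride is an assumption taken from the paper's Image-to-Tokens
    description (total downsampling of 4); it is not read from this repo.
    """
    if image_size % stem_stride != 0:
        raise ValueError("image_size must be divisible by stem_stride")
    feature_size = image_size // stem_stride      # side of the stem's feature map
    if feature_size % patch_size != 0:
        raise ValueError("feature map side must be divisible by patch_size")
    per_side = feature_size // patch_size         # patches along one side
    return per_side * per_side


# 224 / 4 = 56 feature-map side, 56 / 4 = 14 patches per side
print(num_patch_tokens(224, 4))  # 196
```

Under this assumption, patch_size = 4 on the downsampled feature map covers the same image area as a 16 x 16 patch on the raw image, giving the same 196 patch tokens as a standard ViT at this resolution.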

Note :

The Layer-wise Class token Attention (LCA) module, enabled with with_lca = True, might not be properly implemented.

Citation :

@misc{yuan2021incorporating,
      title={Incorporating Convolution Designs into Visual Transformers}, 
      author={Kun Yuan and Shaopeng Guo and Ziwei Liu and Aojun Zhou and Fengwei Yu and Wei Wu},
      year={2021},
      eprint={2103.11816},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgement :

Base ViT code is borrowed from @lucidrains' repo : https://github.com/lucidrains/vit-pytorch
Training and dataloader code is borrowed from @jeonsworld's repo : https://github.com/jeonsworld/ViT-pytorch

Owner

Rishikesh (ऋषिकेश)
