This is a re-implementation of TransGAN: Two Pure Transformers Can Make One Strong GAN (CVPR 2021) in PyTorch.

Last update: Jan 05, 2023

Overview

TransGAN: Two Transformers Can Make One Strong GAN [YouTube Video]

Paper Authors: Yifan Jiang, Shiyu Chang, Zhangyang Wang

CVPR 2021

This is a PyTorch re-implementation of "TransGAN: Two Transformers Can Make One Strong GAN, and That Can Scale Up" (CVPR 2021).

The Generative Adversarial Network (GAN) here is built completely free of convolutions. Both networks use Transformer architectures, which have become popular in vision since the Vision Transformer (ViT). This implementation uses the CIFAR-10 dataset.

[Figure: generated samples at epochs 0, 40, 100, and 200]

Related Work - Vision Transformers (ViT)

This implementation uses a Vision Transformer (ViT) block as the discriminator. For more information about ViT, see the original paper (Dosovitskiy et al., listed under Citation).

Credits for illustration of ViT: @lucidrains

Installation

Before running train.py, make sure the libraries listed in requirements.txt are installed. Also create a ./fid_stat folder and download the fid_stats_cifar10_train.npz file into it. To save your model during training, create a ./checkpoint folder with mkdir checkpoint.
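
The setup steps above can be sketched in Python as a pre-flight check. This is a minimal sketch and not part of the repository: the function name `prepare_workspace` is hypothetical, and only the folder and file names (`fid_stat`, `checkpoint`, `fid_stats_cifar10_train.npz`) come from the instructions above.

```python
from pathlib import Path


def prepare_workspace(root="."):
    """Create the folders named in the instructions and check for the FID stats.

    Hypothetical helper: only the folder and file names come from the README.
    Returns True if the FID statistics file is already in place.
    """
    root = Path(root)
    fid_dir = root / "fid_stat"
    ckpt_dir = root / "checkpoint"

    # exist_ok=True makes the call safe to repeat.
    fid_dir.mkdir(parents=True, exist_ok=True)
    ckpt_dir.mkdir(parents=True, exist_ok=True)

    stats_file = fid_dir / "fid_stats_cifar10_train.npz"
    return stats_file.is_file()


if __name__ == "__main__":
    if not prepare_workspace():
        print("Missing ./fid_stat/fid_stats_cifar10_train.npz; download it before training.")
```

Running this from the repository root before python train.py creates both folders and reports whether the statistics file still needs to be downloaded.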

Training

python train.py

Pretrained Model

A pretrained model is available here. You can download it using:

wget https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view

curl gdrive.sh | bash -s https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view
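
A Google Drive link ending in /view points to a preview page, so fetching it directly with wget typically saves HTML rather than the checkpoint. As a hedged sketch, the file ID can be extracted from the share link and placed in the commonly used `uc?export=download` URL form. The helper names below are hypothetical, the URL form is an assumption about how Google Drive serves files, and large files may still require a confirmation step that this sketch does not handle.

```python
import re
from urllib.parse import urlencode

# Share link taken from the download commands above.
SHARE_URL = "https://drive.google.com/file/d/134GJRMxXFEaZA0dF-aPpDS84YjjeXPdE/view"


def drive_file_id(share_url):
    """Extract the file ID from a '/file/d/<id>/...' Google Drive share link."""
    match = re.search(r"/file/d/([A-Za-z0-9_-]+)", share_url)
    if match is None:
        raise ValueError("not a Google Drive file link: " + share_url)
    return match.group(1)


def direct_download_url(share_url):
    """Build a direct-download URL (assumed form) from a share link."""
    query = urlencode({"export": "download", "id": drive_file_id(share_url)})
    return "https://drive.google.com/uc?" + query


if __name__ == "__main__":
    # The printed URL can be passed to wget or curl.
    print(direct_download_url(SHARE_URL))
```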

License

MIT

Citation

@article{jiang2021transgan,
  title={TransGAN: Two Transformers Can Make One Strong GAN},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={arXiv preprint arXiv:2102.07074},
  year={2021}
}

@article{dosovitskiy2020,
  title={An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale},
  author={Dosovitskiy, Alexey and Beyer, Lucas and Kolesnikov, Alexander and Weissenborn, Dirk and Zhai, Xiaohua and Unterthiner, Thomas and  Dehghani, Mostafa and Minderer, Matthias and Heigold, Georg and Gelly, Sylvain and Uszkoreit, Jakob and Houlsby, Neil},
  journal={arXiv preprint arXiv:2010.11929},
  year={2020}
}

@inproceedings{zhao2020diffaugment,
  title={Differentiable Augmentation for Data-Efficient GAN Training},
  author={Zhao, Shengyu and Liu, Zhijian and Lin, Ji and Zhu, Jun-Yan and Han, Song},
  booktitle={Conference on Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

Comments

GPU memory, Modifying batch size
Hello,

I saw your comment in VITA-Group's implementation of TransGAN and started looking at your implementation here.

Running "python train.py" without modifying anything results in CUDA out of memory; I believe the GPU I'm using cannot handle the model size/training images that you've specified. I tried editing the batch size on lines 35 and 36 of train.py (--gener_batch_size, changing the default from 64 to 32, etc.), but I get this RuntimeError:

Output 0 of UnbindBackward is a view and is being modified inplace. This view is the output of a function that returns multiple views. Such functions do not allow the output views to be modified inplace. You should replace the inplace operation by an out-of-place one.

My two questions are:

How would you suggest modifying the training parameters to deal with GPU running out of memory? and,

Is there a better way to edit the batch size, and what else do I need to change in order for the code to not break when the batch size is changed?

Thanks!
opened by Andrew-X-Wang 10
Create your own FID stats file

Hello and thanks for the implementation. I'm trying to train this model on a different dataset, but to do so I need a custom fid_stats file for my dataset. How can I create it?

opened by IlyasMoutawwakil 2
FID score: nan

Thank you for your contribution. During training, however, the FID score is NaN. I would like to know whether this is expected. Should I make some change to solve this problem?

opened by Jamie-Cheung 1
TransGAN fid problem

Hello, I would like to ask what the difference is between TransGAN-main and TransGAN-master. Can TransGAN-main reproduce results similar to those of the original paper? The results obtained using CIFAR in TransGAN-main are quite different from those in the paper, and the WGAN-EP loss oscillates, so I wanted to ask.

opened by Stephenlove 1
How do you test on your own dataset with the checkpoint.pth generated?

I want to use the saved checkpoint to generate my own results from a test dataset and later use those images to calculate my own evaluation metrics. Please help.

opened by meh-naz 0

Releases(v2.0)

v2.0(Jul 6, 2021)

Higher-quality generated images with TransGAN on the CIFAR-10 dataset.
v1.0(May 31, 2021)

In this version of the re-implementation, the MNIST and CIFAR-10 datasets were used for TransGAN-S.

Owner

Ahmet Sarigun

Yet, another human being!

Paper: https://arxiv.org/abs/2102.07074
