Styleformer - Official Pytorch Implementation

Overview

Styleformer -- Official PyTorch implementation

Styleformer: Transformer based Generative Adversarial Networks with Style Vector(https://arxiv.org/abs/2106.07023)

PWC PWC

Requirements

  • We have done all testing and development using 4 Titan RTX GPUs with 24GB.
  • 64-bit Python 3.7 and PyTorch 1.7.1.
  • Python libraries: pip install click requests tqdm pyspng ninja imageio-ffmpeg==0.4.3. We use the Anaconda3 2020.11 distribution which installs most of these by default.

Pretrained pickle

CIFAR-10 Styleformer-Large with FID 2.82 IS 9.94

STL-10 Styleformer-Medium with FID 20.11 IS 10.16

CelebA Styleformer-Linformer with FID 3.66

LSUN-Church Styleformer-Linformer with FID 7.99

Generating images

Pre-trained networks are stored as *.pkl files that can be referenced using local filenames

# Generate images using pretrained_weight 
python generate.py --outdir=out --seeds=100-105 \
    --network=path_to_pkl_file

Outputs from the above commands are placed under out/*.png, controlled by --outdir. Downloaded network pickles are cached under $HOME/.cache/dnnlib, which can be overridden by setting the DNNLIB_CACHE_DIR environment variable. The default PyTorch extension build directory is $HOME/.cache/torch_extensions, which can be overridden by setting TORCH_EXTENSIONS_DIR.

Preparing datasets

CIFAR-10: Download the CIFAR-10 python version and convert to ZIP archive:

python dataset_tool.py --source=~/downloads/cifar-10-python.tar.gz --dest=~/datasets/cifar10.zip

STL-10: Download the stl-10 dataset 5k training, 100k unlabeled images from STL-10 dataset page and convert to ZIP archive:

python dataset_tool.py --source=~/downloads/cifar-10-python.tar.gz --dest=~/datasets/stl10.zip \
    ---width=48 --height=48

CelebA: Download the CelebA dataset Aligned&Cropped Images from CelebA dataset page and convert to ZIP archive:

python dataset_tool.py --source=~/downloads/cifar-10-python.tar.gz --dest=~/datasets/stl10.zip \
    ---width=64 --height=64

LSUN Church: Download the desired categories(church) from the LSUN project page and convert to ZIP archive:

python dataset_tool.py --source=~/downloads/lsun/raw/church_lmdb --dest=~/datasets/lsunchurch.zip \
    --width=128 --height=128

Training new networks

In its most basic form, training new networks boils down to:

python train.py --outdir=~/training-runs --data=~/mydataset.zip --gpus=1 --batch=32 --cfg=cifar --g_dict=256,64,16 \
    --num_layers=1,2,2 --depth=32
  • --g_dict= it means 'Hidden size' in paper, and it must be match with image resolution.
  • --num_layers= it means 'Layers' in paper, and it must be match with image resolution.
  • --depth=32 it means minimum required depth is 32, described in Section 2 at paper.
  • --linformer=1 apply informer to Styleformer.

Please refer to python train.py --help for the full list. To train STL-10 dataset with same setting at paper, please fix the starting resolution 88 to 1212 at training/networks_Generator.py.

Quality metrics

Quality metrics can be computed after the training:

# Pre-trained network pickle: specify dataset explicitly, print result to stdout.
python calc_metrics.py --metrics=fid50k_full --data=~/datasets/lsunchurch.zip \
    --network=path_to_pretrained_lsunchurch_pkl_file
    
python calc_metrics.py --metrics=is50k --data=~/datasets/lsunchurch.zip \
    --network=path_to_pretrained_lsunchurch_pkl_file    

Citation

If you found our work useful, please don't forget to cite

@misc{park2021styleformer,
      title={Styleformer: Transformer based Generative Adversarial Networks with Style Vector}, 
      author={Jeeseung Park and Younggeun Kim},
      year={2021},
      eprint={2106.07023},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

The code is heavily based on the stylegan2-ada-pytorch implementation

Owner
Jeeseung Park
Machine learning
Jeeseung Park
Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression.

Spatio-Temporal Entropy Model A Pytorch Reproduction of Spatio-Temporal Entropy Model (STEM) for end-to-end leaned video compression. More details can

16 Nov 28, 2022
OpenMMLab Image and Video Editing Toolbox

Introduction MMEditing is an open source image and video editing toolbox based on PyTorch. It is a part of the OpenMMLab project. The master branch wo

OpenMMLab 3.9k Jan 04, 2023
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 EAMLP will come soon Jitto

MenghaoGuo 357 Dec 11, 2022
A novel Engagement Detection with Multi-Task Training (ED-MTT) system

A novel Engagement Detection with Multi-Task Training (ED-MTT) system which minimizes MSE and triplet loss together to determine the engagement level of students in an e-learning environment.

Onur Çopur 12 Nov 11, 2022
PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

PortaSpeech - PyTorch Implementation PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech. Model Size Module Nor

Keon Lee 279 Jan 04, 2023
AAAI-22 paper: SimSR: Simple Distance-based State Representationfor Deep Reinforcement Learning

SimSR Code and dataset for the paper SimSR: Simple Distance-based State Representationfor Deep Reinforcement Learning (AAAI-22). Requirements We assum

7 Dec 19, 2022
Lab Materials for MIT 6.S191: Introduction to Deep Learning

This repository contains all of the code and software labs for MIT 6.S191: Introduction to Deep Learning! All lecture slides and videos are available

Alexander Amini 5.6k Dec 26, 2022
FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset (CVPR2022)

FaceVerse FaceVerse: a Fine-grained and Detail-controllable 3D Face Morphable Model from a Hybrid Dataset Lizhen Wang, Zhiyuan Chen, Tao Yu, Chenguang

Lizhen Wang 219 Dec 28, 2022
Show Me the Whole World: Towards Entire Item Space Exploration for Interactive Personalized Recommendations

HierarchicyBandit Introduction This is the implementation of WSDM 2022 paper : Show Me the Whole World: Towards Entire Item Space Exploration for Inte

yu song 5 Sep 09, 2022
TensorFlow implementation of "A Simple Baseline for Bayesian Uncertainty in Deep Learning"

TensorFlow implementation of "A Simple Baseline for Bayesian Uncertainty in Deep Learning"

YeongHyeon Park 7 Aug 28, 2022
Supplementary materials to "Spin-optomechanical quantum interface enabled by an ultrasmall mechanical and optical mode volume cavity" by H. Raniwala, S. Krastanov, M. Eichenfield, and D. R. Englund, 2022

Supplementary materials to "Spin-optomechanical quantum interface enabled by an ultrasmall mechanical and optical mode volume cavity" by H. Raniwala,

Stefan Krastanov 1 Jan 17, 2022
This repo contains the official implementations of EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis

EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis This repo contains the official implementations of EigenDamage: Structured Prunin

Chaoqi Wang 107 Apr 20, 2022
This repository contains Prior-RObust Bayesian Optimization (PROBO) as introduced in our paper "Accounting for Gaussian Process Imprecision in Bayesian Optimization"

Prior-RObust Bayesian Optimization (PROBO) Introduction, TOC This repository contains Prior-RObust Bayesian Optimization (PROBO) as introduced in our

Julian Rodemann 2 Mar 19, 2022
disentanglement_lib is an open-source library for research on learning disentangled representations.

disentanglement_lib disentanglement_lib is an open-source library for research on learning disentangled representation. It supports a variety of diffe

Google Research 1.3k Dec 28, 2022
TensorFlow Implementation of Unsupervised Cross-Domain Image Generation

Domain Transfer Network (DTN) TensorFlow implementation of Unsupervised Cross-Domain Image Generation. Requirements Python 2.7 TensorFlow 0.12 Pickle

Yunjey Choi 864 Dec 30, 2022
Deep Residual Learning for Image Recognition

Deep Residual Learning for Image Recognition This is a Torch implementation of "Deep Residual Learning for Image Recognition",Kaiming He, Xiangyu Zhan

Kimmy 561 Dec 01, 2022
Python package for downloading ECMWF reanalysis data and converting it into a time series format.

ecmwf_models Readers and converters for data from the ECMWF reanalysis models. Written in Python. Works great in combination with pytesmo. Citation If

TU Wien - Department of Geodesy and Geoinformation 31 Dec 26, 2022
Fang Zhonghao 13 Nov 19, 2022
Nerf pl - NeRF (Neural Radiance Fields) and NeRF in the Wild using pytorch-lightning

nerf_pl Update: an improved NSFF implementation to handle dynamic scene is open! Update: NeRF-W (NeRF in the Wild) implementation is added to nerfw br

AI葵 1.8k Dec 30, 2022
In this repo we reproduce and extend results of Learning in High Dimension Always Amounts to Extrapolation by Balestriero et al. 2021

In this repo we reproduce and extend results of Learning in High Dimension Always Amounts to Extrapolation by Balestriero et al. 2021. Balestriero et

Sean M. Hendryx 1 Jan 27, 2022