StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

Related tags

Deep LearningStackGAN
Overview

StackGAN

Tensorflow implementation for reproducing main results in the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.

Dependencies

python 2.7

TensorFlow 0.12

[Optional] Torch is needed, if use the pre-trained char-CNN-RNN text encoder.

[Optional] skip-thought is needed, if use the skip-thought text encoder.

In addition, please add the project folder to PYTHONPATH and pip install the following packages:

  • prettytensor
  • progressbar
  • python-dateutil
  • easydict
  • pandas
  • torchfile

Data

  1. Download our preprocessed char-CNN-RNN text embeddings for birds and flowers and save them to Data/.
  • [Optional] Follow the instructions reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
  1. Download the birds and flowers image data. Extract them to Data/birds/ and Data/flowers/, respectively.
  2. Preprocess images.
  • For birds: python misc/preprocess_birds.py
  • For flowers: python misc/preprocess_flowers.py

Training

  • The steps to train a StackGAN model on the CUB dataset using our preprocessed data for birds.
    • Step 1: train Stage-I GAN (e.g., for 600 epochs) python stageI/run_exp.py --cfg stageI/cfg/birds.yml --gpu 0
    • Step 2: train Stage-II GAN (e.g., for another 600 epochs) python stageII/run_exp.py --cfg stageII/cfg/birds.yml --gpu 1
  • Change birds.yml to flowers.yml to train a StackGAN model on Oxford-102 dataset using our preprocessed data for flowers.
  • *.yml files are example configuration files for training/testing our models.
  • If you want to try your own datasets, here are some good tips about how to train GAN. Also, we encourage to try different hyper-parameters and architectures, especially for more complex datasets.

Pretrained Model

  • StackGAN for birds trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for flowers trained from char-CNN-RNN text embeddings. Download and save it to models/.
  • StackGAN for birds trained from skip-thought text embeddings. Download and save it to models/ (Just used the same setting as the char-CNN-RNN. We assume better results can be achieved by playing with the hyper-parameters).

Run Demos

  • Run sh demo/flowers_demo.sh to generate flower samples from sentences. The results will be saved to Data/flowers/example_captions/. (Need to download the char-CNN-RNN text encoder for flowers to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run sh demo/birds_demo.sh to generate bird samples from sentences. The results will be saved to Data/birds/example_captions/.(Need to download the char-CNN-RNN text encoder for birds to models/text_encoder/. Note: this text encoder is provided by reedscot/icml2016).
  • Run python demo/birds_skip_thought_demo.py --cfg demo/cfg/birds-skip-thought-demo.yml --gpu 2 to generate bird samples from sentences. The results will be saved to Data/birds/example_captions-skip-thought/. (Need to download vocabulary for skip-thought vectors to Data/skipthoughts/).

Examples for birds (char-CNN-RNN embeddings), more on youtube:

Examples for flowers (char-CNN-RNN embeddings), more on youtube:

Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same discription 😃

Citing StackGAN

If you find StackGAN useful in your research, please consider citing:

@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}

Our follow-up work

References

  • Generative Adversarial Text-to-Image Synthesis Paper Code
  • Learning Deep Representations of Fine-grained Visual Descriptions Paper Code
Owner
Han Zhang
Han Zhang
A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''.

P-tuning A novel method to tune language models. Codes and datasets for paper ``GPT understands, too''. How to use our code We have released the code

THUDM 562 Dec 27, 2022
Implementation of a Transformer, but completely in Triton

Transformer in Triton (wip) Implementation of a Transformer, but completely in Triton. I'm completely new to lower-level neural net code, so this repo

Phil Wang 152 Dec 22, 2022
Specification language for generating Generalized Linear Models (with or without mixed effects) from conceptual models

tisane Tisane: Authoring Statistical Models via Formal Reasoning from Conceptual and Data Relationships TL;DR: Analysts can use Tisane to author gener

Eunice Jun 11 Nov 15, 2022
How to Become More Salient? Surfacing Representation Biases of the Saliency Prediction Model

How to Become More Salient? Surfacing Representation Biases of the Saliency Prediction Model

Bogdan Kulynych 49 Nov 05, 2022
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 Jittor code will come soon

MenghaoGuo 357 Dec 11, 2022
Code for the Population-Based Bandits Algorithm, presented at NeurIPS 2020.

Population-Based Bandits (PB2) Code for the Population-Based Bandits (PB2) Algorithm, from the paper Provably Efficient Online Hyperparameter Optimiza

Jack Parker-Holder 22 Nov 16, 2022
LSTM model trained on a small dataset of 3000 names written in PyTorch

LSTM model trained on a small dataset of 3000 names. Model generates names from model by selecting one out of top 3 letters suggested by model at a time until an EOS (End Of Sentence) character is no

Sahil Lamba 1 Dec 20, 2021
7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

kaggle-hpa-2021-7th-place-solution Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle. A description of the met

8 Jul 09, 2021
Creating Multi Task Models With Keras

Creating Multi Task Models With Keras About The Project! I used the keras and Tensorflow Library, To build a Deep Learning Neural Network to Creating

Srajan Chourasia 4 Nov 28, 2022
Employee-Managment - Company employee registration software in the face recognition system

Employee-Managment Company employee registration software in the face recognitio

Alireza Kiaeipour 7 Jul 10, 2022
NFNets and Adaptive Gradient Clipping for SGD implemented in PyTorch

PyTorch implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping Paper: https://arxiv.org/abs/2102.06171.pdf Original code: htt

Vaibhav Balloli 320 Jan 02, 2023
A keras-based real-time model for medical image segmentation (CFPNet-M)

CFPNet-M: A Light-Weight Encoder-Decoder Based Network for Multimodal Biomedical Image Real-Time Segmentation This repository contains the implementat

268 Nov 27, 2022
Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

Optimum Transformers Accelerated NLP pipelines for fast inference 🚀 on CPU and GPU. Built with 🤗 Transformers, Optimum and ONNX runtime. Installatio

Aleksey Korshuk 115 Dec 16, 2022
Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.

Project Aquarium Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Cep

Aquarist Labs 73 Jul 21, 2022
VR-Caps: A Virtual Environment for Active Capsule Endoscopy

VR-Caps: A Virtual Environment for Capsule Endoscopy Overview We introduce a virtual active capsule endoscopy environment developed in Unity that prov

DeepMIA Lab 90 Dec 27, 2022
Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression", TIP 2020

Tiny Obstacle Discovery by Occlusion-aware Multilayer Regression Official Matlab Implementation for "Tiny Obstacle Discovery by Occlusion-aware Multil

Xuefeng 5 Jan 15, 2022
Controlling a game using mediapipe hand tracking

These scripts use the Google mediapipe hand tracking solution in combination with a webcam in order to send game instructions to a racing game. It features 2 methods of control

3 May 17, 2022
Sharing of contents on mitochondrial encounter networks

mito-network-sharing Sharing of contents on mitochondrial encounter networks Required: R with igraph, brainGraph, ggplot2, and XML libraries; igraph l

Stochastic Biology Group 0 Oct 01, 2021
✨✨✨An awesome open source toolbox for stereo matching.

OpenStereo This is an awesome open source toolbox for stereo matching. Supported Methods: BM SGM(T-PAMI'07) GCNet(ICCV'17) PSMNet(CVPR'18) StereoNet(E

Wang Qingyu 6 Nov 04, 2022