Learning What and Where to Draw

Last update: Nov 18, 2022

###Learning What and Where to Draw

Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

This is the code for our NIPS 2016 paper on text- and location-controllable image synthesis using conditional GANs. Much of the code is adapted from reedscot/icml2016 and dcgan.torch.

####Setup Instructions

You will need to install Torch, CuDNN, stnbhwd and the display package.

####How to train a text to image model:

1. Download the data, including captions, location annotations and pretrained models.
2. Download the birds and humans image data.
3. Modify the `CONFIG` file to point to your data.
4. Run one of the training scripts, e.g. `./scripts/train_cub_keypoints.sh`
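
Before launching a long training run, it can help to confirm that the files the steps above rely on are actually in place. The following is a minimal sketch in plain Python, not part of this repository. The helper name `find_missing` is hypothetical, and the list of paths is only an example that you would extend with whatever data directories you set in CONFIG.

```python
import os


def find_missing(paths):
    """Return the entries of `paths` that do not exist on disk."""
    return [p for p in paths if not os.path.exists(p)]


# Example paths taken from the steps above; add your own data directories.
required = ["CONFIG", "scripts/train_cub_keypoints.sh"]
missing = find_missing(required)
if missing:
    print("Missing before training:", ", ".join(missing))
```

If nothing is printed, every listed path exists and the training script can be started as described in the last step.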

####How to generate samples:

Run `./scripts/run_all_demos.sh`.
HTML files will be generated with results like the following:

* Moving the bird's position via bounding box
* Moving the bird's position via keypoints
* Birds text to image with ground-truth keypoints
* Birds text to image with generated keypoints
* Humans text to image with ground-truth keypoints
* Humans text to image with generated keypoints
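
The keypoint demos move an object by changing where its parts are placed. As a rough illustration of what such a location signal can look like, the sketch below encodes each keypoint as one binary grid per part and shifts the keypoints before re-encoding. It is plain Python, not the Torch code in this repository. The function names, the `(x, y, visible)` keypoint format with coordinates normalized to [0, 1], and the grid size are all assumptions made for the example.

```python
def keypoints_to_grid(keypoints, grid_size=16):
    """Encode K keypoints as K binary grids of size grid_size x grid_size.

    keypoints: list of (x, y, visible), with x and y normalized to [0, 1].
    Returns nested lists indexed as grid[part][row][col].
    """
    grid = [[[0.0] * grid_size for _ in range(grid_size)] for _ in keypoints]
    for k, (x, y, visible) in enumerate(keypoints):
        if not visible:
            continue  # a part that is not visible leaves its channel empty
        col = min(int(x * grid_size), grid_size - 1)
        row = min(int(y * grid_size), grid_size - 1)
        grid[k][row][col] = 1.0
    return grid


def shift_keypoints(keypoints, dx, dy):
    """Translate all keypoints by (dx, dy), clamping to the unit square."""
    def clamp(v):
        return max(0.0, min(1.0, v))
    return [(clamp(x + dx), clamp(y + dy), vis) for x, y, vis in keypoints]


# Two visible parts and one hidden part on a small 4 x 4 grid.
parts = [(0.25, 0.5, True), (0.75, 0.5, True), (0.0, 0.0, False)]
original = keypoints_to_grid(parts, grid_size=4)
moved = keypoints_to_grid(shift_keypoints(parts, 0.25, 0.0), grid_size=4)
```

Here `original` and `moved` differ only in which cell of each visible part's grid is set, which is the kind of change the position-moving demos rely on while the text description stays fixed.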

####Citation

If you find this useful, please cite our work as follows:

@inproceedings{reed2016learning,
  title={Learning What and Where to Draw},
  author={Scott Reed and Zeynep Akata and Santosh Mohan and Samuel Tenka and Bernt Schiele and Honglak Lee},
  booktitle={Advances in Neural Information Processing Systems},
  year={2016}
}
