A simple interface for editing natural photos with generative neural networks.

Last update: Dec 29, 2022

Neural Photo Editor

This repository contains code for the paper "Neural Photo Editing with Introspective Adversarial Networks" and the associated video.

Installation

To run the Neural Photo Editor, you will need:

Python, likely version 2.7. You may be able to use earlier versions of Python 2, but I'm pretty sure there are some incompatibilities with Python 3 in here.
Theano, development version.
lasagne, development version.
I highly recommend cuDNN as speed is key, but it is not a dependency.
numpy, scipy, PIL, Tkinter and tkColorChooser, though your Python distribution likely already includes these.

Running the NPE

By default, the NPE runs on IAN_simple. This is a slimmed-down version of the IAN without MDC or RGB-Beta blocks, which runs without lag on a laptop GPU with ~1GB of memory (GT730M).

If you're on a Windows machine, you will want to create a .theanorc file and at least set the flag floatX=float32 in its [global] section.

If you're on a Linux machine, you can just put THEANO_FLAGS=floatX=float32 in front of the command when you call it.
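
If you would rather set the flag from a launcher script, a minimal sketch follows. It relies only on the fact that THEANO_FLAGS is a comma-separated list of key=value pairs; the helper name with_theano_flags is made up for this example and is not part of the repository.

```python
def with_theano_flags(env, **flags):
    """Return a copy of `env` with `flags` merged into THEANO_FLAGS.

    Hypothetical helper for illustration; not part of this repository.
    """
    merged = dict(env)
    # THEANO_FLAGS is a comma-separated list of key=value pairs.
    existing = merged.get("THEANO_FLAGS", "").split(",")
    pairs = dict(item.split("=", 1) for item in existing if "=" in item)
    pairs.update(flags)  # new values win over existing ones
    merged["THEANO_FLAGS"] = ",".join(
        "%s=%s" % (key, value) for key, value in sorted(pairs.items()))
    return merged


# Possible usage (assumes NPE.py is in the current directory):
#   import os, subprocess, sys
#   env = with_theano_flags(os.environ, floatX="float32")
#   subprocess.call([sys.executable, "NPE.py"], env=env)
```

Building a fresh environment dictionary rather than editing os.environ in place keeps the flag scoped to the one process you launch.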

If you don't have cuDNN, simply change line 56 of the NPE.py file from dnn=True to dnn=False. Note that I presently only have the non-cuDNN option working for IAN_simple.

Then, run the command:

python NPE.py

If you wish to use a different model, simply edit the line with "config path" in the NPE.py file.

You can make use of any model with an inference mechanism (VAE or ALI-based GAN).

Commands

You can paint the image by picking a color and painting on the image, or paint in the latent space canvas (the red and blue tiles below the image).
The long horizontal slider controls the magnitude of the latent brush, and the smaller horizontal slider controls the size of both the latent and the main image brush.
You can select different entries from the subset of the celebA validation set (included in this repository as an .npz) by typing in a number from 0-999 in the bottom left box and hitting "infer."
Use the reset button to return to the ground truth image.
Press "Update" to update the ground-truth image and corresponding reconstruction with the current image. Use "Infer" to return to an original ground truth image from the dataset.
Use the sample button to generate a random latent vector and corresponding image.
Use the scroll wheel to lighten or darken an image patch (equivalent to using a pure white or pure black paintbrush). Note that this automatically returns you to sample mode, and may require hitting "infer" rather than "reset" to get back to photo editing.
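
To make the "pure white or pure black paintbrush" equivalence concrete, here is a rough sketch of blending a square patch toward a target color. This is not the NPE's actual brush code: the function name, the nested-list image layout, and the strength parameter are all assumptions chosen to keep the example dependency-free.

```python
WHITE = [1.0, 1.0, 1.0]
BLACK = [0.0, 0.0, 0.0]


def blend_patch(image, row, col, radius, color, strength):
    """Blend a square patch of `image` toward `color`, in place.

    `image` is a list of rows, each row a list of [r, g, b] floats in [0, 1].
    `strength` is in [0, 1]: 0 leaves the patch alone, 1 replaces it.
    Illustrative only; the editor itself may work differently.
    """
    for r in range(max(0, row - radius), min(len(image), row + radius + 1)):
        for c in range(max(0, col - radius), min(len(image[r]), col + radius + 1)):
            image[r][c] = [(1.0 - strength) * pixel + strength * target
                           for pixel, target in zip(image[r][c], color)]
    return image


# Lightening is blending toward WHITE, darkening is blending toward BLACK.
gray = [[[0.5, 0.5, 0.5] for _ in range(3)] for _ in range(3)]
blend_patch(gray, 1, 1, 0, WHITE, 0.5)
```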

Training an IAN on celebA

You will need Fuel along with the 64x64 version of celebA. See here for instructions on downloading and preparing it.

If you wish to train a model, the IAN.py file contains the model configuration, and the train_IAN.py file contains the training code, which can be run like this:

python train_IAN.py IAN.py

By default, this code will save (and overwrite!) the weights to a .npz file with the same base name as the config file (e.g. "IAN.py -> IAN.npz"), and will output a jsonl log of the training with metrics recorded after every chunk.
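
Because the weights file is overwritten, you may want to back it up before retraining. The sketch below shows the naming rule described above, a backup helper, and a reader for a JSON Lines log. The helper names and the ".bak" suffix are assumptions for this example, and the record keys in your log depend on what the training script writes.

```python
import json
import os
import shutil


def weights_path_for(config_path):
    """Map a config file such as 'IAN.py' to its weights file 'IAN.npz'."""
    root, _ext = os.path.splitext(config_path)
    return root + ".npz"


def backup_weights(config_path, suffix=".bak"):
    """Copy the existing weights file aside; return the copy's path or None."""
    src = weights_path_for(config_path)
    if not os.path.exists(src):
        return None
    dst = src + suffix
    shutil.copy2(src, dst)
    return dst


def read_jsonl(lines):
    """Parse an iterable of JSON Lines text into a list of dicts, skipping blanks."""
    return [json.loads(line) for line in lines if line.strip()]


# Possible usage (the log file name here is a placeholder, not the real one):
#   backup_weights("IAN.py")
#   with open("my_training_log.jsonl") as handle:
#       records = read_jsonl(handle)
```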

Use the --resume=True flag to resume training a model; it will automatically pick up from the most recent epoch.

Sampling the IAN

You can generate a sample and reconstruction+interpolation grid with:

python sample_IAN.py IAN.py

Note that you will need matplotlib to do so.

Known Issues/Bugs

My MADE layer currently only accepts hidden unit sizes that are equal to the size of the latent vector; any other size will present itself as a BAD_PARAM error.

Since the MADE really only acts as an autoregressive randomizer I'm not too worried about this, but it does bear looking into.
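
Until that is fixed, a small guard in your config can turn the opaque BAD_PARAM into a readable message. This is only a sketch: the function and argument names are invented here and do not correspond to anything in the repository.

```python
def check_made_hidden_size(latent_dim, hidden_units):
    """Fail early with a clear message if the MADE hidden size is unsupported.

    Hypothetical helper; call it with whatever values your config uses.
    """
    if hidden_units != latent_dim:
        raise ValueError(
            "MADE hidden size (%d) must equal the latent vector size (%d)"
            % (hidden_units, latent_dim))
    return hidden_units
```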

I messed around with the keywords for get_model; you'll need to deal with these if you wish to run any model other than IAN_simple through the editor.

Everything is presently just dumped into a single, unorganized directory. I'll be adding folders and cleaning things up soon.

Notes

Remainder of the IAN experiments (including SVHN) coming soon.

I've integrated the plat interface which makes the NPE itself independent of framework, so you should be able to run it with Blocks, TensorFlow, PyTorch, PyCaffe, what have you, by modifying the IAN class provided in models.py.
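
As a rough picture of what a replacement class might look like, here is a toy stand-in with no framework behind it. The method names encode_images, sample_at and get_zdim are my assumption about the plat-style convention, so check the IAN class in models.py for the real signatures before relying on them.

```python
class ToyModel(object):
    """Framework-free stand-in for a model wrapper (illustrative only).

    A real version would call into Theano, TensorFlow, PyTorch, etc. inside
    these methods. The method names are assumed, not confirmed by this README.
    """

    def __init__(self, zdim=4):
        self.zdim = zdim

    def get_zdim(self):
        """Size of the latent vector."""
        return self.zdim

    def encode_images(self, images):
        """'Infer' latents: here, just the first zdim values of each flat image."""
        return [list(image[:self.zdim]) for image in images]

    def sample_at(self, z):
        """'Decode' latents: here, just clamp each value into [0, 1]."""
        return [[min(1.0, max(0.0, value)) for value in vector] for vector in z]


model = ToyModel(zdim=2)
latents = model.encode_images([[0.2, 1.5, 0.9]])
images = model.sample_at(latents)
```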

Acknowledgments

This code contains lasagne layers and other goodies adopted from a number of places:

MADE wrapped from the implementation by M. Germain et al: https://github.com/mgermain/MADE
Gaussian Sample layer from Tencia Lee's Recipe: https://github.com/Lasagne/Recipes/blob/master/examples/variational_autoencoder/variational_autoencoder.py
Minibatch Discrimination layer from OpenAI's Improved GAN Techniques: https://github.com/openai/improved-gan
Deconv Layer adapted from Radford's DCGAN: https://github.com/Newmu/dcgan_code
Image-Grid Plotter adopted from AlexMLamb's Discriminative Regularization: https://github.com/vdumoulin/discgen
Metrics_logging and checkpoints adopted from Daniel Maturana's VoxNet: https://github.com/dimatura/voxnet
Plat interface adopted from Tom White's plat: https://github.com/dribnet/plat

Owner

Andy Brock
