Neural style transfer in PyTorch.

Overview

style-transfer-pytorch

An implementation of neural style transfer (A Neural Algorithm of Artistic Style) in PyTorch, supporting CPUs and Nvidia GPUs. It does automatic multi-scale (coarse-to-fine) stylization to produce high-quality high resolution stylizations, even up to print resolution if the GPUs have sufficient memory. If two GPUs are available, they can both be used to increase the maximum resolution. (Using two GPUs is not faster than using one.)

The algorithm has been modified from that in the literature by:

  • Using the PyTorch pre-trained VGG-19 weights instead of the original VGG-19 weights

  • Changing the padding mode of the first layer of VGG-19 to 'replicate', to reduce edge artifacts

  • When using average or L2 pooling, scaling the result by an empirically derived factor to ensure that the magnitude of the result stays the same on average (Gatys et al. (2015) did not do this)

  • Using an approximation to the MSE loss such that its gradient L1 norm is approximately 1 for content and style losses (in order to approximate the effects of gradient normalization, which produces better visual quality)

  • Normalizing the Gram matrices by the number of elements in each feature map channel rather than by the total number of elements (Johnson et al.) or not normalizing (Gatys et al. (2015))

  • Taking an exponential moving average over the iterates to reduce iterate noise (each new scale is initialized with the previous scale's averaged iterate)

  • Warm-starting the Adam optimizer with scaled-up versions of its first and second moment buffers at the beginning of each new scale, to prevent noise from being added to the iterates at the beginning of each scale

  • Using non-equal weights for the style layers to improve visual quality

  • Stylizing the image at progressively larger scales, each greater by a factor of sqrt(2) (this is improved from the multi-scale scheme given in Gatys et al. (2016))

Example outputs (click for the full-sized version)

Installation

Python 3.6+ is required.

PyTorch is required: follow their installation instructions before proceeding. If you do not have an Nvidia GPU, select None for CUDA. On Linux, you can find out your CUDA version using the nvidia-smi command. PyTorch packages for CUDA versions lower than yours will work, but select the highest you can.

To install style-transfer-pytorch, first clone the repository, then run the command:

pip install -e PATH_TO_REPO

This will install the style_transfer CLI tool. style_transfer uses a pre-trained VGG-19 model (Simonyan et al.), which is 548MB in size, and will download it when first run.

If you have a supported GPU and style_transfer is using the CPU, try using the argument --device cuda:0 to force it to try to use the first CUDA GPU. This should print an informative error message.

Basic usage

style_transfer CONTENT_IMAGE STYLE_IMAGE [STYLE_IMAGE ...] [-o OUTPUT_IMAGE]

Input images will be converted to sRGB when loaded, and output images have the sRGB colorspace. If the output image is a TIFF file, it will be written with 16 bits per channel. Alpha channels in the inputs will be ignored.

style_transfer has many optional arguments: run it with the --help argument to see a full list. Particularly notable ones include:

  • --web enables a simple web interface while the program is running that allows you to watch its progress. It runs on port 8080 by default, but you can change it with --port. If you just want to view the current image and refresh it manually, you can go to /image.

  • --devices manually sets the PyTorch device names. It can be set to cpu to force it to run on the CPU on a machine with a supported GPU, or to e.g. cuda:1 (zero indexed) to select the second CUDA GPU. Two GPUs can be specified, for instance --devices cuda:0 cuda:1. style_transfer will automatically use the first visible CUDA GPU, falling back to the CPU, if it is omitted.

  • -s (--end-scale) sets the maximum image dimension (height and width) of the output. A large image (e.g. 2896x2172) can take around fifteen minutes to generate on an RTX 3090 and will require nearly all of its 24GB of memory. Since both memory usage and runtime increase linearly in the number of pixels (quadratically in the value of the --end-scale parameter), users with less GPU memory or who do not want to wait very long are encouraged to use smaller resolutions. The default is 512.

  • -sw (--style-weights) specifies factors for the weighted average of multiple styles if there is more than one style image specified. These factors are automatically normalized to sum to 1. If omitted, the styles will be blended equally.

  • -cw (--content-weight) sets the degree to which features from the content image are included in the output image. The default is 0.015.

  • -tw (--tv-weight) sets the strength of the smoothness prior. The default is 2.

References

  1. L. Gatys, A. Ecker, M. Bethge (2015), "A Neural Algorithm of Artistic Style"

  2. L. Gatys, A. Ecker, M. Bethge, A. Hertzmann, E. Shechtman (2016), "Controlling Perceptual Factors in Neural Style Transfer"

  3. J. Johnson, A. Alahi, L. Fei-Fei (2016), "Perceptual Losses for Real-Time Style Transfer and Super-Resolution"

  4. A. Mahendran, A. Vedaldi (2014), "Understanding Deep Image Representations by Inverting Them"

  5. D. Kingma, J. Ba (2014), "Adam: A Method for Stochastic Optimization"

  6. K. Simonyan, A. Zisserman (2014), "Very Deep Convolutional Networks for Large-Scale Image Recognition"

Owner
Katherine Crowson
Katherine Crowson
Stacked Recurrent Hourglass Network for Stereo Matching

SRH-Net: Stacked Recurrent Hourglass Introduction This repository is supplementary material of our RA-L submission, which helps reviewers to understan

28 Jan 03, 2023
This folder contains the python code of UR5E's advanced forward kinematics model.

This folder contains the python code of UR5E's advanced forward kinematics model. By entering the angle of the joint of UR5e, the detailed coordinates of up to 48 points around the robot arm can be c

Qiang Wang 4 Sep 17, 2022
🧑‍🔬 verify your TEAL program by experiment and observation

Graviton - Testing TEAL with Dry Runs Tutorial Local Installation The following instructions assume that you have make available in your local environ

Algorand 18 Jan 03, 2023
Official PyTorch implementation of "RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on" (IJCAI-ECAI 2022)

RMGN-VITON RMGN: A Regional Mask Guided Network for Parser-free Virtual Try-on In IJCAI-ECAI 2022(short oral). [Paper] [Supplementary Material] Abstra

27 Dec 01, 2022
Adversarial Texture Optimization from RGB-D Scans (CVPR 2020).

AdversarialTexture Adversarial Texture Optimization from RGB-D Scans (CVPR 2020). Scanning Data Download Please refer to data directory for details. B

Jingwei Huang 153 Nov 28, 2022
A tool to estimate time varying instantaneous reproduction number during epidemics

EpiEstim A tool to estimate time varying instantaneous reproduction number during epidemics. It is described in the following paper: @article{Cori2013

MRC Centre for Global Infectious Disease Analysis 78 Dec 19, 2022
Learning to Prompt for Vision-Language Models.

CoOp Paper: Learning to Prompt for Vision-Language Models Authors: Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu CoOp (Context Optimization)

Kaiyang 679 Jan 04, 2023
Code for our paper at ECCV 2020: Post-Training Piecewise Linear Quantization for Deep Neural Networks

PWLQ Updates 2020/07/16 - We are working on getting permission from our institution to release our source code. We will release it once we are granted

54 Dec 15, 2022
Collaborative forensic timeline analysis

Timesketch Table of Contents About Timesketch Getting started Community Contributing About Timesketch Timesketch is an open-source tool for collaborat

Google 2.1k Dec 28, 2022
A motion detection system with RaspberryPi, OpenCV, Python

Human Detection System using Raspberry Pi Functionality Activates a relay on detecting motion. You may need following components to get the expected R

Omal Perera 55 Dec 04, 2022
DC3: A Learning Method for Optimization with Hard Constraints

DC3: A learning method for optimization with hard constraints This repository is by Priya L. Donti, David Rolnick, and J. Zico Kolter and contains the

CMU Locus Lab 57 Dec 26, 2022
Proof of concept GnuCash Webinterface

Proof of Concept GnuCash Webinterface This may one day be a something truly great. Milestones [ ] Browse accounts and view transactions [ ] Record sim

Josh 14 Dec 28, 2022
StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks

StackGAN Pytorch implementation Inception score evaluation StackGAN-v2-pytorch Tensorflow implementation for reproducing main results in the paper Sta

Han Zhang 1.8k Dec 21, 2022
Fully Convolutional Networks for Semantic Segmentation by Jonathan Long*, Evan Shelhamer*, and Trevor Darrell. CVPR 2015 and PAMI 2016.

Fully Convolutional Networks for Semantic Segmentation This is the reference implementation of the models and code for the fully convolutional network

Evan Shelhamer 3.2k Jan 08, 2023
This repository contains the code and models for the following paper.

DC-ShadowNet Introduction This is an implementation of the following paper DC-ShadowNet: Single-Image Hard and Soft Shadow Removal Using Unsupervised

AuAgCu 65 Dec 27, 2022
buildseg is a building extraction plugin of QGIS based on PaddlePaddle.

buildseg buildseg is a building extraction plugin of QGIS based on PaddlePaddle. TODO Extract building on 512x512 remote sensing images. Extract build

Yizhou Chen 11 Sep 26, 2022
The fastai deep learning library

Welcome to fastai fastai simplifies training fast and accurate neural nets using modern best practices Important: This documentation covers fastai v2,

fast.ai 23.2k Jan 07, 2023
Neural Scene Flow Prior (NeurIPS 2021 spotlight)

Neural Scene Flow Prior Xueqian Li, Jhony Kaesemodel Pontes, Simon Lucey Will appear on Thirty-fifth Conference on Neural Information Processing Syste

Lilac Lee 85 Jan 03, 2023
A `Neural = Symbolic` framework for sound and complete weighted real-value logic

Logical Neural Networks LNNs are a novel Neuro = symbolic framework designed to seamlessly provide key properties of both neural nets (learning) and s

International Business Machines 138 Dec 19, 2022
TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning

TransZero++ This repository contains the testing code for the paper "TransZero++: Cross Attribute-guided Transformer for Zero-Shot Learning" submitted

Shiming Chen 6 Aug 16, 2022