An implementation of the paper "A Neural Algorithm of Artistic Style"

Overview

A Neural Algorithm of Artistic Style implementation - Neural Style Transfer

This is an implementation of the research paper "A Neural Algorithm of Artistic Style" written by Leon A. Gatys, Alexander S. Ecker, Matthias Bethge.

Inspiration

The mechanism acting behind perceiving artistic images through biological vision is still unclear among scientists across the world. There exists no proper artificial system that perfectly interprets our visual experiences while understanding art. The method proposed in this paper is a significant step towards explaining how the biological vision might work while perceiving fine art.


Introduction

To quote authors Leon A. Gatys, Alexander S. Ecker, Matthias Bethge, "in light of the striking similarities between performance-optimised artificial neural networks and biological vision, our work offers a path forward to an algorithmic understanding of how humans create and perceive artistic imagery.

The idea of Neural Style Transfer is taking a white noise as an input image, changing the input in such a way that it resembles the content of the content image and the texture/artistic style of the style image to reproduce it as a new artistic stylized image.

We define two distances, one for the content that measures how different the content between the two images is, and one for style that measures how different the style between the two images is. The aim is to transform the white noise input such that the the content-distance and style-distance is minimized (with the content and style image respectively).

Given below are some results from the original implementation


Model Componenets

Our Model architecture follows:

  • We have one module defining two classes responsible for calculating the loss functions for both content and style images and one for applying normalization on the desired values.
  • We have a second module which has three methods under one class NST -
    • A method for image preprocessing.
    • Content and Style Model Representation - We used the feature space provided by the 16 convolutional and 5 pooling layers of the VGG-19 Network. The five style reconstructions were generated by matching the style representations on layer 'conv1_1', 'conv2_1', 'conv3_1', 'conv4_1' and 'conv5_1. The generated style was matched with the content representation on layer 'conv4_2' to transform our input white noise into an image that applied the artistic style from the style image to the content of the content image by minimizing the values for both content and style loss respectively.
    • A method for training - We made a third method that calls the above methods to take content and style inputs from the user, preprocesses it and runs the neural style transfer algorithm on a white noise input image for 300 iterations using the LBFGS as the optimization function to output the generated image that is a combination of the given content and style images.


Implementation Details

  • PIL images have values between 0 and 255, but when transformed into torch tensors, their values are converted to be between 0 and 1. The images need to be resized to have the same dimensions. Neural networks from the torch library are trained with tensor values ranging from 0 to 1. The image_loader() function takes content and style image paths and loads them, creates a white noise input image, and returns the three tensors.
  • The style_model_and_losses() function is responsible for calculating and returning the content and style losses, and adding the content loss and style loss layers immediately after the convolution layer they are detecting.
  • To quote the authors, "To generate the images that mix the content of a photograph with the style of a painting we jointly minimise the distance of a white noise image from the content representation of the photograph in one layer of the network and the style representation of the painting in a number of layers of the CNN". The run_nst() function performs the neural transfer. For each iteration of the networks, an updated input is fed into it and new losses are computed. The backward methods of each loss module is run to dynamicaly compute their gradients. The optimizer requires a “closure()” function, to re-evaluate the module and return the loss.

Note - Owing to computational power limitations, the content and style images are resized to 512x512 when using a GPU or 128x128 when on a CPU. It is advisable to use a GPU for training because Neural Atyle Transfer is computationally very expensive.

Usage Guidelines

  • Cloning the Repository:

      git clone https://github.com/srijarkoroy/ArtiStyle
    
  • Entering the directory:

      cd ArtiStyle
    
  • Setting up the Python Environment with dependencies:

      pip install -r requirements.txt
    
  • Running the file:

      python3 test.py
    

Note: Before running the test file please ensure that you mention a valid path to a content and style image and also set path='path to save the output image' if you want to save your image

Check out the demo notebook here.

Results from implementation

Content Image Style Image Output Image

Contributors

Owner
Srijarko Roy
AI Enthusiast!
Srijarko Roy
Face recognize system

FRS Face_recognize_system This project contains my work that target on solving some problems of FRS: Face detection: Retinaface Face anti-spoofing: Fo

Tran Anh Tuan 4 Nov 18, 2021
A setup script to generate ITK Python Wheels

ITK Python Package This project provides a setup.py script to build ITK Python binary packages and infrastructure to build ITK external module Python

Insight Software Consortium 59 Dec 14, 2022
Easily Process a Batch of Cox Models

ezcox: Easily Process a Batch of Cox Models The goal of ezcox is to operate a batch of univariate or multivariate Cox models and return tidy result. ⏬

Shixiang Wang 15 May 23, 2022
JAX-based neural network library

Haiku: Sonnet for JAX Overview | Why Haiku? | Quickstart | Installation | Examples | User manual | Documentation | Citing Haiku What is Haiku? Haiku i

DeepMind 2.3k Jan 04, 2023
Gym-TORCS is the reinforcement learning (RL) environment in TORCS domain with OpenAI-gym-like interface.

Gym-TORCS Gym-TORCS is the reinforcement learning (RL) environment in TORCS domain with OpenAI-gym-like interface. TORCS is the open-rource realistic

naoto yoshida 400 Dec 27, 2022
CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

CS5242_2021 Neural Networks and Deep Learning, NUS CS5242, 2021 Cloud Machine #1 : Google Colab (Free GPU) Follow this Notebook installation : https:/

Xavier Bresson 165 Oct 25, 2022
RL and distillation in CARLA using a factorized world model

World on Rails Learning to drive from a world on rails Dian Chen, Vladlen Koltun, Philipp Krähenbühl, arXiv techical report (arXiv 2105.00636) This re

Dian Chen 131 Dec 16, 2022
Implementation of TabTransformer, attention network for tabular data, in Pytorch

Tab Transformer Implementation of Tab Transformer, attention network for tabular data, in Pytorch. This simple architecture came within a hair's bread

Phil Wang 420 Jan 05, 2023
Matlab Python Heuristic Battery Opt - SMOP conversion and manual conversion

SMOP is Small Matlab and Octave to Python compiler. SMOP translates matlab to py

Tom Xu 1 Jan 12, 2022
Repositorio oficial del curso IIC2233 Programación Avanzada 🚀✨

IIC2233 - Programación Avanzada Evaluación Las evaluaciones serán efectuadas por medio de actividades prácticas en clases y tareas. Se calculará la no

IIC2233 @ UC 47 Sep 06, 2022
Neural Oblivious Decision Ensembles

Neural Oblivious Decision Ensembles A supplementary code for anonymous ICLR 2020 submission. What does it do? It learns deep ensembles of oblivious di

25 Sep 21, 2022
Fully Connected DenseNet for Image Segmentation

Fully Connected DenseNets for Semantic Segmentation Fully Connected DenseNet for Image Segmentation implementation of the paper The One Hundred Layers

Somshubra Majumdar 84 Oct 31, 2022
This is a project based on ConvNets used to identify whether a road is clean or dirty. We have used MobileNet as our base architecture and the weights are based on imagenet.

PROJECT TITLE: CLEAN/DIRTY ROAD DETECTION USING TRANSFER LEARNING Description: This is a project based on ConvNets used to identify whether a road is

Faizal Karim 3 Nov 06, 2022
Keyword spotting on Arm Cortex-M Microcontrollers

Keyword spotting for Microcontrollers This repository consists of the tensorflow models and training scripts used in the paper: Hello Edge: Keyword sp

Arm Software 1k Dec 30, 2022
A fast, scalable, high performance Gradient Boosting on Decision Trees library, used for ranking, classification, regression and other machine learning tasks for Python, R, Java, C++. Supports computation on CPU and GPU.

Website | Documentation | Tutorials | Installation | Release Notes CatBoost is a machine learning method based on gradient boosting over decision tree

CatBoost 6.9k Jan 04, 2023
We present a regularized self-labeling approach to improve the generalization and robustness properties of fine-tuning.

Overview This repository provides the implementation for the paper "Improved Regularization and Robustness for Fine-tuning in Neural Networks", which

NEU-StatsML-Research 21 Sep 08, 2022
A semismooth Newton method for elliptic PDE-constrained optimization

sNewton4PDEOpt The Python module implements a semismooth Newton method for solving finite-element discretizations of the strongly convex, linear ellip

2 Dec 08, 2022
A Tensorflow based library for Time Series Modelling with Gaussian Processes

Markovflow Documentation | Tutorials | API reference | Slack What does Markovflow do? Markovflow is a Python library for time-series analysis via prob

Secondmind Labs 24 Dec 12, 2022
Get the partition that a file belongs and the percentage of space that consumes

tinos_eisai_sy Get the partition that a file belongs and the percentage of space that consumes (works only with OSes that use the df command) tinos_ei

Konstantinos Patronas 6 Jan 24, 2022
Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions"

Supplemental Code for "ImpressionNet :A Multi view Approach to Predict Socio Facial Impressions" Environment requirement This code is based on Python

Rohan Kumar Gupta 1 Dec 19, 2021