A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

Overview

AnimeGAN

A simple PyTorch Implementation of Generative Adversarial Networks, focusing on anime face drawing.

Randomly Generated Images

The images are generated from a DCGAN model trained on 143,000 anime character faces for 100 epochs.

fake_sample_1

Image Interpolation

Manipulating latent codes, enables the transition from images in the first row to the last row.

transition

Original Images

The images are not clean, some outliers can be observed, which degrades the quality of the generated images.

real_sample

Usage

To run the experiment,

$ python main.py --dataRoot path_to_dataset/ 

The pretrained model for DCGAN are also in this repo, play it inside the jupyter notebook.

anime-faces Dataset

Anime-style images of 126 tags are collected from danbooru.donmai.us using the crawler tool gallery-dl. The images are then processed by a anime face detector python-animeface. The resulting dataset contains ~143,000 anime faces. Note that some of the tags may no longer meaningful after cropping, i.e. the cropped face images under 'uniform' tag may not contain visible parts of uniforms.

How to construct the dataset from scratch ?

Prequisites: gallery-dl, python-animeface

  1. Download anime-style images

    # download 1000 images under the tag "misaka_mikoto"
    gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags=misaka_mikoto"
    
    # in a multi-processing manner
    cat tags.txt | \
    xargs -n 1 -P 12 -I 'tag' \ 
    bash -c ' gallery-dl --images 1000 "https://danbooru.donmai.us/posts?tags=$tag" '
  2. Extract faces from the downloaded images

    import animeface
    from PIL import Image
    
    im = Image.open('images/anime_image_misaka_mikoto.png')
    faces = animeface.detect(im)
    x,y,w,h = faces[0].face.pos
    im = im.crop((x,y,x+w,y+h))
    im.show() # display

I've cleaned the original dataset, the new version of the dataset has 115085 images in 126 tags. You can access the images from:

Non-commercial use please.

Things I've learned

  1. GANs are really hard to train.
  2. DCGAN generally works well, simply add fully-connected layers causes problems.
  3. In my cases, more layers for G yields better images, in the sense that G should be more powerful than D.
  4. Add noise to D's inputs and labels helps stablize training.
  5. Use differnet input and generate resolution (64x64 vs 96x96), there seems no obvious difference during training, the generated images are also very similar.
  6. Binray Noise as G's input amazingly works, but the images are not as good as those with Gussian Noise, idea credit to @cwhy ['Binary Noise' here I mean a sequence of {-1,1} generated by bernoulli distribution at p=0.5 ]

I did not carefully verify them, if you are looking for some general GAN tips, see @soumith's ganhacks

Others

  1. This project is heavily influenced by chainer-DCGAN and IllustrationGAN, the codes are mostly borrowed from PyTorch DCGAN example, thanks the authors for the clean codes.
  2. Dependencies: pytorch, torchvision
  3. This is a toy project for me to learn PyTorch and GANs, most importantly, for fun! :) Any feedback is welcome.

@jayleicn

Comments
  • KeyError: 'module name can\'t contain

    KeyError: 'module name can\'t contain "."'

    The classes in module.py contains some nn.Module layers whose names contains some'.' in it, so I got error messages like the title, so how could I play it??

    opened by bolin12 2
  • no image under some tags

    no image under some tags

    Hi, The dataset from google drive contains 126 tags. However, some folders are emtpy:

    1girl apron blush collarbone hairclip honma_meiko japanese_clothes monochrome necktie nishizumi_miho purple_eyes scarf school_uniform sunglasses

    Is this normal? Thanks

    opened by samrere 1
  • set_sizes_contiguous is not allowed on a Tensor created from .data or .detach().

    set_sizes_contiguous is not allowed on a Tensor created from .data or .detach().

    input.data.resize_(real_cpu.size()).copy_(real_cpu) RuntimeError: set_sizes_contiguous is not allowed on a Tensor created from .data or .detach(). If your intent is to change the metadata of a Tensor (such as sizes / strides / storage / storage_offset) without autograd tracking the change, remove the .data / .detach() call and wrap the change in a with torch.no_grad(): block. For example, change: x.data.set_(y) to: with torch.no_grad(): x.set_(y)

    opened by athulvingt 0
  • can not run

    can not run

    Traceback (most recent call last): File "main.py", line 6, in import torch File "/Library/Python/2.7/site-packages/torch/init.py", line 81, in from torch._C import * RuntimeError: module compiled against API version 0xa but this version of numpy is 0x9

    opened by xtzero 0
  • How do I start with my own model

    How do I start with my own model

    I would like to know how I can use this image generation to generate my own images from a self made model?

    Where can I read upon on this. I find no concrete info on making the models.

    opened by quintendewilde 0
Owner
Jie Lei 雷杰
UNC CS PhD student, vision+language.
Jie Lei 雷杰
[AAAI22] Reliable Propagation-Correction Modulation for Video Object Segmentation

Reliable Propagation-Correction Modulation for Video Object Segmentation (AAAI22) Preview version paper of this work is available at: https://arxiv.or

Xiaohao Xu 70 Dec 04, 2022
Perform zero-order Hankel Transform for an 1D array (float or real valued).

perform zero-order Hankel Transform for an 1D array (float or real valued). An discrete form of Parseval theorem is guaranteed. Suit for iterative problems.

1 Jan 17, 2022
Unofficial keras(tensorflow) implementation of MAE model from Masked Autoencoders Are Scalable Vision Learners

MAE-keras Unofficial keras(tensorflow) implementation of MAE model described in 'Masked Autoencoders Are Scalable Vision Learners'. This work has been

Yewon 11 Jun 12, 2022
Keras-retinanet - Keras implementation of RetinaNet object detection.

Keras RetinaNet Keras implementation of RetinaNet object detection as described in Focal Loss for Dense Object Detection by Tsung-Yi Lin, Priya Goyal,

Fizyr 4.3k Jan 01, 2023
GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks

GPU-accelerated PyTorch implementation of Zero-shot User Intent Detection via Capsule Neural Networks This repository implements a capsule model Inten

Joel Huang 15 Dec 24, 2022
Pytorch implementation of the DeepDream computer vision algorithm

deep-dream-in-pytorch Pytorch (https://github.com/pytorch/pytorch) implementation of the deep dream (https://en.wikipedia.org/wiki/DeepDream) computer

102 Dec 05, 2022
rliable is an open-source Python library for reliable evaluation, even with a handful of runs, on reinforcement learning and machine learnings benchmarks.

Open-source library for reliable evaluation on reinforcement learning and machine learning benchmarks. See NeurIPS 2021 oral for details.

Google Research 529 Jan 01, 2023
Official code for: A Probabilistic Hard Attention Model For Sequentially Observed Scenes

"A Probabilistic Hard Attention Model For Sequentially Observed Scenes" Authors: Samrudhdhi Rangrej, James Clark Accepted to: BMVC'21 A recurrent atte

5 Nov 19, 2022
Algorithms for outlier, adversarial and drift detection

Alibi Detect is an open source Python library focused on outlier, adversarial and drift detection. The package aims to cover both online and offline d

Seldon 1.6k Dec 31, 2022
AI pipelines for Nvidia Jetson Platform

Jetson Multicamera Pipelines Easy-to-use realtime CV/AI pipelines for Nvidia Jetson Platform. This project: Builds a typical multi-camera pipeline, i.

NVIDIA AI IOT 96 Dec 23, 2022
Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Official repository of the paper Learning to Regress 3D Face Shape and Expression from an Image without 3D Supervision

Soubhik Sanyal 689 Dec 25, 2022
Interactive dimensionality reduction for large datasets

BlosSOM 🌼 BlosSOM is a graphical environment for running semi-supervised dimensionality reduction with EmbedSOM. You can use it to explore multidimen

19 Dec 14, 2022
This a classic fintech problem that introduces real life difficulties such as data imbalance. Check out the notebook to find out more!

Credit Card Fraud Detection Introduction Online transactions have become a crucial part of any business over the years. Many of those transactions use

Jonathan Hasbani 0 Jan 20, 2022
Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe

Traductor de señas Traductor de lengua de señas al español basado en Python con Opencv y MedaiPipe Requerimientos 🔧 Python 3.8 o inferior para evitar

Jahaziel Hernandez Hoyos 3 Nov 12, 2022
The official implementation of You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient.

You Only Compress Once: Towards Effective and Elastic BERT Compression via Exploit-Explore Stochastic Nature Gradient (paper) @misc{zhang2021compress,

46 Dec 07, 2022
Proof of concept GnuCash Webinterface

Proof of Concept GnuCash Webinterface This may one day be a something truly great. Milestones [ ] Browse accounts and view transactions [ ] Record sim

Josh 14 Dec 28, 2022
AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models

AI-UPV at IberLEF-2021 DETOXIS task: Toxicity Detection in Immigration-Related Web News Comments Using Transformers and Statistical Models Description

Angel de Paula 0 Jun 08, 2022
Few-Shot-Intent-Detection includes popular challenging intent detection datasets with/without OOS queries and state-of-the-art baselines and results.

Few-Shot-Intent-Detection Few-Shot-Intent-Detection is a repository designed for few-shot intent detection with/without Out-of-Scope (OOS) intents. It

Jian-Guo Zhang 73 Dec 26, 2022
ICLR 2021: Pre-Training for Context Representation in Conversational Semantic Parsing

SCoRe: Pre-Training for Context Representation in Conversational Semantic Parsing This repository contains code for the ICLR 2021 paper "SCoRE: Pre-Tr

Microsoft 28 Oct 02, 2022
DeepAL: Deep Active Learning in Python

DeepAL: Deep Active Learning in Python Python implementations of the following active learning algorithms: Random Sampling Least Confidence [1] Margin

Kuan-Hao Huang 583 Jan 03, 2023