Variational autoencoder for anime face reconstruction

VAE animeface

Introduction

This repository is an exploratory example of training a variational autoencoder (VAE) to extract meaningful feature representations from anime girl face images.

The code architecture is mostly borrowed and adapted from Yann Dubois's disentangling-vae repository, which provides a nice summary and comparison of recently proposed VAE models.

Dataset

The Anime Face Dataset contains 63,632 anime faces, all rescaled to 64x64 for training.

https://raw.githubusercontent.com/Mckinsey666/Anime-Face-Dataset/master/test.jpg

Model

The model used is the one proposed in the paper Understanding disentangling in β-VAE, which is summarized below:

https://github.com/YannDubs/disentangling-vae/raw/master/doc/imgs/architecture.png

I used a Laplace distribution as the target distribution for the reconstruction loss. Yann's code suggests that Bernoulli is generally a better choice, but it appeared to converge slowly in my case. (I did not run a fair comparison, so this is not conclusive.)
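
To make the difference between the two choices concrete, here is a minimal NumPy sketch (not the code from this repository). It assumes pixel values in [0, 1], a Laplace scale fixed to 1 (so the loss reduces to an L1 error up to an additive constant), and a loss summed over pixels and averaged over the batch. The function names are hypothetical.

```python
import numpy as np


def laplace_recon_loss(x, x_hat):
    """L1 reconstruction loss: summed over pixels, averaged over the batch.

    Matches the negative log-likelihood of a Laplace distribution with
    fixed scale 1 (an assumption), up to an additive constant.
    """
    batch_size = x.shape[0]
    per_image = np.abs(x - x_hat).reshape(batch_size, -1).sum(axis=1)
    return per_image.mean()


def bernoulli_recon_loss(x, x_hat, eps=1e-7):
    """Binary cross-entropy: summed over pixels, averaged over the batch."""
    batch_size = x.shape[0]
    p = np.clip(x_hat, eps, 1.0 - eps)  # avoid log(0)
    bce = -(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))
    return bce.reshape(batch_size, -1).sum(axis=1).mean()


# Toy batch: 2 RGB "images" of size 4x4 with pixel values in [0, 1].
x = np.zeros((2, 3, 4, 4))
x_hat = np.full_like(x, 0.5)
print(laplace_recon_loss(x, x_hat))
print(bernoulli_recon_loss(x, x_hat))
```

The two losses live on different scales, which is one reason a comparison between them needs care (for example, re-tuning β for each).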

The loss function is the β-VAE_H objective from β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.
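
As a rough illustration, the β-VAE_H objective adds a KL term, weighted by β, to the reconstruction loss. The sketch below assumes a diagonal Gaussian posterior and a standard normal prior; the value `beta=4.0` is only a placeholder, since the β used for the results here is not stated, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kl(mu, logvar):
    """KL( N(mu, exp(logvar)) || N(0, I) ), summed over latents, batch mean."""
    kl_per_dim = 0.5 * (np.exp(logvar) + mu ** 2 - 1.0 - logvar)
    return kl_per_dim.sum(axis=1).mean()


def beta_vae_h_loss(recon_loss, mu, logvar, beta=4.0):
    """Reconstruction loss plus beta-weighted KL (beta is an assumed value)."""
    return recon_loss + beta * gaussian_kl(mu, logvar)


# Toy posterior parameters for a batch of 2 samples and 10 latents.
mu = np.ones((2, 10))
logvar = np.zeros((2, 10))
print(gaussian_kl(mu, logvar))
print(beta_vae_h_loss(recon_loss=1.0, mu=mu, logvar=logvar))
```

With β = 1 this reduces to the standard VAE objective; larger values put more pressure on the posterior to match the prior.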

Result

The number of latent features is set to 20 (10 Gaussian means and 10 Gaussian log-variances). The VAE is trained for 100 epochs. All data is used for training; no validation or test split is applied.
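
The split into 10 means and 10 log-variances can be sketched as follows. This is an illustrative NumPy version, not the repository's code: it assumes the encoder emits a vector of 20 values per image with the means first, and it draws a latent sample with the usual reparameterization trick.

```python
import numpy as np

LATENT_DIM = 10  # 10 means + 10 log-variances = 20 encoder outputs


def split_encoder_output(stats):
    """Split (batch, 20) encoder outputs into means and log-variances.

    The ordering (means first) is an assumption made for this sketch.
    """
    mu = stats[:, :LATENT_DIM]
    logvar = stats[:, LATENT_DIM:]
    return mu, logvar


def reparameterize(mu, logvar, rng):
    """Sample z = mu + std * eps with eps ~ N(0, I)."""
    std = np.exp(0.5 * logvar)
    eps = rng.standard_normal(mu.shape)
    return mu + std * eps


rng = np.random.default_rng(0)
stats = rng.standard_normal((4, 2 * LATENT_DIM))  # stand-in for encoder output
mu, logvar = split_encoder_output(stats)
z = reparameterize(mu, logvar, rng)
print(mu.shape, logvar.shape, z.shape)
```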

Face reconstruction

results/laplace_betaH_loss/test1_recons.png

results/laplace_betaH_loss/test2_recons.png

results/laplace_betaH_loss/test3_recons.png

Prior space traversal

Based on how the reconstructed faces change while traversing the latent space, we can speculate about the generative property that each latent encodes:

  1. Hair shade
  2. Hair length
  3. Face orientation
  4. Hair color
  5. Face rotation
  6. Bangs, face color
  7. Hair glossiness
  8. Unclear
  9. Eye size & color
  10. Bangs

results/laplace_betaH_loss/test_prior_traversals.png
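
A prior traversal of this kind can be set up roughly as below: one latent is swept over a range of values while the others stay at the prior mean (zero), and each resulting vector is then passed to the decoder. The range of ±2 and the 7 steps are assumptions for illustration, and the decoder call is omitted because it depends on the trained model.

```python
import numpy as np


def prior_traversal_latents(latent_dim=10, n_steps=7, max_abs=2.0):
    """Build latent vectors that vary one dimension at a time.

    Returns an array of shape (latent_dim, n_steps, latent_dim): row d holds
    n_steps latent vectors in which only dimension d differs from zero.
    """
    values = np.linspace(-max_abs, max_abs, n_steps)
    z = np.zeros((latent_dim, n_steps, latent_dim))
    for d in range(latent_dim):
        z[d, :, d] = values
    return z


z_grid = prior_traversal_latents()
print(z_grid.shape)
print(z_grid[3, :, 3])  # values swept along latent 3
```

Decoding `z_grid.reshape(-1, 10)` and arranging the outputs in a 10-row grid would give one row of faces per latent.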

Original faces clustering

Original anime faces are clustered by latent feature: the selected feature is either below the 1st percentile (left 5) or above the 99th percentile (right 5) across all data points, while the remaining latent features are close to each other. Visualization of the original images mostly confirms the speculation above.

results/laplace_betaH_loss/test_original_traversals.png
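
One possible way to make such a selection is sketched below. It assumes the encoded latent means of all images are available as an `(N, 10)` array, and it interprets "close to each other" as "closest to the dataset median in the remaining latents", which is an assumption rather than the exact rule used for the figure.

```python
import numpy as np


def extreme_indices(mu, dim, n_pick=5, low_pct=1.0, high_pct=99.0):
    """Pick images with extreme values in one latent and typical values elsewhere.

    Returns two index arrays: images below the low percentile and images
    above the high percentile of latent `dim`.
    """
    lo, hi = np.percentile(mu[:, dim], [low_pct, high_pct])
    rest = np.delete(mu, dim, axis=1)
    # Distance of the remaining latents from their median (assumed criterion).
    dist = np.linalg.norm(rest - np.median(rest, axis=0), axis=1)

    def pick(mask):
        idx = np.flatnonzero(mask)
        return idx[np.argsort(dist[idx])[:n_pick]]

    return pick(mu[:, dim] < lo), pick(mu[:, dim] > hi)


rng = np.random.default_rng(0)
mu = rng.standard_normal((2000, 10))  # stand-in for encoded latent means
low_idx, high_idx = extreme_indices(mu, dim=0)
print(low_idx, high_idx)
```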

Latent feature diagnosis

The learned latent features are all close to a standard normal distribution and show minimal correlation.

results/laplace_betaH_loss/latent_diagnosis.png
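
A simple numerical check of the same properties could look like the sketch below. It assumes the latent values for the whole dataset are stored in an `(N, 10)` array; random standard normal data stands in for them here, and the function name is hypothetical.

```python
import numpy as np


def diagnose_latents(z):
    """Summarize per-latent mean, standard deviation, and cross-correlation.

    Returns (mean, std, max_abs_corr), where max_abs_corr is the largest
    absolute off-diagonal entry of the correlation matrix.
    """
    mean = z.mean(axis=0)
    std = z.std(axis=0, ddof=1)
    corr = np.corrcoef(z, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return mean, std, np.abs(off_diag).max()


rng = np.random.default_rng(0)
z = rng.standard_normal((5000, 10))  # stand-in for encoded latents
mean, std, max_abs_corr = diagnose_latents(z)
print(mean.round(2), std.round(2), round(float(max_abs_corr), 3))
```

Means near 0, standard deviations near 1, and small off-diagonal correlations would be consistent with the plot above.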

Owner
Minzhe Zhang
Graduate student in UT Southwestern Medical Center. Bioinformatician. Computational biologist.