Variational autoencoder for anime face reconstruction

Overview

VAE animeface

Variational autoencoder for anime face reconstruction

Introduction

This repository is an exploratory example to train a variational autoencoder to extract meaningful feature representations of anime girl face images.

The code architecture is mostly borrowed and modified from Yann Dubois's disentangling-vae repository. It has nice summarization and comparison of the different VAE model proposed recently.

Dataset

Anime Face Dataset contains 63,632 anime faces. (all rescaled to 64x64 in training)

https://raw.githubusercontent.com/Mckinsey666/Anime-Face-Dataset/master/test.jpg

Model

The model used is the one proposed in the paper Understanding disentangling in β-VAE, which is summarized below:

https://github.com/YannDubs/disentangling-vae/raw/master/doc/imgs/architecture.png

I used laplace as the target distribution to calculate the reconstruction loss. From Yann's code, it suggests that bernoulli would generally a better choice, but it looks it converge slowly in my case. (I didn't do a fair comparison to be conclusive)

Loss function used is β-VAEH from β-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework.

Result

Latent feature number is set to 20 (10 gaussian mean, 10 log gaussian variance). VAE model is trained for 100 epochs. All data is used for training, no validation and testing applied.

Face reconstruction

results/laplace_betaH_loss/test1_recons.png

results/laplace_betaH_loss/test2_recons.png

results/laplace_betaH_loss/test3_recons.png

Prior space traversal

Based on the face reconstruction result while traversing across the latent space, we may speculate the generative property of each latent as following:

  1. Hair shade
  2. Hair length
  3. Face orientation
  4. Hair color
  5. Face rotation
  6. Bangs, face color
  7. Hair glossiness
  8. Unclear
  9. Eye size & color
  10. Bangs

results/laplace_betaH_loss/test_prior_traversals.png

Original faces clustering

Original anime faces are clustered based on latent features (selected feature is either below 1% (left 5) or above 99% (right 5) among all data points, while the rest latent features are closeto each other). Visulization of the original images mostly confirms the speculation above.

results/laplace_betaH_loss/test_original_traversals.png

Latent feature diagnosis

Learned latent features are all close to standard normal distribution, and show minimum correlation.

results/laplace_betaH_loss/latent_diagnosis.png

Owner
Minzhe Zhang
Graduate student in UT Southwestern Medical Center. Bioinformatician. Computational biologist.
Minzhe Zhang
Discovering Explanatory Sentences in Legal Case Decisions Using Pre-trained Language Models.

Statutory Interpretation Data Set This repository contains the data set created for the following research papers: Savelka, Jaromir, and Kevin D. Ashl

17 Dec 23, 2022
Train an RL agent to execute natural language instructions in a 3D Environment (PyTorch)

Gated-Attention Architectures for Task-Oriented Language Grounding This is a PyTorch implementation of the AAAI-18 paper: Gated-Attention Architecture

Devendra Chaplot 234 Nov 05, 2022
Using NumPy to solve the equations of fluid mechanics together with Finite Differences, explicit time stepping and Chorin's Projection methods

Computational Fluid Dynamics in Python Using NumPy to solve the equations of fluid mechanics 🌊 🌊 🌊 together with Finite Differences, explicit time

Felix Köhler 4 Nov 12, 2022
Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes (CVPR2021)

RSCD (BS-RSCD & JCD) Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes (CVPR2021) by Zhihang Zhong, Yinqiang Zheng, Imari Sato We co

81 Dec 15, 2022
Code for "Diversity can be Transferred: Output Diversification for White- and Black-box Attacks"

Output Diversified Sampling (ODS) This is the github repository for the NeurIPS 2020 paper "Diversity can be Transferred: Output Diversification for W

50 Dec 11, 2022
Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning “螺旋桨”生物计算工具集

English | 简体中文 Latest News 2021.10.25 Paper "Docking-based Virtual Screening with Multi-Task Learning" is accepted by BIBM 2021. 2021.07.29 PaddleHeli

633 Jan 04, 2023
EfficientMPC - Efficient Model Predictive Control Implementation

efficientMPC Efficient Model Predictive Control Implementation The original algo

Vin 8 Dec 04, 2022
An original implementation of "MetaICL Learning to Learn In Context" by Sewon Min, Mike Lewis, Luke Zettlemoyer and Hannaneh Hajishirzi

MetaICL: Learning to Learn In Context This includes an original implementation of "MetaICL: Learning to Learn In Context" by Sewon Min, Mike Lewis, Lu

Meta Research 141 Jan 07, 2023
Code release for Local Light Field Fusion at SIGGRAPH 2019

Local Light Field Fusion Project | Video | Paper Tensorflow implementation for novel view synthesis from sparse input images. Local Light Field Fusion

1.1k Dec 27, 2022
The implementation of our CIKM 2021 paper titled as: "Cross-Market Product Recommendation"

FOREC: A Cross-Market Recommendation System This repository provides the implementation of our CIKM 2021 paper titled as "Cross-Market Product Recomme

Hamed Bonab 16 Sep 12, 2022
Deep Q-learning for playing chrome dino game

[PYTORCH] Deep Q-learning for playing Chrome Dino

Viet Nguyen 68 Dec 05, 2022
[SIGGRAPH 2020] Attribute2Font: Creating Fonts You Want From Attributes

Attr2Font Introduction This is the official PyTorch implementation of the Attribute2Font: Creating Fonts You Want From Attributes. Paper: arXiv | Rese

Yue Gao 200 Dec 15, 2022
Reimplementation of the paper "Attention, Learn to Solve Routing Problems!" in jax/flax.

JAX + Attention Learn To Solve Routing Problems Reinplementation of the paper Attention, Learn to Solve Routing Problems! using Jax and Flax. Fully su

Gabriela Surita 7 Dec 01, 2022
3D HourGlass Networks for Human Pose Estimation Through Videos

3D-HourGlass-Network 3D CNN Based Hourglass Network for Human Pose Estimation (3D Human Pose) from videos. This was my summer'18 research project. Dis

Naman Jain 51 Jan 02, 2023
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

Swin Transformer 1.4k Dec 30, 2022
Small-bets - Ergodic Experiment With Python

Ergodic Experiment Based on this video. Run this experiment with this command: p

Michael Brant 3 Jan 11, 2022
Self-Guided Contrastive Learning for BERT Sentence Representations

Self-Guided Contrastive Learning for BERT Sentence Representations This repository is dedicated for releasing the implementation of the models utilize

Taeuk Kim 16 Dec 04, 2022
上海交通大学全自动抢课脚本,支持准点开抢与抢课后持续捡漏两种模式。2021/06/08更新。

Welcome to Course-Bullying-in-SJTU-v3.1! 2021/6/8 紧急更新v3.1 更新说明 为了更好地保护用户隐私,将原来用户名+密码的登录方式改为微信扫二维码+cookie登录方式,不再需要配置使用pytesseract。在使用扫码登录模式时,请稍等,二维码将马

87 Sep 13, 2022
Motion planning algorithms commonly used on autonomous vehicles. (path planning + path tracking)

Overview This repository implemented some common motion planners used on autonomous vehicles, including Hybrid A* Planner Frenet Optimal Trajectory Hi

Huiming Zhou 1k Jan 09, 2023
Paper: De-rendering Stylized Texts

Paper: De-rendering Stylized Texts Wataru Shimoda1, Daichi Haraguchi2, Seiichi Uchida2, Kota Yamaguchi1 1CyberAgent.Inc, 2 Kyushu University Accepted

CyberAgent AI Lab 55 Dec 18, 2022