GAN encoders in PyTorch that match PGGAN, StyleGAN v1/v2, and BigGAN. The code also integrates implementations of these GANs.

Overview

MTV-TSA: Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions.

Python 3.7.3 · PyTorch 1.8.1 · Apache-2.0


This is the official code release for "Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions".

The code contains a set of encoders that match pre-trained GANs (PGGAN, StyleGANv1, StyleGANv2, BigGAN) via multi-type latent vectors with two-scale attentions.

Usage

  • train an encoder with center-aligned attentions (for aligned images):

python E_align.py

  • train an encoder with Grad-CAM-based attentions (for misaligned images):

python E_mis_align.py

  • embed real images into the latent space (using StyleGANv1 and w; a minimal sketch follows the note below):

    a. Put real images in './checkpoint/realimg_file/' (the default args.img_dir).

    b. Load the pre-trained encoder from './checkpoint/E/E_blur(case2)_styleganv1_FFHQ_state_dict.pth'.

    c. Then run:

python embedding_img.py

  • discover attribute directions in the latent space: embedded_img_processing.py

Note: pre-trained models should be downloaded first; by default they are saved to './checkpoint/'.
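The embedding step, in code: load the encoder's state dict, map a preprocessed image to w, and optionally edit w along an attribute direction. This is only a minimal sketch; the `Encoder` class name, its `start_features` argument, the image file name, and the direction `d` are hypothetical stand-ins, so check './model/E/E_Blur.py' for the actual API.

    # Minimal embedding sketch -- hypothetical names, see ./model/E/E_Blur.py
    import torch
    from PIL import Image
    from torchvision import transforms

    from model.E.E_Blur import Encoder  # hypothetical class name

    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    E = Encoder(start_features=16).to(device)  # 16 -> 1024x1024 (see Options below)
    E.load_state_dict(torch.load(
        './checkpoint/E/E_blur(case2)_styleganv1_FFHQ_state_dict.pth',
        map_location=device))
    E.eval()

    # Preprocess a real image into the GAN's input range [-1, 1].
    preprocess = transforms.Compose([
        transforms.Resize((1024, 1024)),
        transforms.ToTensor(),
        transforms.Normalize([0.5] * 3, [0.5] * 3),
    ])
    img = preprocess(Image.open('./checkpoint/realimg_file/face.png').convert('RGB'))

    with torch.no_grad():
        w = E(img.unsqueeze(0).to(device))  # latent vector(s) w

    # Attribute editing (embedded_img_processing.py) then amounts to decoding
    # w + alpha * d with the StyleGANv1 generator, for a discovered direction
    # d and scale alpha.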

Metric

  • validate performance (pre-trained GANs and baselines):

    1. Use generations.py to generate reconstructed images (and GAN-generated images if needed).
    2. Files in the directory './baseline/' can help you quickly format images and latent vectors (w).
    3. Put the images to compare into separate directories and run comparing-baseline.py (a stand-alone metric sketch follows this list).
  • ablation study: see './ablations-study/'
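comparing-baseline.py is the repository's own comparison script; as a quick stand-alone check, a single reconstruction can also be scored against its source with the pytorch-ssim package acknowledged at the end of this README. A minimal sketch, with example file paths:

    # Stand-alone SSIM check between a real image and its reconstruction.
    # Uses https://github.com/Po-Hsun-Su/pytorch-ssim; paths are examples.
    import torch
    import pytorch_ssim
    from PIL import Image
    from torchvision import transforms

    to_tensor = transforms.ToTensor()
    real = to_tensor(Image.open('real/0001.png').convert('RGB')).unsqueeze(0)
    recon = to_tensor(Image.open('recon/0001.png').convert('RGB')).unsqueeze(0)

    # pytorch_ssim.ssim expects NCHW tensors; higher is better (max 1.0).
    print('SSIM: %.4f' % pytorch_ssim.ssim(real, recon).item())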

Setup

Encoders

  • Case 1: trains encoders for most pre-trained GANs, at './model/E/E.py' (converges quickly when reconstructing GAN-generated images)
  • Case 2: trains a StyleGANv1 encoder on FFHQ for the ablation study and real-face image processing, at './model/E/E_Blur.py' (margin blur; needs more GPU memory)

Pre-Trained GANs

note: put pre-trained GAN weight files in the './checkpoint/' directory (a loading sanity-check sketch follows the list below)

  • StyleGAN_V1 (should contain 3 files: Gm, Gs, center-tensor):
    • Cat 256:
      • ./checkpoint/stylegan_V1/cat/cat256_Gs_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_Gm_dict.pth
      • ./checkpoint/stylegan_V1/cat/cat256_tensor.pt
    • Car 256: same as above
    • Bedroom 256: same as above
  • StyleGAN_V2 (a single .pth file):
    • FFHQ 1024:
      • ./checkpoint/stylegan_V2/stylegan2_ffhq1024.pth
  • PGGAN (a single .pth file):
    • Horse 256:
      • ./checkpoint/PGGAN/
  • BigGAN (two files: a .pt model and a .json config):
    • Image-Net 256:
      • ./checkpoint/biggan/256/G-256.pt
      • ./checkpoint/biggan/256/biggan-deep-256-config.json
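As a sanity check that the files are in place, the checkpoints load with plain torch.load, and for BigGAN the huggingface pytorch-pretrained-BigGAN package (acknowledged below) can rebuild the model from the config. A sketch, assuming the files are ordinary state dicts saved with torch.save:

    # Sanity-check that checkpoint files load (assumes ordinary state dicts).
    import torch
    from pytorch_pretrained_biggan import BigGAN, BigGANConfig

    # StyleGANv1: three separate files.
    gs_state = torch.load('./checkpoint/stylegan_V1/cat/cat256_Gs_dict.pth')
    gm_state = torch.load('./checkpoint/stylegan_V1/cat/cat256_Gm_dict.pth')
    center = torch.load('./checkpoint/stylegan_V1/cat/cat256_tensor.pt')

    # BigGAN: rebuild the model from its config, then load the weights.
    config = BigGANConfig.from_json_file('./checkpoint/biggan/256/biggan-deep-256-config.json')
    G = BigGAN(config)
    G.load_state_dict(torch.load('./checkpoint/biggan/256/G-256.pt'))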

Options and Setting

note: different GANs require different parameters; set them carefully.

  • choose --mtype: StyleGANv1=1, StyleGANv2=2, PGGAN=3, BigGAN=4
  • choose the encoder's start features (--start_features) carefully; the mapping is 16->1024x1024, 32->512x512, 64->256x256
  • to resume training, set --checkpoint_dir_E to the path where the pre-trained encoder model is saved
  • --checkpoint_dir_GAN is required; for StyleGANv1 it is a directory (containing the 3 files Gm, Gs, center-tensor), for the others it is a file path (.pth or .pt). An example invocation follows the parser below.
    import argparse

    parser = argparse.ArgumentParser(description='the training args')
    parser.add_argument('--iterations', type=int, default=210000)  # epoch = iterations // 30000
    parser.add_argument('--lr', type=float, default=0.0015)
    parser.add_argument('--beta_1', type=float, default=0.0)
    parser.add_argument('--batch_size', type=int, default=2)
    parser.add_argument('--experiment_dir', default=None)
    parser.add_argument('--checkpoint_dir_GAN', default='./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth')  # e.g. ./checkpoint/stylegan_v1/ffhq1024/ (directory) or ./checkpoint/stylegan_v2/stylegan2_ffhq1024.pth or ./checkpoint/biggan/256/G-256.pt
    parser.add_argument('--config_dir', default='./checkpoint/biggan/256/biggan-deep-256-config.json')  # only BigGAN needs it
    parser.add_argument('--checkpoint_dir_E', default=None)
    parser.add_argument('--img_size', type=int, default=1024)
    parser.add_argument('--img_channels', type=int, default=3)  # RGB: 3, grayscale (L): 1
    parser.add_argument('--z_dim', type=int, default=512)  # PGGAN and StyleGANs: 512; BigGAN: 128
    parser.add_argument('--mtype', type=int, default=2)  # StyleGANv1=1, StyleGANv2=2, PGGAN=3, BigGAN=4
    parser.add_argument('--start_features', type=int, default=16)  # 16->1024, 32->512, 64->256
    args = parser.parse_args()
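For example, to train an encoder for BigGAN 256 one would override the StyleGANv2 defaults; all of the flag values below come straight from the comments in the parser above:

python E_align.py --mtype 4 --z_dim 128 --img_size 256 --start_features 64 --checkpoint_dir_GAN ./checkpoint/biggan/256/G-256.pt --config_dir ./checkpoint/biggan/256/biggan-deep-256-config.json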

Pre-trained Model

We offer pre-trained GANs and their corresponding encoders here: models (the default setting is case 1).

GANs:

  • StyleGANv1 (FFHQ1024, Car512, Cat256): each model consists of the 3 files Gm, Gs, and center-tensor.
  • PGGAN and StyleGANv2: a single .pth file holds Gm, Gs, and center-tensor together.
  • BigGAN 128x128, 256x256, and 512x512: each contains a config file and a model (.pt).

Encoders:

  • StyleGANv1 FFHQ (case 2), for real-image embedding and processing.
  • StyleGANv2 LSUN Cat 256: one model from case 1 (Grad-CAM based attentions) and both models from case 2 (Grad-CAM based and center-aligned attentions, for the ablation study).
  • StyleGANv2 FFHQ (case 1)
  • BigGAN-256 (case 1)

If you want to try more GANs, see the pre-trained GANs cited below:

Acknowledgements

Pre-trained GANs:

  • StyleGANv1: https://github.com/podgorskiy/StyleGan.git (converting code for the official pre-trained model: https://github.com/podgorskiy/StyleGAN_Blobless.git)
  • StyleGANv2 and PGGAN: https://github.com/genforce/genforce.git
  • BigGAN: https://github.com/huggingface/pytorch-pretrained-BigGAN

Comparing Works:

  • In-Domain GAN: https://github.com/genforce/idinvert_pytorch
  • pSp: https://github.com/eladrich/pixel2style2pixel
  • ALAE: https://github.com/podgorskiy/ALAE.git

Related Works:

  • Grad-CAM & Grad-CAM++: https://github.com/yizt/Grad-CAM.pytorch
  • SSIM Index: https://github.com/Po-Hsun-Su/pytorch-ssim

Our implementation partly borrows from the above works (ALAE and the related works); we would like to thank their authors.

If you have any questions, please contact us by e-mail ([email protected]). Pull requests and comments are also welcome.

License

The code of this repository is released under the Apache 2.0 license.
The directories models/biggan and models/stylegan2 are provided under the MIT license.

Cite

@misc{yu2021adaptable,
      title={Adaptable GAN Encoders for Image Reconstruction via Multi-type Latent Vectors with Two-scale Attentions}, 
      author={Cheng Yu and Wenmin Wang},
      year={2021},
      eprint={2108.10201},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Simplified Chinese:

How to apply this to face editing
