Unofficial PyTorch Implementation for HifiFace (https://arxiv.org/abs/2106.09965)

Overview

HifiFace — Unofficial Pytorch Implementation

Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 1, pg. 1)

issueBadge starBadge repoSize lastCommit

This repository is an unofficial implementation of the face swapping model proposed by Wang et. al in their paper HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping. This implementation makes use of the Pytorch Lighting library, a light-weight wrapper for PyTorch.

HifiFace Overview

The task of face swapping applies the face and the identity of the source person to the head of the target.

The HifiFace architecture can be broken up into three primary structures. The 3D shape-aware identity extractor, the semantic facial fusion module, and an encoder-decoder structure. A high-level overview of the architecture can be seen in the image below.

Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 2, pg. 3)

Changes from the original paper

Dataset

In the paper, the author used VGGFace2 and Asian-Celeb as the training dataset. Unfortunately, the Asian-Celeb dataset can only be accessed with a Baidu account, which we do not have. Thus, we only use VGGFace2 for our training dateset.

Model

The paper proposes two versions of HifiFace model based on the output image size: 256x256 and 512x512 (referred to as Ours-256 and Ours-512 in the paper). The 512x512 model uses an extra data preprocessing before training. In this open source project, we implement the 256x256 model. For the discriminator, the original paperuses the discriminator from StarGAN v2. Our implementation uses the multi-scale discriminator from SPADE.

Installation

Build Docker Image

git clone https://github.com/mindslab-ai/hififace 
cd hififace
git clone https://github.com/sicxu/Deep3DFaceRecon_pytorch && git clone https://github.com/NVlabs/nvdiffrast && git clone https://github.com/deepinsight/insightface.git
cp -r insightface/recognition/arcface_torch/ Deep3DFaceRecon_pytorch/models/
cp -r insightface/recognition/arcface_torch/ ./model/
rm -rf insightface
cp -rf 3DMM/* Deep3DFaceRecon_pytorch
mv Deep3DFaceRecon_pytorch model/
rm -rf 3DMM
docker build -t hififace:latent .
rm -rf nvdiffrast

This Dockerfile was inspired by @yuzhou164, this issue from Deep3DFaceRecon_pytorch.

Pre-Trained Model for Deep3DFace PyTorch

Follow the guideline in Prepare prerequisite models

Set up at ./mode/Deep3DFaceRecon_pytorch/

Pre-Trained Models for ArcFace

We used official Arcface per-trained pytorch implementation Download pre-trained checkpoint from onedrive (IResNet-100 trained on MS1MV3)

Download HifiFace Pre-Trained Model

google drive link trained on VGGFace2, 300K iterations

Training

Dataset & Preprocessing

Align & Crop

We aligned the face images with the landmark extracted by 3DDFA_V2. The code will be added.

Face Segmentation Map

After finishing aligning the face images, you need to get the face segmentation map for each face images. We used face segmentation model that PSFRGAN provides. You can use their code and pre-trained model.

Dataset Folder Structure

Each face image and the corresponding segmentation map should have the same name and the same relative path from the top-level directory.

face_image_dataset_folder
└───identity1
│   │   image1.png
│   │   image2.png
│   │   ...
│   
└───identity2
│   │   image1.png
│   │   image2.png
│   │   ...
│ 
|   ...

face_segmentation_mask_folder
└───identity1
│   │   image1.png
│   │   image2.png
│   │   ...
│   
└───identity2
│   │   image1.png
│   │   image2.png
│   │   ...
│ 
|   ...

Wandb

Wandb is a powerful tool to manage your model training. Please make a wandb account and a wandb project for training HifiFace with our training code.

Changing the Configuration

  • config/model.yaml

    • dataset.train.params.image_root: directory path to the training dataset images
    • dataset.train.params.parsing_root: directory path to the training dataset parsing images
    • dataset.validation.params.image_root: directory path to the validation dataset images
    • dataset.validation.params.parsing_root: directory path to the validation dataset parsing images
  • config/trainer.yaml

    • checkpoint.save_dir: directory where the checkpoints will be saved
    • wandb: fill out your wandb entity and project name

Run Docker Container

docker run -it --ipc host --gpus all -v /PATH_TO/hififace:/workspace -v /PATH_TO/DATASET/FOLDER:/DATA --name hififace hififace:latent

Run Training Code

python hififace_trainer.py --model_config config/model.yaml --train_config config/trainer.yaml -n hififace

Inference

Single Image

python hififace_inference --gpus 0 --model_config config/model.yaml --model_checkpoint_path hififace_opensouce_299999.ckpt --source_image_path asset/inference_sample/01_source.png --target_image_path asset/inference_sample/01_target.png --output_image_path ./01_result.png

All Posible Pairs of Images in Directory

python hififace_inference --gpus 0 --model_config config/model.yaml --model_checkpoint_path hififace_opensouce_299999.ckpt  --input_directory_path asset/inference_sample --output_image_path ./result.png

Interpolation

# interpolates both the identity and the 3D shape.
python hififace_inference --gpus 0 --model_config config/model.yaml --model_checkpoint_path hififace_opensouce_299999.ckpt --source_image_path asset/inference_sample/01_source.png --target_image_path asset/inference_sample/01_target.png --output_image_path ./01_result_all.gif  --interpolation_all 

# interpolates only the identity.
python hififace_inference --gpus 0 --model_config config/model.yaml --model_checkpoint_path hififace_opensouce_299999.ckpt --source_image_path asset/inference_sample/01_source.png --target_image_path asset/inference_sample/01_target.png --output_image_path ./01_result_identity.gif  --interpolation_identity

# interpolates only the 3D shape.
python hififace_inference --gpus 0 --model_config config/model.yaml --model_checkpoint_path hififace_opensouce_299999.ckpt --source_image_path asset/inference_sample/01_source.png --target_image_path asset/inference_sample/01_target.png --output_image_path ./01_result_3d.gif  --interpolation_3d

Our Results

The results from our pre-trained model.

GIF interpolaiton results from Obama to Trump to Biden back to Obama. The left image interpolates both the identity and the 3D shape. The middle image interpolates only the identity. The right image interpolates only the 3D shape.

To-Do List

  • Pre-processing Code
  • Colab Notebook

License

BSD 3-Clause License.

Implementation Author

Changho Choi @ MINDs Lab, Inc. ([email protected])

Matthew B. Webster @ MINDs Lab, Inc. ([email protected])

Citations

@article{DBLP:journals/corr/abs-2106-09965,
  author    = {Yuhan Wang and
               Xu Chen and
               Junwei Zhu and
               Wenqing Chu and
               Ying Tai and
               Chengjie Wang and
               Jilin Li and
               Yongjian Wu and
               Feiyue Huang and
               Rongrong Ji},
  title     = {HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping},
  journal   = {CoRR},
  volume    = {abs/2106.09965},
  year      = {2021}
}
Owner
MINDs Lab
MINDsLab provides AI platform and various AI engines based on deep machine learning.
MINDs Lab
A Runtime method overload decorator which should behave like a compiled language

strongtyping-pyoverload A Runtime method overload decorator which should behave like a compiled language there is a override decorator from typing whi

20 Oct 31, 2022
Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis

face-vid2vid Usage Dataset Preparation cd datasets wget https://yt-dl.org/downloads/latest/youtube-dl -O youtube-dl chmod a+rx youtube-dl python load_

worstcoder 68 Dec 30, 2022
CoRe: Contrastive Recurrent State-Space Models

CoRe: Contrastive Recurrent State-Space Models This code implements the CoRe model and reproduces experimental results found in Robust Robotic Control

Apple 21 Aug 11, 2022
Cards Against Humanity AI

cah-ai This is a Cards Against Humanity AI implemented using a pre-trained Semantic Search model. How it works A player is described by a combination

Alex Nichol 2 Aug 22, 2022
Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network."

R2RNet Official code of "R2RNet: Low-light Image Enhancement via Real-low to Real-normal Network." Jiang Hai, Zhu Xuan, Ren Yang, Yutong Hao, Fengzhu

77 Dec 24, 2022
Vrcwatch - Supply the local time to VRChat as Avatar Parameters through OSC

English: README-EN.md VRCWatch VRCWatch は、VRChat 内のアバター向けに現在時刻を送信するためのプログラムです。 使

Kosaki Mezumona 17 Nov 30, 2022
Generic template to bootstrap your PyTorch project with PyTorch Lightning, Hydra, W&B, and DVC.

NN Template Generic template to bootstrap your PyTorch project. Click on Use this Template and avoid writing boilerplate code for: PyTorch Lightning,

Luca Moschella 520 Dec 30, 2022
Automatic differentiation with weighted finite-state transducers.

GTN: Automatic Differentiation with WFSTs Quickstart | Installation | Documentation What is GTN? GTN is a framework for automatic differentiation with

100 Dec 29, 2022
The code of NeurIPS 2021 paper "Scalable Rule-Based Representation Learning for Interpretable Classification".

Rule-based Representation Learner This is a PyTorch implementation of Rule-based Representation Learner (RRL) as described in NeurIPS 2021 paper: Scal

Zhuo Wang 53 Dec 17, 2022
This repository for project that can Automate Number Plate Recognition (ANPR) in Morocco Licensed Vehicles. 💻 + 🚙 + 🇲🇦 = 🤖 🕵🏻‍♂️

MoroccoAI Data Challenge (Edition #001) This Reposotory is result of our work in the comepetiton organized by MoroccoAI in the context of the first Mo

SAFOINE EL KHABICH 14 Oct 31, 2022
This is a TensorFlow implementation for C2-Rec

This is a TensorFlow implementation for C2-Rec We refer to the repo SASRec. Requirements requirement.txt Datasets This repo includes Amazon Beauty dat

7 Nov 14, 2022
🦕 NanoSaur is a little tracked robot ROS2 enabled, made for an NVIDIA Jetson Nano

🦕 nanosaur NanoSaur is a little tracked robot ROS2 enabled, made for an NVIDIA Jetson Nano Website: nanosaur.ai Do you need an help? Discord For tech

NanoSaur 162 Dec 09, 2022
Sequence lineage information extracted from RKI sequence data repo

Pango lineage information for German SARS-CoV-2 sequences This repository contains a join of the metadata and pango lineage tables of all German SARS-

Cornelius Roemer 24 Oct 26, 2022
Official PyTorch implementation of Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations Zhenyu Jiang, Yifeng Zhu, Maxwell Svetlik, Kuan Fang, Yu

UT-Austin Robot Perception and Learning Lab 63 Jan 03, 2023
DeepHyper: Scalable Asynchronous Neural Architecture and Hyperparameter Search for Deep Neural Networks

What is DeepHyper? DeepHyper is a software package that uses learning, optimization, and parallel computing to automate the design and development of

DeepHyper Team 214 Jan 08, 2023
LBK 20 Dec 02, 2022
Koç University deep learning framework.

Knet Knet (pronounced "kay-net") is the Koç University deep learning framework implemented in Julia by Deniz Yuret and collaborators. It supports GPU

1.4k Dec 31, 2022
The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021].

Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose Release Notes The offical PyTorch implementation of Neural View Sy

Angtian Wang 20 Oct 09, 2022
The Ludii general game system, developed as part of the ERC-funded Digital Ludeme Project.

The Ludii General Game System Ludii is a general game system being developed as part of the ERC-funded Digital Ludeme Project (DLP). This repository h

Digital Ludeme Project 50 Jan 04, 2023
CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss

CAMoE + Dual SoftMax Loss (DSL): Improving Video-Text Retrieval by Multi-Stream Corpus Alignment and Dual Softmax Loss This is official implement of "

程星 87 Dec 24, 2022