Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022; Official code

Last update: Dec 21, 2022

Overview

Parallel and High-Fidelity Text-to-Lip Generation

This repository is the official PyTorch implementation of our AAAI-2022 paper, in which we propose ParaLip, a model for text-based talking face synthesis.

Video Demos

Video samples can be found on our demo page.

🚀 News:

Feb. 24, 2022: Our new work, NeuralSVB, was accepted by ACL-2022. Project Page.
Dec. 01, 2021: ParaLip was accepted by AAAI-2022.
July 14, 2021: We submitted ParaLip to arXiv.

Environments

conda create -n your_env_name python=3.7
source activate your_env_name 
pip install -r requirements.txt

ParaLip

1. Preparation

Data Preparation

We provide the first frame of each test example for inference. In addition, we include the audio clips of 5 test examples so that talking-lip videos can be generated with a human voice.

a) Download and decompress the TCD-TIMIT dataset, then move it into the data directory:

tar -xvf timit.tar
mv timit data/

b) Run the following script to pack the dataset for inference:

export PYTHONPATH=.
python datasets/lipgen/timit/gen_timit.py --config configs/lipgen/timit/lipgen_timit.yaml

We do not provide the full TCD-TIMIT dataset because of licensing restrictions. You can download it yourself if necessary.
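Before running the packing script, it can help to confirm that the paths named in steps a) and b) exist. The following is a minimal sketch, not part of the repository: the helper name `missing_inputs` is hypothetical, and it only checks the three paths mentioned in this section, assuming it is run from the repository root.

```python
from pathlib import Path


def missing_inputs(repo_root="."):
    """Return the expected paths (relative to repo_root) that do not exist.

    Hypothetical helper: it only knows about the paths named in this README.
    """
    root = Path(repo_root)
    expected = [
        "data/timit",                              # result of `mv timit data/`
        "datasets/lipgen/timit/gen_timit.py",      # packing script from step b)
        "configs/lipgen/timit/lipgen_timit.yaml",  # config passed via --config
    ]
    return [p for p in expected if not (root / p).exists()]
```

Calling `missing_inputs(".")` from the repository root returns an empty list when everything is in place; any path it returns should be fixed before running step b).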

2. Inference Example

CUDA_VISIBLE_DEVICES=0 python tasks/timit_lipgen_task.py --config configs/lipgen/timit/lipgen_timit.yaml --exp_name timit_2 --infer --reset
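The same command can be launched from Python, which is convenient when scripting several runs. This is a sketch under stated assumptions rather than a supported entry point: the function names are hypothetical, the flags are copied from the command above, and setting `PYTHONPATH=.` for inference is an assumption carried over from the data preparation step.

```python
import os
import subprocess
import sys


def build_infer_command(config="configs/lipgen/timit/lipgen_timit.yaml",
                        exp_name="timit_2"):
    """Build the argv list for the inference command shown above."""
    return [
        sys.executable, "tasks/timit_lipgen_task.py",
        "--config", config,
        "--exp_name", exp_name,
        "--infer", "--reset",
    ]


def build_env(gpu="0", base_env=None):
    """Copy the environment and select the GPU to use."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu)
    # Assumption: inference needs the same PYTHONPATH as the packing step.
    env.setdefault("PYTHONPATH", ".")
    return env


def run_inference(gpu="0"):
    """Run inference from the repository root; raises if the command fails."""
    return subprocess.run(build_infer_command(), env=build_env(gpu), check=True)
```

Building the argument list separately from running it makes it easy to print and inspect the exact command before any GPU time is spent.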

We also provide:

The pre-trained ParaLip model for TCD-TIMIT. Remember to put the pre-trained model in the checkpoints/timit_2 directory.
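A quick way to confirm that the download ended up in the right place is to list the experiment directory. This is a minimal sketch: `list_checkpoint_files` is a hypothetical helper, and because this README does not state the checkpoint file names, it lists every file rather than filtering by extension.

```python
from pathlib import Path


def list_checkpoint_files(exp_name="timit_2", checkpoints_root="checkpoints"):
    """Return the sorted file names found in checkpoints/<exp_name>.

    Returns an empty list when the directory is missing or empty.
    """
    exp_dir = Path(checkpoints_root) / exp_name
    if not exp_dir.is_dir():
        return []
    return sorted(p.name for p in exp_dir.iterdir() if p.is_file())
```

An empty result means the directory named by `--exp_name` contains no files, so the pre-trained model has not been placed there yet.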

Citation

@misc{https://doi.org/10.48550/arxiv.2107.06831,
  doi = {10.48550/ARXIV.2107.06831},
  url = {https://arxiv.org/abs/2107.06831},
  author = {Liu, Jinglin and Zhu, Zhiying and Ren, Yi and Huang, Wencan and Huai, Baoxing and Yuan, Nicholas and Zhao, Zhou},
  keywords = {Multimedia (cs.MM), Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences},
  title = {Parallel and High-Fidelity Text-to-Lip Generation},
  publisher = {arXiv},
  year = {2021},
  copyright = {arXiv.org perpetual, non-exclusive license}
}

Comments

How to create the *_PHN.txt for specific sentences?

Hi, I want the mouth to say sentences I specify, so I need to make phoneme files like *_PHN.txt in the timit directory. I would like to ask if there is any tool to do this?

opened by a312863063 1

How to train ParaLip?

I have already tested the pretrained model, but I still cannot train it. I think the repository lacks the training code. Could it be shared in the repository? Thank you!

opened by Zeqing-Wang 2

Releases (v0.1.0-alpha)

v0.1.0-alpha (Apr 24, 2022)

Owner

Zhying
