AudioDVP: Photorealistic Audio-driven Video Portraits

AudioDVP

This is the official implementation of Photorealistic Audio-driven Video Portraits.

Major Requirements

  • Ubuntu >= 18.04
  • PyTorch >= 1.2
  • GCC >= 7.5
  • NVCC >= 10.1
  • FFmpeg (with H.264 support)
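
The requirements above can be checked before building anything. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `meets_minimum`, `tool_output`, `check_environment`) are made up for illustration, the version parsing is a heuristic based on the usual output of `gcc --version` and `nvcc --version`, and the Ubuntu version is not checked.

```python
import re
import shutil
import subprocess
from importlib import metadata


def parse_version(text, pattern=r"(\d+(?:\.\d+)+)"):
    """Return the first dotted version matched by `pattern` as a tuple of ints, or None."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def meets_minimum(found, minimum):
    """Compare version tuples; a version that could not be found never passes."""
    return found is not None and found >= minimum


def tool_output(command):
    """Run `command` and return its combined output, or "" if the tool is not on PATH."""
    if shutil.which(command[0]) is None:
        return ""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.stdout + result.stderr


def check_environment():
    """Return a {requirement: satisfied} mapping for the tools listed above."""
    try:
        torch_version = parse_version(metadata.version("torch"))
    except metadata.PackageNotFoundError:
        torch_version = None
    gcc_version = parse_version(tool_output(["gcc", "--version"]))
    # nvcc prints a copyright year range first, so anchor on the "release X.Y" text.
    nvcc_version = parse_version(
        tool_output(["nvcc", "--version"]), r"release (\d+(?:\.\d+)+)"
    )
    encoders = tool_output(["ffmpeg", "-hide_banner", "-encoders"])
    return {
        "PyTorch >= 1.2": meets_minimum(torch_version, (1, 2)),
        "GCC >= 7.5": meets_minimum(gcc_version, (7, 5)),
        "NVCC >= 10.1": meets_minimum(nvcc_version, (10, 1)),
        "FFmpeg with H.264": "libx264" in encoders or "h264" in encoders,
    }


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(("OK      " if ok else "MISSING ") + requirement)
```

A "MISSING" line only means the check could not confirm the requirement; a tool installed outside PATH would also be reported that way.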

The detailed environment setup is in enviroment.yml. (You don't have to install everything listed there; just install what you need when you encounter an import error.)

Major implementation differences from the original paper

  • The geometry and texture parameters of the 3DMM are now initialized from zero and shared among all samples during fitting, which is more reasonable.

  • OpenCV, rather than PIL, is used for image editing operations.
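
To make the first point concrete, here is a rough sketch of what "shared among all samples" means for the parameter layout. It is not the repository's code. The parameter sizes, and the assumption that expression, pose, and illumination are fitted per frame, are illustrative; the real fitting code would hold these values in PyTorch tensors rather than Python lists.

```python
from dataclasses import dataclass


@dataclass
class ParamDims:
    # Illustrative sizes only (assumption); the real ones depend on the face model files.
    geometry: int = 80      # shape/identity coefficients, shared across frames
    texture: int = 80       # albedo coefficients, shared across frames
    expression: int = 64    # assumed to be fitted per frame
    pose: int = 6           # assumed per frame: 3 rotation + 3 translation
    illumination: int = 27  # assumed per frame: spherical-harmonics lighting


def init_params(num_frames, dims=ParamDims()):
    """Create one zero-initialised shared block plus one block per frame."""
    shared = {
        "geometry": [0.0] * dims.geometry,
        "texture": [0.0] * dims.texture,
    }
    # The per-frame blocks are zero-initialised here only for simplicity.
    per_frame = [
        {
            "expression": [0.0] * dims.expression,
            "pose": [0.0] * dims.pose,
            "illumination": [0.0] * dims.illumination,
        }
        for _ in range(num_frames)
    ]
    return shared, per_frame


def count_unknowns(shared, per_frame):
    """Total number of scalar parameters being optimised."""
    total = sum(len(values) for values in shared.values())
    total += sum(len(values) for frame in per_frame for values in frame.values())
    return total


if __name__ == "__main__":
    shared, per_frame = init_params(num_frames=100)
    print(count_unknowns(shared, per_frame))
```

Under these assumed sizes, 100 frames give 160 + 100 × 97 = 9,860 unknowns, compared with 100 × 257 = 25,700 if every frame carried its own geometry and texture. Sharing also forces a single identity and albedo for the whole video.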

Usage

1. Download face model data

  • Download Basel Face Model 2009. (Register and get 01_MorphableModel.mat.)

  • Download expression basis from 3DFace. (There is an Exp_Pca.bin in CoarseData.)

  • Download auxiliary files from Deep3DFaceReconstruction.

  • Put the data in renderer/data like the structure below.

    renderer/data
    ├── 01_MorphableModel.mat
    ├── Exp_Pca.bin
    ├── BFM_front_idx.mat
    ├── BFM_exp_idx.mat
    ├── facemodel_info.mat
    ├── select_vertex_id.mat
    ├── std_exp.txt
    └── data.mat (generated by step 2 below)
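
The layout above can be verified with a few lines of Python before running step 2. This is a small sketch rather than a script shipped with the repository; `missing_files` is a hypothetical helper, and the file names are copied from the tree above.

```python
from pathlib import Path

# Files that must be downloaded by hand (names taken from the tree above).
REQUIRED_FILES = [
    "01_MorphableModel.mat",
    "Exp_Pca.bin",
    "BFM_front_idx.mat",
    "BFM_exp_idx.mat",
    "facemodel_info.mat",
    "select_vertex_id.mat",
    "std_exp.txt",
]
# Files that only exist after step 2 has been run.
GENERATED_FILES = ["data.mat"]


def missing_files(data_dir, names=REQUIRED_FILES):
    """Return the names from `names` that are not regular files in `data_dir`."""
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    missing = missing_files("renderer/data")
    if missing:
        print("Missing downloads:", ", ".join(missing))
    elif missing_files("renderer/data", GENERATED_FILES):
        print("Downloads are in place; data.mat has not been built yet (see step 2).")
    else:
        print("renderer/data is complete.")
```

Run it from the repository root so that the relative path `renderer/data` resolves.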
    

2. Build data

cd renderer/
python build_data.py

3. Download pretrained model of ATnet

  • The link is here.
  • Put atnet_lstm_18.pth in vendor/ATVGnet/model.

4. Download pretrained ResNet on VGGFace2

  • The link is here.
  • Put resnet50_ft_weight.pkl in weights.

5. Download Trump speech video

  • The link is here. (Video courtesy of The White House.)
  • Put it in data/video.

6. Compile CUDA rasterizer kernel

cd renderer/kernels
python setup.py build_ext --inplace

7. Run demo script

# Explanation of every step is provided.
./scripts/demo.sh

Since we provide both training and inference code, we are not uploading a pretrained model for now. The expected result, generated from the synthesized audio in data/test_audio, is provided in data/sample_result.mp4.

Acknowledgment

This work is built upon a great deal of open-source code and data.

Notification

  • Our method is built upon Deep Video Portraits.
  • Our method adopts a person-specific Audio2Expression module, which is less robust than a universal one trained on a large dataset such as Lip Reading Sentences in the Wild. A universal one is encouraged! Fortunately, our method works quite well on WaveNet-synthesized audio like that provided in data/test_audio.
  • The code has NOT been fully tested on a clean machine.
  • There is a known bug in the rasterizer: in some corner cases, several pixels of the rendered face are black (not assigned any color) due to floating-point error, which I have not been able to fix.
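
Until the rasterizer bug is fixed, one possible workaround is to patch the stray black pixels after rendering. The sketch below is a post-processing idea, not something the repository does: `fill_unassigned_pixels` and its `min_neighbors` threshold are hypothetical. It uses plain nested lists to stay dependency-free; a real pipeline would more likely work on NumPy arrays or use an inpainting routine such as OpenCV's `cv2.inpaint`.

```python
def fill_unassigned_pixels(image, min_neighbors=5):
    """Return a copy of `image` with isolated black pixels filled in.

    `image` is a list of rows, each row a list of (r, g, b) integer tuples.
    A black pixel is treated as a rasterizer hole (and not as background)
    only if at least `min_neighbors` of its 8 neighbours are non-black; it
    then receives the mean colour of those neighbours.
    """
    black = (0, 0, 0)
    height, width = len(image), len(image[0])
    result = [list(row) for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != black:
                continue
            # Collect the non-black pixels in the 3x3 window around (y, x).
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x) and image[ny][nx] != black
            ]
            if len(neighbours) >= min_neighbors:
                result[y][x] = tuple(
                    sum(channel) // len(neighbours) for channel in zip(*neighbours)
                )
    return result
```

The neighbour threshold keeps the black background outside the face untouched, because background pixels are mostly surrounded by other black pixels. It would miss holes that occur in clusters, so this only hides the symptom.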

Disclaimer

We made this code publicly available to benefit the graphics and vision community. Please DO NOT abuse the code for malicious purposes.

Citation

@article{wen2020audiodvp,
    author={Xin Wen and Miao Wang and Christian Richardt and Ze-Yin Chen and Shi-Min Hu},
    journal={IEEE Transactions on Visualization and Computer Graphics}, 
    title={Photorealistic Audio-driven Video Portraits}, 
    year={2020},
    volume={26},
    number={12},
    pages={3457-3466},
    doi={10.1109/TVCG.2020.3023573}
}

License

BSD
