Python code for Lite Audio-Visual Speech Enhancement.

Last update: Dec 01, 2022

Related tags

Deep Learning LAVSE

Overview

Lite Audio-Visual Speech Enhancement (Interspeech 2020)

Introduction

This is the PyTorch implementation of Lite Audio-Visual Speech Enhancement (LAVSE).

We have also put some preprocessed sample data (including enhanced results) in this repository.

The dataset of TMSV (Taiwan Mandarin speech with video) used in LAVSE is released here.

Please cite the following paper if you find the code useful in your research.

@inproceedings{chuang2020lite,
  title={Lite Audio-Visual Speech Enhancement},
  author={Chuang, Shang-Yi and Tsao, Yu and Lo, Chen-Chou and Wang, Hsin-Min},
  booktitle={Proc. Interspeech 2020}
}

Prerequisites

Ubuntu 18.04
Python 3.6
CUDA 10

You can use pip to install the Python dependencies.

pip install -r requirements.txt

Usage

Run the command below; the average PESQ and STOI results will be printed to your terminal.

Remember to start a visdom server (for example, in a screen or tmux session) before running the script, since it is used to record the training loss.

bash run.sh

See run.sh for more details about the commands it runs.
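
Because the training loss is recorded through visdom, it can help to confirm that a server is listening before starting a long run. The snippet below is a minimal sketch and not part of this repository: the helper name `visdom_is_reachable` is hypothetical, and it assumes the server runs on localhost at visdom's default port 8097, which may differ from what run.sh expects.

```python
import socket


def visdom_is_reachable(host="localhost", port=8097, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds.

    Hypothetical helper: 8097 is visdom's default port, and the port
    actually used by run.sh is an assumption here.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolved hosts.
        return False


if __name__ == "__main__":
    if visdom_is_reachable():
        print("visdom server found; you can now run: bash run.sh")
    else:
        print("No server found; start one with: python -m visdom.server")
```

This only checks that something accepts TCP connections on that port; it does not verify that the listener is actually visdom.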

License

The LAVSE work is released under the MIT License.

See LICENSE for more details.

Acknowledgments

Bio-ASP Lab, CITI, Academia Sinica, Taipei, Taiwan
SLAM Lab, IIS, Academia Sinica, Taipei, Taiwan

Owner

Shang-Yi Chuang
