HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks

Last update: Dec 27, 2022

HiFiGAN Denoiser

This is an unofficial PyTorch implementation of the paper HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks.

Citations

@misc{su2020hifigan,
      title={HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks}, 
      author={Jiaqi Su and Zeyu Jin and Adam Finkelstein},
      year={2020},
      eprint={2006.05694},
      archivePrefix={arXiv},
      primaryClass={eess.AS}
}

Requirements

Tested on Python 3.6

pip install -r requirements.txt

Train & Tensorboard

python train.py -c [config yaml file]
tensorboard --logdir log_dir

Inference

python inference.py -p [checkpoint path] -i [input wav path]
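
The inference command above takes a single input file. To denoise a whole folder of recordings, one option is to build one such command per file and run them in turn. The sketch below is only an illustration: it relies solely on the `-p` and `-i` flags shown above, and the helper names `build_inference_commands` and `run_all`, as well as any paths passed to them, are hypothetical and not part of this repository.

```python
import subprocess
import sys
from pathlib import Path


def build_inference_commands(checkpoint, wav_dir, script="inference.py"):
    """Return one argv list per .wav file found directly inside wav_dir.

    Only the -p (checkpoint path) and -i (input wav path) flags from the
    README are used; the helper itself is a hypothetical convenience.
    """
    wav_paths = sorted(Path(wav_dir).glob("*.wav"))
    return [
        [sys.executable, script, "-p", str(checkpoint), "-i", str(wav)]
        for wav in wav_paths
    ]


def run_all(commands):
    """Run the commands one after another, stopping at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

Calling `run_all(build_inference_commands(checkpoint_path, wav_folder))` from the repository root would then process each file sequentially. The model is reloaded for every file, which is simple but slow for large folders.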

Checkpoint :

References

HiFi-GAN: High-Fidelity Denoising and Dereverberation Based on Speech Deep Features in Adversarial Networks
Denoising WaveNet Generator
StarGAN-VC Discriminator
MelGAN Multi-Scale Discriminator
Parallel WaveGAN
HiFi-GAN vocoder's MSD and multi-GPU training code


Owner

Rishikesh (ऋषिकेश)
