ERISHA: Multilingual Multispeaker Expressive Text-to-Speech Library

ERISHA is a multilingual, multispeaker expressive speech synthesis framework. It can transfer expressivity to a speaker's voice even when no expressive speech corpus is available for that speaker. The term ERISHA means "speech" in Sanskrit. The framework includes several deep learning architectures for building the prosody encoder: Global Style Token (GST), Variational Autoencoder (VAE), Gaussian Mixture Variational Autoencoder (GMVAE), and X-vectors.

Currently, the library is in its initial stage of development and will be updated frequently in the coming days.

Stay tuned for more updates. We are open to collaboration!

Installation and Training

Refer to INSTALL for initial setup.

Available recipes

Available Features

Resampling of speech waveforms to a target sampling rate in recipes
Support for training a TTS system for other languages
Support for training a multilingual TTS system
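As an illustration of the resampling feature listed above, here is a minimal sketch of bringing a mono waveform to a target sampling rate. It is not ERISHA's recipe code: the function name `resample_linear` and its parameters are hypothetical, and plain linear interpolation stands in for the band-limited resampler a real recipe would likely use. In particular, it applies no low-pass filter, so downsampling with it can alias.

```python
import math


def resample_linear(samples, orig_sr, target_sr):
    """Resample a mono waveform by linear interpolation (illustrative only).

    Hypothetical helper, not part of ERISHA. No anti-aliasing filter is
    applied, so this is only a sketch of the idea.
    """
    if orig_sr <= 0 or target_sr <= 0:
        raise ValueError("sampling rates must be positive")
    if not samples or orig_sr == target_sr:
        return list(samples)

    # Number of output samples that covers the same duration.
    n_out = max(1, round(len(samples) * target_sr / orig_sr))
    step = orig_sr / target_sr  # input samples advanced per output sample

    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        # Weighted average of the two nearest input samples.
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


# One second of a 440 Hz tone at 44.1 kHz, resampled to 22.05 kHz.
orig_sr, target_sr = 44100, 22050
tone = [math.sin(2 * math.pi * 440 * t / orig_sr) for t in range(orig_sr)]
resampled = resample_linear(tone, orig_sr, target_sr)
print(len(resampled))
```

Resampling every corpus to one rate before feature extraction keeps the acoustic features comparable across speakers and languages, which matters when corpora recorded at different rates are mixed in a multilingual setup.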

Upcoming updates

User documentation
PyTorch Lightning
Multiclass N-pair loss
Cluster sampling for improving latent representation of speaker and expressivity (proposed work)
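For readers unfamiliar with the multiclass N-pair loss listed above, the following is a minimal sketch of the standard formulation for a single anchor: log(1 + sum over negatives of exp(a·n - a·p)). The loss is not yet part of the library, so the function name `n_pair_loss` and the plain-list embeddings are assumptions made for illustration, not ERISHA's API.

```python
import math


def n_pair_loss(anchor, positive, negatives):
    """Multiclass N-pair loss for one anchor embedding (illustrative sketch).

    Hypothetical helper, not part of ERISHA. Embeddings are plain lists of
    floats; similarity is the dot product.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    pos_score = dot(anchor, positive)
    # Margin of each negative over the positive: a.n - a.p
    terms = [dot(anchor, neg) - pos_score for neg in negatives]

    # Stable evaluation of log(exp(0) + sum(exp(t))) via the log-sum-exp trick.
    m = max([0.0] + terms)
    return m + math.log(math.exp(-m) + sum(math.exp(t - m) for t in terms))


anchor = [1.0, 0.0]
positive = [1.0, 0.0]
negatives = [[0.0, 1.0], [-1.0, 0.0]]
loss = n_pair_loss(anchor, positive, negatives)
print(round(loss, 4))
```

The loss shrinks toward zero as the anchor becomes more similar to its positive than to every negative, which is why it is a candidate for separating speaker or expressivity classes in a latent space.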

Acknowledgements

This implementation uses code from the following repos: NVIDIA, Keith Ito, Prem Seetharaman, Chengqi Deng, Dannynis, Jhosimar George Arias Figueroa.

Owner

Ajinkya Kulkarni
