A pytorch implementation of MBNET: MOS PREDICTION FOR SYNTHESIZED SPEECH WITH MEAN-BIAS NETWORK

Last update: Dec 28, 2022

Overview

Pytorch-MBNet

Training

To train a new model, run train.py. The input arguments are:

--data_path: Path to the directory containing all VCC-2018 .wav files and the train/dev/test split files (the files in ./data).
--save_dir: Path to the directory where trained models are saved. Create this directory before training.
--total_steps: Total number of training steps.
--valid_steps: Run validation every valid_steps training updates.
--log_steps: Write to the TensorBoard log every log_steps training updates.
--update_freq: Gradient accumulation factor. The default value is 1 (no accumulation).
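
The sketch below shows one way the flags listed above could fit together. It is not the code in train.py: the parser is a hypothetical re-creation, the integer types are assumptions, and schedule is an illustrative helper. It also assumes that one training step is one optimizer update and that each update consumes update_freq mini-batches.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the documented flags.
    # Only the --update_freq default (1) comes from the README;
    # the int types are assumptions.
    parser = argparse.ArgumentParser(description="Sketch of the train.py flags")
    parser.add_argument("--data_path", required=True)
    parser.add_argument("--save_dir", required=True)
    parser.add_argument("--total_steps", type=int, required=True)
    parser.add_argument("--valid_steps", type=int, required=True)
    parser.add_argument("--log_steps", type=int, required=True)
    parser.add_argument("--update_freq", type=int, default=1)
    return parser


def schedule(total_steps, valid_steps, log_steps, update_freq=1):
    """Return (mini-batches consumed, steps that validate, steps that log)."""
    batches_seen = 0
    validate_at, log_at = [], []
    for step in range(1, total_steps + 1):
        # Assumption: gradients from update_freq mini-batches are
        # accumulated before the single optimizer update of this step.
        batches_seen += update_freq
        if step % valid_steps == 0:
            validate_at.append(step)
        if step % log_steps == 0:
            log_at.append(step)
    return batches_seen, validate_at, log_at


# Parse an explicit argument list instead of sys.argv so the sketch
# can be run as-is; the paths are placeholders.
args = build_parser().parse_args([
    "--data_path", "./data", "--save_dir", "./ckpt",
    "--total_steps", "10", "--valid_steps", "5",
    "--log_steps", "2", "--update_freq", "4",
])
batches, validate_at, log_at = schedule(
    args.total_steps, args.valid_steps, args.log_steps, args.update_freq
)
print(batches, validate_at, log_at)  # 40 [5, 10] [2, 4, 6, 8, 10]
```

With update_freq set to 4, ten training steps read forty mini-batches, so the effective batch size grows without increasing memory use per forward pass.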

Testing

To test on VCC-2018, run test.py. The input arguments are:

--model_path: Path to the saved model.
--idtable_path: Path to the "judge id-number" mapping table file used during training.
--step: The step value under which results are written to the TensorBoard log. It can be the same as the number of training steps.
--split: The data split (valid or test) to evaluate on.
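
A "judge id-number" table presumably maps each judge identifier to the integer index the model saw during training, which would explain why the same file has to be supplied at test time. The sketch below illustrates that idea only. The function names, the example judge ids, and the JSON serialization are assumptions; the actual file format used by this repository is not described here.

```python
import json


def build_id_table(judge_ids):
    """Map each distinct judge id to a consecutive integer, in first-seen order."""
    table = {}
    for judge_id in judge_ids:
        if judge_id not in table:
            table[judge_id] = len(table)
    return table


def encode(judge_ids, table):
    # Raises KeyError for a judge that is missing from the training table.
    return [table[judge_id] for judge_id in judge_ids]


# Hypothetical judge ids seen during training.
train_table = build_id_table(["J07", "J12", "J07", "J03"])

# Round-trip through JSON to mimic saving the table after training
# and loading it again at test time (the format is an assumption).
restored = json.loads(json.dumps(train_table))

print(encode(["J03", "J07"], restored))  # [2, 0]
```

Rebuilding the table from the test split instead of reusing the saved one could assign different indices to the same judges, so the learned per-judge parameters would no longer line up.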

Inference

After training on the VCC data, the model can be used for inference on other data. The input arguments are --data_path, --model_path, and --save_dir, which work like those described above. Note that the bias-net is not used, because this code assumes the ground-truth judge ids are unavailable at inference time.
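
To see why dropping the bias-net still leaves a usable prediction, it helps to treat each judge's score as a mean score plus a judge-specific bias, as the name Mean-Bias Network suggests. The sketch below works through that decomposition on made-up ratings. The function names and the lookup table are illustrative assumptions; in the model both terms are network outputs, not table lookups.

```python
from statistics import mean


def decompose(ratings):
    """Split per-judge ratings of one utterance into a mean score and per-judge biases."""
    mos = mean(ratings.values())
    biases = {judge: score - mos for judge, score in ratings.items()}
    return mos, biases


def predict(mean_pred, biases=None, judge=None):
    # Without a judge id only the mean term is available,
    # which is the inference setting described above.
    if judge is None:
        return mean_pred
    return mean_pred + biases[judge]


# Made-up ratings of a single utterance by three hypothetical judges.
mos, biases = decompose({"J1": 4, "J2": 3, "J3": 5})

print(mos)                         # 4
print(biases)                      # {'J1': 0, 'J2': -1, 'J3': 1}
print(predict(mos))                # 4
print(predict(mos, biases, "J2"))  # 3
```

The judge-specific term only refines the score for a known listener, so the mean term alone is already an estimate of the utterance-level MOS.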

The pre-trained model can be found in ./pre_trained.
