Learning the Beauty in Songs: Neural Singing Voice Beautifier; ACL 2022 (Main conference); Official code

Last update: Dec 30, 2022

Overview

Learning the Beauty in Songs: Neural Singing Voice Beautifier

Jinglin Liu, Chengxi Li, Yi Ren, Zhiying Zhu, Zhou Zhao

Zhejiang University

ACL 2022 Main conference

Project Page

🚧 ⛏️ 🛠️ 👷

This repository is the official PyTorch implementation of our ACL 2022 paper. For now, we have released the code for the SADTW algorithm described in the paper. The full code and data are currently expected to be released by the ACL 2022 conference (before June 2022). Please star the repository and stay tuned!

|--modules
    |--voice_conversion
        |--dtw
            |--enhance_sadtw.py  (Our algorithm)
|--tasks
    |--singing
        |--pitch_alignment_task.py  (Usage example)

🚀 News:

Feb. 24, 2022: Our new work, NeuralSVB, was accepted by ACL 2022. Demo Page.
Dec. 01, 2021: Our recent work DiffSinger was accepted by AAAI 2022.
Sep. 29, 2021: Our recent work PortaSpeech was accepted by NeurIPS 2021.
May 06, 2021: We submitted DiffSinger to arXiv.

Abstract

We are interested in a novel task, singing voice beautifying (SVB). Given the singing voice of an amateur singer, SVB aims to improve the intonation and vocal tone of the voice, while keeping the content and vocal timbre. Current automatic pitch correction techniques are immature, and most of them are restricted to intonation but ignore the overall aesthetic quality. Hence, we introduce Neural Singing Voice Beautifier (NSVB), the first generative model to solve the SVB task, which adopts a conditional variational autoencoder as the backbone and learns the latent representations of vocal tone. In NSVB, we propose a novel time-warping approach for pitch correction: Shape-Aware Dynamic Time Warping (SADTW), which improves the robustness of existing time-warping approaches, to synchronize the amateur recording with the template pitch curve. Furthermore, we propose a latent-mapping algorithm in the latent space to convert the amateur vocal tone to the professional one. Extensive experiments on both Chinese and English songs demonstrate the effectiveness of our methods in terms of both objective and subjective metrics.
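
To make the time-warping idea in the abstract concrete, here is a minimal sketch of classic dynamic time warping (DTW) between two pitch curves. It is not SADTW and not the code in enhance_sadtw.py: the function name `dtw_path`, the absolute-difference cost, and the toy pitch values are assumptions chosen only for illustration. SADTW, as described in the paper, builds on this kind of alignment but compares the shape of the curves rather than raw values.

```python
from typing import List, Sequence, Tuple


def dtw_path(
    source: Sequence[float], template: Sequence[float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Classic DTW between two 1-D curves (hypothetical helper, not SADTW).

    Returns the total alignment cost and the warping path as a list of
    (source_index, template_index) pairs.
    """
    n, m = len(source), len(template)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning source[:i] with template[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Assumed local cost: absolute pitch difference between two frames.
            cost = abs(source[i - 1] - template[j - 1])
            acc[i][j] = cost + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])

    # Backtrack from the end of both curves to recover the warping path.
    path: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),  # advance both curves
            (acc[i - 1][j], i - 1, j),          # advance source only
            (acc[i][j - 1], i, j - 1),          # advance template only
        ]
        _, i, j = min(candidates, key=lambda c: c[0])
    path.reverse()
    return acc[n][m], path


if __name__ == "__main__":
    amateur = [1.0, 2.0, 3.0]          # toy "amateur" pitch curve
    template = [1.0, 2.0, 2.0, 3.0]    # toy "template" pitch curve
    total_cost, path = dtw_path(amateur, template)
    print(total_cost, path)
```

In this toy case the second amateur frame is matched to two template frames, which is how DTW stretches one curve to synchronize it with the other.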

Issues

Before raising an issue, please check our README and other issues for possible solutions.
We will try to handle your problem promptly, but we cannot guarantee a satisfying solution.
Please be friendly.


Comments

Problem with proper data loading

Hi, I'd like to run your model myself, but I cannot find a proper way to load the dataset from the .mp3 files you provided. Could you share the dataloader you used, or give some hints on how to process the .mp3 files into a valid dataset for your usage examples? I'd be very grateful!

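opened by pstryczke 9

About NSVB

After listening to the demo, I have some questions. 1. If this is used in practice to beautify singing, inference requires the pitch curve of the original singer's recording, right? 2. Although the test samples are not in the training set, GT Professional and GT Amateur were recorded by the same person. At inference time, GT Professional cannot be the user themselves, so has generalization to that setting been tested?

opened by suzhenghang 0

hi, request for datasets and source code.

This work is outstanding and we are interested in it. Are there any plans to make the dataset and associated pretrained models public in the near future? Thank you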

opened by hertz-pj 0


Releases(pre-release)

pre-release(May 27, 2022)

Owner

Jinglin Liu
