This repository contains the data used in the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems".

Last update: Dec 04, 2022

Overview

Proteno

This is the data release for the NAACL 2021 paper "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems" (https://arxiv.org/abs/2104.07777).

Security

See CONTRIBUTING for more information.

License

This project is released under CC-BY-NC-4.0 and other licenses, listed by language:

English: CC-BY-SA
Spanish: CC-BY-SA
Tamil: CC-BY-NC-SA

Citation

If you use our data, please cite the following paper:

@inproceedings{tyagi-etal-2021-proteno,
    title = "Proteno: Text Normalization with Limited Data for Fast Deployment in Text to Speech Systems",
    author = "Tyagi, Shubhi  and
      Bonafonte, Antonio  and
      Lorenzo-Trueba, Jaime  and
      Latorre, Javier",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2021.naacl-industry.10",
    pages = "72--79",
}

