Grapheme-to-phoneme (G2P) conversion is the process of generating pronunciation for words based on their written form.

Last update: Nov 16, 2022

Overview

Neural G2P for the Portuguese language

Grapheme-to-phoneme (G2P) conversion is the process of generating the pronunciation of words from their written form. It plays an essential role in natural language processing, text-to-speech synthesis and automatic speech recognition systems. This project was adapted from https://github.com/hajix/G2P.

Dependencies

The following libraries are used:
pytorch
tqdm
matplotlib

Install dependencies using pip:

pip3 install -r requirements.txt

Dataset

The dataset used here was taken from http://www.portaldalinguaportuguesa.org/, with some entries added by me to give better coverage of words common in everyday Brazilian Portuguese. Some ambiguities were also resolved, since this dataset is intended to reflect a specific speaker bias: the dictionary based on São Paulo speakers was chosen.

As in https://github.com/hajix/G2P, on which this implementation is based, you can easily provide and use your own language-specific pronunciation dictionary for training G2P. More details about data preparation and contribution can be found in resources.
Feel free to provide resources for other languages.
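
As a rough illustration of what preparing your own pronunciation dictionary might involve, the sketch below parses a small lexicon into a word-to-phonemes mapping. The file layout (one entry per line, word and space-separated phonemes divided by a tab) and the name parse_lexicon are assumptions made for this example, not the format this project actually requires; check resources for the real layout.

```python
def parse_lexicon(lines):
    """Parse 'word<TAB>p1 p2 p3' lines into a {word: [phonemes]} dict.

    The tab-separated layout is an assumption for illustration only.
    """
    lexicon = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        word, _, phones = line.partition("\t")
        if not phones:
            continue  # skip malformed entries without a pronunciation
        lexicon[word.lower()] = phones.split()
    return lexicon


# Hypothetical entries; a real dictionary would be read from a file.
sample = ["olá\to l a", "projeto\tp ɾ o ʒ e t ʊ", ""]
lexicon = parse_lexicon(sample)
print(lexicon["projeto"])
```

With a real file, the same function could be fed an open file handle, since it only iterates over lines.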

Attention Model

Both a plain encoder-decoder seq2seq model and an attention model can handle the G2P problem. Here we train an attention-based model. The encoder takes a sequence of graphemes and produces a state at each timestep. These encoder states are then used during attention decoding: the decoder attends to the appropriate encoder states (according to its own state) and produces phonemes.
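
To make the attention step more concrete, here is a minimal framework-free sketch of a single decoding step. It assumes dot-product scoring between the decoder state and each encoder state; the model in this project is written in PyTorch and may use a different scoring function, so treat the function names and the scoring choice as illustrative assumptions.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(decoder_state, encoder_states):
    """One attention step: score each encoder state, then mix them.

    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [
        sum(d * h_i for d, h_i in zip(decoder_state, h))
        for h in encoder_states
    ]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: weighted sum of encoder states.
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


# Toy values: three grapheme timesteps, two-dimensional states.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

The weights are what an attention visualization would plot: one row per output phoneme, one column per input grapheme.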

Train

To start training the model, run:

python train.py

You can also use TensorBoard to monitor the training loss:

tensorboard --logdir log --bind_all

Training parameters can be found in config.py.

Inference

To get the pronunciation of a word or sentence:

# PT-BR example
python inference.py --sentence 'olá, vamos testar esse projeto.'
o|l|a| |,| |v|a|m|ʊ|s| |t|e|s|t|a| |e|s|i| |p|ɾ|o|ʒ|e|t|ʊ| |.

You can also visualize the attention weights using --visualize:

# PT-BR example
python inference.py --visualize --sentence 'olá, vamos testar esse projeto.'
o|l|a| |,| |v|a|m|ʊ|s| |t|e|s|t|a| |e|s|i| |p|ɾ|o|ʒ|e|t|ʊ| |.
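
If you want to post-process the output, the sketch below groups the phoneme string into per-word lists. It assumes, based only on the example output above, that symbols are separated by | and that a lone space symbol marks a word boundary; split_phonemes is a hypothetical helper, not part of this project.

```python
def split_phonemes(output):
    """Group a 'p|h|o| |n|e' style string into per-word phoneme lists.

    Assumes '|' separates symbols and a lone space marks a word boundary,
    as inferred from the example output.
    """
    words, current = [], []
    for symbol in output.split("|"):
        if symbol == " ":
            if current:
                words.append(current)
                current = []
        else:
            current.append(symbol)
    if current:
        words.append(current)
    return words


output = "o|l|a| |,| |v|a|m|ʊ|s"
words = split_phonemes(output)
print(words)
```

Punctuation comes back as its own single-symbol group, so it can be filtered out afterwards if only word pronunciations are needed.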

Owner

fluz
