easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Last update: May 24, 2022

Overview

easySpeech

easySpeech is an open source python wrapper for google speech to text api that doesn't require PyAaudio(So you specially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Installation

You can install easySpeech very easily using the following command

pip3 install easySpeech

Usage

Using google speech to text api
By default easySpeech comes with a default api key which you can for testing purposes using the following code.

from easySpeech import speech
a=speech.speech('google')
print(a)

For production purpose use your own key because google can revoke the default api key at any time. Get your own api key from http://www.chromium.org/developers/how-tos/api-keys and use the following code

from easySpeech import speech
a=speech.speech('google',key="your api key")
print(a)

Specifying the duration of speech recognition in seconds(default value is 5 seconds)

from easySpeech import speech
a=speech.speech('google',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('google',duration = 10,freq = 44100)
print(a)

Specifying the language(works only for google speech api and default is english)

from easySpeech import speech
a=speech.speech('google',language="en-US")
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import speech
a=speech.google_audio('recording.wav')
print(a)

Using hugging face transformers(works offline and no need of any kind of api key) For using easySpeech with hugging face transformers use the following code.

from easySpeech import speech
a=speech.speech('ml')
print(a)

Specifying the duration of speech recognition in seconds(default valus is 5 seconds)

from easySpeech import speech
a=speech.speech('ml',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('ml',duration = 10,freq = 44100)
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import ml
a=ml.ml('recording.wav')
print(a)

Recording audio
For recording audio use the following code

from easySpeech import speech
speech.recorder('recording.wav')

For recording audio with a specific frequency use the following code(default is 44100)

from easySpeech import speech
speech.recorder('recording.wav',freq = 50000)

For recording audio for a specific duration use the following code(default is 5s)

from easySpeech import speech
speech.recorder('recording.wav',duration = 50)

How to contribute

Since it is a free software , you can contribute to make it better. New contributors are always welcome, whether you write code, create resources, report bugs, or suggest features.

The easySpeech is written primarily in Python3x

Have a look at the open issues to find a mission that resonates with you.

Contact

Email: [email protected]
If you find any bug make a issue immediately.

License

easySpeech is lisenced under MIT license

MIT License | Copyright (c) 2021 SaptakBhoumik

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software

Releases(v1.0.2)

v1.0.2(Jun 3, 2021)
easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound. What's new

It is now even more easy to use

Minor bug fix

Source code(tar.gz)
Source code(zip)
v1.0.1(Jun 1, 2021)

easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound.
Source code(tar.gz)
Source code(zip)

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU，一个中文文本分类、序列标注工具包，支持中文长文本、短文本的多类、多标签分类任务，支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

186 Dec 24, 2022

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

The PyTorch-Kaldi Speech Recognition Toolkit PyTorch-Kaldi is an open-source repository for developing state-of-the-art DNN/HMM speech recognition sys

2.3k Dec 27, 2022

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

1 Dec 20, 2021

A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

Scriptfab - What is it? A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code

3 Jul 28, 2021

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

2 Jun 19, 2022

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

PocketSphinx 5prealpha This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech r

3.2k Dec 28, 2022

easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Related tags

Overview

easySpeech

Installation

Usage

How to contribute

Contact

License

You might also like...

Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

A Python module made to simplify the usage of Text To Speech and Speech Recognition.

A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

Command Line Text-To-Speech using Google TTS

Releases(v1.0.2)

v1.0.2(Jun 3, 2021)

v1.0.1(Jun 1, 2021)

Owner

Saptak Bhoumik

A Plover python dictionary allowing for consistent symbol input with specification of attachment and capitalisation in one stroke.

An open collection of annotated voices in Japanese language

Official code for Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Code for the paper "Language Models are Unsupervised Multitask Learners"

BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Switch spaces for knowledge graph embeddings

Understand Text Summarization and create your own summarizer in python

PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing

Host your own GPT-3 Discord bot

Library for fast text representation and classification.

This is a general repo that helps you develop fast/effective NLP classifiers using Huggingface

Converts python code into c++ by using OpenAI CODEX.

Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

Codes for processing meeting summarization datasets AMI and ICSI.

Deduplication is the task to combine different representations of the same real world entity.

Tools and data for measuring the popularity & growth of various programming languages.

An open-source NLP library: fast text cleaning and preprocessing.

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Ray-based parallel data preprocessing for NLP and ML.