easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Overview

easySpeech


GitHub issues GitHub forks GitHub stars GitHub license GitHub last commit GitHub contributors Downloads


easySpeech is an open source python wrapper for google speech to text api that doesn't require PyAaudio(So you specially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers

Installation

You can install easySpeech very easily using the following command

pip3 install easySpeech

Usage

  • Using google speech to text api
    By default easySpeech comes with a default api key which you can for testing purposes using the following code.
from easySpeech import speech
a=speech.speech('google')
print(a)

For production purpose use your own key because google can revoke the default api key at any time. Get your own api key from http://www.chromium.org/developers/how-tos/api-keys and use the following code

from easySpeech import speech
a=speech.speech('google',key="your api key")
print(a)

Specifying the duration of speech recognition in seconds(default value is 5 seconds)

from easySpeech import speech
a=speech.speech('google',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('google',duration = 10,freq = 44100)
print(a)

Specifying the language(works only for google speech api and default is english)

from easySpeech import speech
a=speech.speech('google',language="en-US")
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import speech
a=speech.google_audio('recording.wav')
print(a)
  • Using hugging face transformers(works offline and no need of any kind of api key) For using easySpeech with hugging face transformers use the following code.
from easySpeech import speech
a=speech.speech('ml')
print(a)

Specifying the duration of speech recognition in seconds(default valus is 5 seconds)

from easySpeech import speech
a=speech.speech('ml',duration = 10)
print(a)

Specifying the sample frequency(default is 44100)

from easySpeech import speech
a=speech.speech('ml',duration = 10,freq = 44100)
print(a)

Converting an audio file to text(Currently it supports only wav file)

from easySpeech import ml
a=ml.ml('recording.wav')
print(a)
  • Recording audio
    For recording audio use the following code
from easySpeech import speech
speech.recorder('recording.wav')

For recording audio with a specific frequency use the following code(default is 44100)

from easySpeech import speech
speech.recorder('recording.wav',freq = 50000)

For recording audio for a specific duration use the following code(default is 5s)

from easySpeech import speech
speech.recorder('recording.wav',duration = 50)

How to contribute

Since it is a free software , you can contribute to make it better. New contributors are always welcome, whether you write code, create resources, report bugs, or suggest features.

The easySpeech is written primarily in Python3x

Have a look at the open issues to find a mission that resonates with you.


Contact

Email: [email protected]
If you find any bug make a issue immediately.

License

easySpeech is lisenced under MIT license

MIT License | Copyright (c) 2021 SaptakBhoumik

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software
You might also like...
Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

Pytorch-NLU,一个中文文本分类、序列标注工具包,支持中文长文本、短文本的多类、多标签分类任务,支持中文命名实体识别、词性标注、分词等序列标注任务。 Ptorch NLU, a Chinese text classification and sequence annotation toolkit, supports multi class and multi label classification tasks of Chinese long text and short text, and supports sequence annotation tasks such as Chinese named entity recognition, part of speech tagging and word segmentation.

A Python module made to simplify the usage of Text To Speech and Speech Recognition.
A Python module made to simplify the usage of Text To Speech and Speech Recognition.

Nav Module The solution for voice related stuff in Python Nav is a Python module which simplifies voice related stuff in Python. Just import the Modul

A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself
A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code or write code yourself

Scriptfab - What is it? A python script to prefab your scripts/text files, and re create them with ease and not have to open your browser to copy code

A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk.

Simple-Vosk A Python wrapper for simple offline real-time dictation (speech-to-text) and speaker-recognition using Vosk. Check out the official Vosk G

PocketSphinx is a lightweight speech recognition engine, specifically tuned for handheld and mobile devices, though it works equally well on the desktop

PocketSphinx 5prealpha This is PocketSphinx, one of Carnegie Mellon University's open source large vocabulary, speaker-independent continuous speech r

Code for ACL 2022 main conference paper "STEMM: Self-learning with Speech-text Manifold Mixup for Speech Translation".

STEMM: Self-learning with Speech-Text Manifold Mixup for Speech Translation This is a PyTorch implementation for the ACL 2022 main conference paper ST

Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech

epub2audiobook Creating an Audiobook (mp3 file) using a Ebook (epub) using BeautifulSoup and Google Text to Speech Input examples qual a pasta do seu

Command Line Text-To-Speech using Google TTS
Command Line Text-To-Speech using Google TTS

cli-tts Thanks to gTTS by @pndurette! This is an interactive command line text-to-speech tool using Google TTS. Just type text and the voice will be p

Releases(v1.0.2)
  • v1.0.2(Jun 3, 2021)

    easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound. What's new

    1. It is now even more easy to use
    2. Minor bug fix
    Source code(tar.gz)
    Source code(zip)
  • v1.0.1(Jun 1, 2021)

    easySpeech is an open-source Python wrapper for google speech to text API that doesn't require PyAudio(So you especially windows user don't have to deal with the errors while installing PyAudio) and also works with hugging face transformers. You can also use it to record sound.

    Source code(tar.gz)
    Source code(zip)
A Plover python dictionary allowing for consistent symbol input with specification of attachment and capitalisation in one stroke.

Emily's Symbol Dictionary Design This dictionary was created with the following goals in mind: Have a consistent method to type (pretty much) every sy

Emily 68 Jan 07, 2023
An open collection of annotated voices in Japanese language

声庭 (Koniwa): オープンな日本語音声とアノテーションのコレクション Koniwa (声庭): An open collection of annotated voices in Japanese language 概要 Koniwa(声庭)は利用・修正・再配布が自由でオープンな音声とアノテ

Koniwa project 32 Dec 14, 2022
Official code for Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset

Official code for our Interspeech 2021 - Spoken ObjectNet: A Bias-Controlled Spoken Caption Dataset [1]*. Visually-grounded spoken language datasets c

Ian Palmer 3 Jan 26, 2022
Code for the paper "Language Models are Unsupervised Multitask Learners"

Status: Archive (code is provided as-is, no updates expected) gpt-2 Code and models from the paper "Language Models are Unsupervised Multitask Learner

OpenAI 16.1k Jan 08, 2023
BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese

Table of contents Introduction Using BARTpho with fairseq Using BARTpho with transformers Notes BARTpho: Pre-trained Sequence-to-Sequence Models for V

VinAI Research 58 Dec 23, 2022
Switch spaces for knowledge graph embeddings

SwisE Switch spaces for knowledge graph embeddings. Requirements: python3 pytorch numpy tqdm Reproduce the results To reproduce the reported results,

Shuai Zhang 4 Dec 01, 2021
Understand Text Summarization and create your own summarizer in python

Automatic summarization is the process of shortening a text document with software, in order to create a summary with the major points of the original document. Technologies that can make a coherent

Sreekanth M 1 Oct 18, 2022
PhoNLP: A BERT-based multi-task learning toolkit for part-of-speech tagging, named entity recognition and dependency parsing

PhoNLP is a multi-task learning model for joint part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing. Experiments on Vietnamese benchmark datasets show that PhoNLP prod

VinAI Research 109 Dec 02, 2022
Host your own GPT-3 Discord bot

GPT3 Discord Bot Host your own GPT-3 Discord bot i'd host and make the bot invitable myself, however GPT3 terms of service prohibit public use of GPT3

[something hillarious here] 8 Jan 07, 2023
Library for fast text representation and classification.

fastText fastText is a library for efficient learning of word representations and sentence classification. Table of contents Resources Models Suppleme

Facebook Research 24.1k Jan 05, 2023
This is a general repo that helps you develop fast/effective NLP classifiers using Huggingface

NLP Classifier Introduction This project trains a bert model on any NLP classifcation model. And uses the model in make predictions on new data using

Abdullah Tarek 3 Mar 11, 2022
Converts python code into c++ by using OpenAI CODEX.

🦾 codex_py2cpp 🤖 OpenAI Codex Python to C++ Code Generator Your Python Code is too slow? 🐌 You want to speed it up but forgot how to code in C++? ⌨

Alexander 423 Jan 01, 2023
Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

Espresso Espresso is an open-source, modular, extensible end-to-end neural automatic speech recognition (ASR) toolkit based on the deep learning libra

Yiming Wang 919 Jan 03, 2023
PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation

StyleSpeech - PyTorch Implementation PyTorch Implementation of Meta-StyleSpeech : Multi-Speaker Adaptive Text-to-Speech Generation. Status (2021.06.09

Keon Lee 142 Jan 06, 2023
Codes for processing meeting summarization datasets AMI and ICSI.

Meeting Summarization Dataset Meeting plays an essential part in our daily life, which allows us to share information and collaborate with others. Wit

xcfeng 39 Dec 14, 2022
Deduplication is the task to combine different representations of the same real world entity.

Deduplication is the task to combine different representations of the same real world entity. This package implements deduplication using active learning. Active learning allows for rapid training wi

63 Nov 17, 2022
Tools and data for measuring the popularity & growth of various programming languages.

growth-data Tools and data for measuring the popularity & growth of various programming languages. Install the dependencies $ pip install -r requireme

3 Jan 06, 2022
An open-source NLP library: fast text cleaning and preprocessing.

An open-source NLP library: fast text cleaning and preprocessing

Iaroslav 21 Mar 18, 2022
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language mod

20.5k Jan 08, 2023
Ray-based parallel data preprocessing for NLP and ML.

Wrangl Ray-based parallel data preprocessing for NLP and ML. pip install wrangl # for latest pip install git+https://github.com/vzhong/wrangl See exa

Victor Zhong 33 Dec 27, 2022