Speech Recognition Database Management with Python

Last update: Feb 02, 2022

Overview

Speech Recognition Database Management

The main aim of this project is to take the user's voice as input and convert it into text.

Libraries Used Inside the Project

We used Python's SpeechRecognition module to accomplish this. It works together with modules such as PyAudio, which lets us record and play audio.

We also used the MySQL Connector module to connect our Python program to our MySQL database.
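As an illustration of the connection step, here is a minimal sketch that assumes the mysql-connector-python package. The environment variable names (MYSQL_HOST and so on), the default values, and both helper functions are hypothetical and are not taken from the project.

```python
import os


def connection_settings(env=None):
    """Collect MySQL connection settings, falling back to local defaults."""
    # The variable names and defaults below are assumptions for this sketch.
    env = os.environ if env is None else env
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": env.get("MYSQL_PASSWORD", ""),
        "database": env.get("MYSQL_DATABASE", "test"),
    }


def open_connection(settings):
    """Open a connection using MySQL Connector/Python."""
    # Imported here so the settings helper works without the driver installed.
    import mysql.connector  # third-party: pip install mysql-connector-python

    return mysql.connector.connect(**settings)
```

With a MySQL server running, open_connection(connection_settings()) returns a connection whose cursor() can execute statements such as SHOW TABLES.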

Libraries Created During the Project

We created a library named MySQLvoice, which helps our Artificial Intelligence manage and organise databases.

The main aim of this module is to pick out the keywords from the given input. Once the keywords are selected, our Artificial Intelligence starts working on the database and provides the required results.
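The keyword selection idea can be sketched roughly as follows. This is not the MySQLvoice implementation: the keyword table, the SQL it maps to, and the function names are all hypothetical, chosen only to show how spoken text might be reduced to a database action.

```python
# Hypothetical keyword table: the project's real keywords are not shown here.
KEYWORD_TO_SQL = {
    "databases": "SHOW DATABASES",
    "tables": "SHOW TABLES",
}


def select_keywords(spoken_text, known_keywords):
    """Return the known keywords found in the spoken text, in spoken order."""
    # Lowercase each word and drop trailing punctuation before matching.
    cleaned = (word.strip(".,?!") for word in spoken_text.lower().split())
    return [word for word in cleaned if word in known_keywords]


def to_sql(spoken_text):
    """Map the first recognized keyword to its SQL statement, or None."""
    found = select_keywords(spoken_text, KEYWORD_TO_SQL)
    return KEYWORD_TO_SQL[found[0]] if found else None
```

Matching whole words against a fixed table keeps the sketch predictable: anything the table does not know yields None instead of a guessed query.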

How We Converted the Voice into Text

To get the user's voice input, we used Python's pre-built SpeechRecognition library. We took the voice input from the system's microphone and stored it in a variable. We then used the library's recognize function to work out what the user said and stored the result in another variable.

After recognition, we printed the input as text to check that our program transcribed it correctly.
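The steps above might look like the sketch below. It assumes the SpeechRecognition and PyAudio packages and uses recognize_google, one of the library's several recognize_* methods; the text does not say which one the project calls, so treat that choice as an assumption. The function names are hypothetical.

```python
def to_text(recognizer, audio):
    """Ask the recognizer for a transcript and tidy it for printing."""
    # recognize_google is an assumed choice among the recognize_* methods.
    return recognizer.recognize_google(audio).strip().lower()


def listen_and_transcribe():
    """Record one phrase from the default microphone and return it as text."""
    import speech_recognition as sr  # third-party; Microphone needs PyAudio

    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        # Sample the background briefly so quiet speech is not missed.
        recognizer.adjust_for_ambient_noise(source, duration=0.5)
        audio = recognizer.listen(source)
    try:
        return to_text(recognizer, audio)
    except sr.UnknownValueError:
        return None  # the speech could not be understood
    except sr.RequestError as err:
        raise RuntimeError("speech service unavailable") from err
```

Calling print(listen_and_transcribe()) on a machine with a working microphone shows the recognized phrase, or None if the speech was unintelligible.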

Description

With the MySQLvoice library, users don't need to know SQL to change or query their database. We have defined eight new keywords, as follows:

How to Install and Run the Project

Once the MySQLvoice pip package is uploaded to PyPI, you can run "pip install MySQLvoice" in your terminal to install it on your system. After installing it, you can import it in your Python programs.

How to Use the Project

This project is limited to MySQL database operations, but it can be used in any region of the world for handling databases because it is easy to extend to regional languages. We currently work mostly in English, but it can be adapted to other spoken languages such as Kannada, Korean, Japanese, Hindi, and Gujarati. It will help non-technical people handle databases with ease.
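One way regional-language support could be wired in is through the language argument that SpeechRecognition's recognize_google method accepts. The sketch below is an assumption about how that might be done, not the project's code: the tag table and both function names are hypothetical, and the tags should be checked against the recognizer in use.

```python
# Assumed BCP-47 language tags; verify them against the recognizer you use.
LANGUAGE_TAGS = {
    "english": "en-US",
    "hindi": "hi-IN",
    "kannada": "kn-IN",
    "gujarati": "gu-IN",
    "korean": "ko-KR",
    "japanese": "ja-JP",
}


def language_tag(name, default="en-US"):
    """Look up the tag for a language name, falling back to English."""
    return LANGUAGE_TAGS.get(name.strip().lower(), default)


def transcribe_in(recognizer, audio, language_name):
    """Transcribe audio in the requested language."""
    # recognize_google takes a language tag through its language parameter.
    return recognizer.recognize_google(audio, language=language_tag(language_name))
```

The keywords that drive the database actions would also need translating for each language; this sketch only covers the transcription side.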

Advantages

It supports multitasking.
Users don’t need to code.
It can be used in any industry sector that uses databases.
It saves the user's time, which improves workflow and reduces cost.

Disadvantages

It may stop working if the hardware fails.
Training the speech recognition on data may take time.
Background noise can degrade the quality of the voice input.
Improper pronunciation can affect the voice input.

Future Plans

We hope to add regional languages (such as Kannada, Gujarati, and Marathi), which will help non-technical people handle their databases.

We plan to bring this developer tool to small-scale industries to enhance their productivity through time-saving database handling.

Conclusion

This project will help many industries and businesses, as they will be able to manage and organize their databases with their voice. It will also reduce their workload considerably.

This project is just a small example of Artificial Intelligence applied to database management.

This project was jointly created by:

Owner

Abhishek Kumar Jha
