Asr abc - Automatic speech recognition(ASR),中文语音识别

Overview

语音识别的简单示例,主要在课堂演示使用

创建python虚拟环境

在linux 和macos 上验证通过

# 如果已经有pyhon3.6 环境,跳过该步骤,使用现有环境也可以
virtualenv ~/env/asr_abc --python=python3.8
. ~/env/asr_abc/bin/activate

安装本项目

python setup.py install
or 
pip install .

识别wav音频

Note: 输入音频采样率必须是16k,如果待识别音频不是16k,可以采用以下命令重采样为16k.

ffmpeg -i ks1_48k.acc -ar 16000 ks1_16k.wav
python decode.py
# 或debug 模式
python decode.py -d

#预期输出:
2021-12-24 17:08:31,736 INFO [decode.py:91] All files seem exist.
2021-12-24 17:08:31,736 INFO [decode.py:96] Start loading model.
2021-12-24 17:08:41,321 INFO [decode.py:113] Start loading dict.
2021-12-24 17:08:41,336 INFO [decode.py:119] Start recognize data/wavs/BAC009S0764W0143.wav.
2021-12-24 17:08:41,527 INFO [decode.py:134] Result: 在市场整体从高速增长进入中高速增长区间的同时
2021-12-24 17:08:41,527 INFO [decode.py:135] done.

# 也可以指定输入音频
python decode.py --input-wav=data/wavs/ks1_16k.wav
或者
python decode.py -i=data/wavs/ks1_16k.wav  # ks is short for "Kantanzhe Song"
# 预期输出:
2021-12-27 19:16:02,911 INFO [decode.py:91] All files seem exist.
2021-12-27 19:16:02,911 INFO [decode.py:96] Start loading model.
2021-12-27 19:16:08,405 INFO [decode.py:113] Start loading dict.
2021-12-27 19:16:08,409 INFO [decode.py:119] Start recognize data/wavs/ks1_16k.wav.
2021-12-27 19:16:08,449 INFO [decode.py:137] Result: 我们有火焰般的热情
2021-12-27 19:16:08,450 INFO [decode.py:138] done.

相关项目链接:

https://github.com/k2-fsa/icefall
https://github.com/k2-fsa/k2
https://github.com/lhotse-speech/lhotse

手动模型下载

如果上述python decode.py 已识别出预期结果,说明模型已自动从下载源1成功下载模型,无需关注以下内容。

下载源1:(decode.py 代码会自动访问这个源下载): https://huggingface.co/GuoLiyong/cn_conformer_encoder_aishell/tree/main/data/lang_char

下载源2: 百度网盘

链接: https://pan.baidu.com/s/17tPOJM_Sm49q1kZrE3jfUQ
提取码: qa4p

对于访问下载源1有困难或者访问速度过慢的同学,可以手动从百度网盘下载. 下载完毕后按以下文件结构放置下载所得的"tokens.txt"和"conformer_encoder.pt"两个文件:

.
|-- README.md
|-- build
|   `-- bdist.linux-x86_64
|-- conformer.py
|-- data
|   |-- lang_char
|   |   |-- tokens.txt
|   `-- wavs
|       |-- BAC009S0764W0143.wav
|       |-- README.md
|       `-- transcript
|-- decode.py
|-- exp
|   `-- conformer_encoder.pt
|-- requirements.txt
|-- setup.py
`-- utils.py
Owner
LIyong.Guo
LIyong.Guo
An automated program that helps customers of Pizza Palour place their pizza orders

PIzza_Order_Assistant Introduction An automated program that helps customers of Pizza Palour place their pizza orders. The program uses voice commands

Tindi Sommers 1 Dec 26, 2021
Source code for CsiNet and CRNet using Fully Connected Layer-Shared feedback architecture.

FCS-applications Source code for CsiNet and CRNet using the Fully Connected Layer-Shared feedback architecture. Introduction This repository contains

Boyuan Zhang 4 Oct 07, 2022
Search Git commits in natural language

NaLCoS - NAtural Language COmmit Search Search commit messages in your repository in natural language. NaLCoS (NAtural Language COmmit Search) is a co

Pushkar Patel 50 Mar 22, 2022
Seonghwan Kim 24 Sep 11, 2022
Stuff related to Ben Eater's 8bit breadboard computer

8bit breadboard computer simulator This is an assembler + simulator/emulator of Ben Eater's 8bit breadboard computer. For a version with its RAM upgra

Marijn van Vliet 29 Dec 29, 2022
原神抽卡记录数据集-Genshin Impact gacha data

提要 持续收集原神抽卡记录中 可以使用抽卡记录导出工具导出抽卡记录的json,将json文件发送至[email protected],我会在清除个人信息后

117 Dec 27, 2022
Winner system (DAMO-NLP) of SemEval 2022 MultiCoNER shared task over 10 out of 13 tracks.

KB-NER: a Knowledge-based System for Multilingual Complex Named Entity Recognition The code is for the winner system (DAMO-NLP) of SemEval 2022 MultiC

116 Dec 27, 2022
Python package for performing Entity and Text Matching using Deep Learning.

DeepMatcher DeepMatcher is a Python package for performing entity and text matching using deep learning. It provides built-in neural networks and util

461 Dec 28, 2022
The following links explain a bit the idea of semantic search and how search mechanisms work by doing retrieve and rerank

Main Idea The following links explain a bit the idea of semantic search and how search mechanisms work by doing retrieve and rerank Semantic Search Re

Sergio Arnaud Gomez 2 Jan 28, 2022
Bot to connect a real Telegram user, simulating responses with OpenAI's davinci GPT-3 model.

AI-BOT Bot to connect a real Telegram user, simulating responses with OpenAI's davinci GPT-3 model.

Thempra 2 Dec 21, 2022
EdiTTS: Score-based Editing for Controllable Text-to-Speech

Official implementation of EdiTTS: Score-based Editing for Controllable Text-to-Speech

Neosapience 99 Jan 02, 2023
ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files.

Antlr Project 13.6k Jan 05, 2023
ChatBotProyect - This is an unfinished project about a simple chatbot.

chatBotProyect This is an unfinished project about a simple chatbot. (union_todo.ipynb) Reminders for the project: Find why one of the vectorizers fai

Tomás 0 Jul 24, 2022
👄 The most accurate natural language detection library for Python, suitable for long and short text alike

1. What does this library do? Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a prepr

Peter M. Stahl 334 Dec 30, 2022
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

The Easy-to-use Dialogue Response Selection Toolkit for Researchers

GMFTBY 32 Nov 13, 2022
A python framework to transform natural language questions to queries in a database query language.

__ _ _ _ ___ _ __ _ _ / _` | | | |/ _ \ '_ \| | | | | (_| | |_| | __/ |_) | |_| | \__, |\__,_|\___| .__/ \__, | |_| |_| |___/

Machinalis 1.2k Dec 18, 2022
Twitter-Sentiment-Analysis - Analysis of twitter posts' positive and negative score.

Twitter-Sentiment-Analysis The hands-on project is in Python 3 Programming class offered by University of Michigan via Coursera. The task is to build

Eszter Pai 1 Jan 03, 2022
fastNLP: A Modularized and Extensible NLP Framework. Currently still in incubation.

fastNLP fastNLP是一款轻量级的自然语言处理(NLP)工具包,目标是快速实现NLP任务以及构建复杂模型。 fastNLP具有如下的特性: 统一的Tabular式数据容器,简化数据预处理过程; 内置多种数据集的Loader和Pipe,省去预处理代码; 各种方便的NLP工具,例如Embedd

fastNLP 2.8k Jan 01, 2023
KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정한 코드입니다.

KoBERTopic 모델 소개 KoBERTopic은 BERTopic을 한국어 데이터에 적용할 수 있도록 토크나이저와 BERT를 수정했습니다. 기존 BERTopic : https://github.com/MaartenGr/BERTopic/tree/05a6790b21009d

Won Joon Yoo 26 Jan 03, 2023
A notebook that shows how to import the IITB English-Hindi Parallel Corpus from the HuggingFace datasets repository

We provide a notebook that shows how to import the IITB English-Hindi Parallel Corpus from the HuggingFace datasets repository. The notebook also shows how to segment the corpus using BPE tokenizatio

Computation for Indian Language Technology (CFILT) 9 Oct 13, 2022