This repo contains simple to use, pretrained/training-less models for speaker diarization.

Last update: Jan 20, 2022

Related tags

Text Data & NLP pydiar

Overview

PyDiar

This repo contains simple to use, pretrained/training-less models for speaker diarization.

Supported Models

Binary Key Speaker Modeling

Based on pyBK by Jose Patino which implements the diarization system from "The EURECOM submission to the first DIHARD Challenge" by Patino, Jose and Delgado, Héctor and Evans, Nicholas

If you have any other models you would like to see added, please open an issue.

Usage

This library seeks to provide a very basic interface. To use the Binary Key model on a file, do something like this:

import numpy as np
from pydiar.models import BinaryKeyDiarizationModel, Segment
from pydiar.util.misc import optimize_segments
from pydub import AudioSegment

INPUT_FILE = "test.wav"

sample_rate = 32000
audio = AudioSegment.from_wav(test.wav)
audio = audio.set_frame_rate(sample_rate)
audio = audio.set_channels(1)

diarization_model = BinaryKeyDiarizationModel()
segments = diarization_model.diarize(
    sample_rate, np.array(audio.get_array_of_samples())
)
optimized_segments = optimize_segments(segments)

Now optimized_segments contains a list of segments with their start, length and speaker id

Example

A simple script which reads an audio file, diarizes it and transcribes it into the WebVTT format can be found in examples/generate_webvtt.py. To use it, download a vosk model from https://alphacephei.com/vosk/models and then run the script using

poetry install
poetry run python -m examples.generate_webvtt -i PATH/TO/INPUT.wav -m PATH/TO/VOSK_MODEL

This repo contains simple to use, pretrained/training-less models for speaker diarization.

Related tags

Overview

PyDiar

Supported Models

Usage

Example

Owner

A Pytorch implementation of "Splitter: Learning Node Representations that Capture Multiple Social Contexts" (WWW 2019).

GAP-text2SQL: Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

PyJPBoatRace: Python-based Japanese boatrace tools 🚤

A desktop GUI providing an audio interface for GPT3.

The code from the whylogs workshop in DataTalks.Club on 29 March 2022

This repository contains examples of Task-Informed Meta-Learning

Training code for Korean multi-class sentiment analysis

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering.

This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini!

Fastseq 基于ONNXRUNTIME的文本生成加速框架

Datasets of Automatic Keyphrase Extraction

Meta learning algorithms to train cross-lingual NLI (multi-task) models

Translate - a PyTorch Language Library

This repository contains all the source code that is needed for the project : An Efficient Pipeline For Bloom’s Taxonomy Using Natural Language Processing and Deep Learning

UA-GEC: Grammatical Error Correction and Fluency Corpus for the Ukrainian Language

Text editor on python to convert english text to malayalam(Romanization/Transiteration).

Reformer, the efficient Transformer, in Pytorch

A large-scale (194k), Multiple-Choice Question Answering (MCQA) dataset designed to address realworld medical entrance exam questions.

Text-to-Speech for Belarusian language

Top2Vec is an algorithm for topic modeling and semantic search.