Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

Last update: Jan 09, 2023

Overview

UniSpeech

The UniSpeech family:

UniSpeech (ICML 2021): Unified Pre-training for Self-Supervised Learning and Supervised Learning for ASR

UniSpeech-SAT (ICASSP 2022 Submission): Universal Speech Representation Learning with Speaker Aware Pre-Training

Pre-trained models

We strongly recommend our UniSpeech-SAT model for speaker-related tasks, since it performs strongly on a range of speaker-related benchmarks.

Model	Dataset	Download
UniSpeech Base	1500 hrs CommonVoice	download
UniSpeech Large	1500 hrs CommonVoice	download
UniSpeech-SAT Base	960 hrs LibriSpeech	download
UniSpeech-SAT Base+	60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli	download
UniSpeech-SAT Large	60k hrs Libri-Light + 10k hrs GigaSpeech + 24k hrs VoxPopuli	download

License

This project is licensed under the license found in the LICENSE file in the root directory of this source tree. Portions of the source code are based on the FAIRSEQ project.

Microsoft Open Source Code of Conduct

Contact Information

For help or issues using UniSpeech models, please submit a GitHub issue.

For other communications related to UniSpeech, please contact Yu Wu ([email protected]).

Owner

Microsoft
