Unofficial TensorFlow implementation of the Keyword Spotting Transformer model

Last update: May 11, 2022

Overview

Keyword Spotting Transformer

This is an unofficial TensorFlow implementation of the Keyword Spotting Transformer model. The model is trained on the 35-word Speech Commands dataset (v0.02).

Paper: Keyword Transformer: A Self-Attention Model for Keyword Spotting

Model architecture

Download the dataset

To download the dataset and extract it into ./data, run the following commands:

wget https://storage.googleapis.com/download.tensorflow.org/data/speech_commands_v0.02.tar.gz
mkdir data
mv ./speech_commands_v0.02.tar.gz ./data
cd ./data
tar -xf ./speech_commands_v0.02.tar.gz
cd ../
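
After extraction, it can help to confirm that the keyword folders are where the training script will look for them. The sketch below is not part of this repository; the helper name `list_keyword_dirs` is hypothetical. It assumes the archive layout of Speech Commands v0.02, where each keyword has its own folder and `_background_noise_` holds noise recordings rather than a keyword.

```python
from pathlib import Path


def list_keyword_dirs(data_dir):
    """Return the sorted names of the keyword folders under data_dir.

    Folders whose names start with an underscore (such as
    _background_noise_) are skipped because they do not hold
    keyword recordings. Plain files such as LICENSE are ignored.
    """
    root = Path(data_dir)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not entry.name.startswith("_")
    )
```

Calling `len(list_keyword_dirs("./data"))` on a complete extraction should give 35, matching the number of words the model is trained on.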

Set up a virtual environment

virtualenv -p python3 venv
source ./venv/bin/activate

Install dependencies

pip install -r requirements.txt

Training the model

To train the model, run the following command, replacing each ${...} placeholder with your own value:

python3 train.py --data_dir ${Path to data directory} \
                 --logdir ${Path to log directory} \
                 --num_layers ${Number of sequential encoder layers} \
                 --d_model ${Dimension of the encoder layers} \
                 --num_heads ${Number of heads in multi head attention layer} \
                 --mlp_dim ${Dimension of mlp layers} \
                 --lr ${Learning rate} \
                 --weight_decay ${Weight decay} \
                 --batch_size ${Batch size} \
                 --epochs ${Number of epochs} \
                 --save_dir ${Directory to save the model weights}
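
For readers who want to see how these flags could be consumed, here is a minimal sketch of an argument parser that accepts them. The flag names come from the command above; the types and the choice to make every flag required are assumptions, and the real definitions and defaults live in train.py.

```python
import argparse


def build_parser():
    """Build a parser for the flags shown in the training command.

    The types and the required=True settings are assumptions made
    for illustration; train.py may define defaults instead.
    """
    parser = argparse.ArgumentParser(description="Train the keyword spotting model")
    # Paths
    parser.add_argument("--data_dir", type=str, required=True)
    parser.add_argument("--logdir", type=str, required=True)
    parser.add_argument("--save_dir", type=str, required=True)
    # Transformer encoder shape
    parser.add_argument("--num_layers", type=int, required=True)
    parser.add_argument("--d_model", type=int, required=True)
    parser.add_argument("--num_heads", type=int, required=True)
    parser.add_argument("--mlp_dim", type=int, required=True)
    # Optimisation settings
    parser.add_argument("--lr", type=float, required=True)
    parser.add_argument("--weight_decay", type=float, required=True)
    parser.add_argument("--batch_size", type=int, required=True)
    parser.add_argument("--epochs", type=int, required=True)
    return parser
```

Grouping the flags this way shows their roles: three paths, four values that set the size of the encoder, and four that control optimisation.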

To track training metrics, launch TensorBoard with the same log directory:

tensorboard --logdir ${Path to log directory}

Predicting the keyword of an audio file

To predict the keyword spoken in an audio file, run:

python3 test.py --model_dir ${Saved model directory} \
                --file_path ${Audio file}
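
Speech Commands clips are mono, 16 kHz WAV files of at most one second, so a file in a different format may not behave the way the training data did. The sketch below uses only the standard library to inspect a file before passing it to test.py. Both helper names are hypothetical, and whether test.py resamples or pads its input is not stated here, so treat the check as a precaution rather than a requirement.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
        duration = wav_file.getnframes() / sample_rate
    return sample_rate, channels, duration


def looks_like_speech_commands_clip(path, expected_rate=16000, max_seconds=1.0):
    """Check that a file matches the format of the dataset clips.

    The defaults mirror the dataset (16 kHz, mono, at most one second).
    That test.py needs this exact format is an assumption.
    """
    sample_rate, channels, duration = describe_wav(path)
    return sample_rate == expected_rate and channels == 1 and duration <= max_seconds
```

If the check returns False, converting the recording to 16 kHz mono and trimming it to one second is a reasonable first step before running the prediction.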

Owner

Intelligent Machines Limited
