Baseline for the Spoofing-aware Speaker Verification Challenge 2022

Overview

Introduction

This repository contains several materials that supplements the Spoofing-Aware Speaker Verification (SASV) Challenge 2022 including:

  • calculating metrics;
  • extracting speaker/spoofing embeddings from pre-trained models;
  • training/evaluating Baseline2 in the evaluation plan.

More information can be found in the webpage and the evaluation plan

Prerequisites

Load ECAPA-TDNN & AASIST repositories

git submodule init
git submodule update

Install requirements

pip install -r requirements.txt

Data preparation

The ASVspoof2019 LA dataset [1] can be downloaded using the scipt in AASIST [2] repository

python ./aasist/download_dataset.py

Speaker & spoofing embedding extraction

Speaker embeddings and spoofing embeddings can be extracted using below script. Extracted embeddings will be saved in ./embeddings.

  • Speaker embeddings are extracted using the ECAPA-TDNN [3].
  • Spoofing embeddings are extracted using the AASIST [2].
  • We also prepared extracted embeddings.
    • To use prepared emebddings, git-lfs is required. Please refer to https://git-lfs.github.com for further instruction. After installing git-lfs use following command to download the embeddings.
      git-lfs install
      git-lfs pull
      
python save_embeddings.py

Baseline 2 Training

Run below script to train Baseline2 in the evaluation plan.

  • It will reproduce Baseline2 described in the Evaluation plan.
python main.py --config ./configs/baseline2.conf

Developing own models

  • Currently adding...

Adding custom DNN architecture

  1. create new file under ./models/.
  2. create a new configuration file under ./configs
  3. in the new configuration, modify model_arch and add required arguments in model_config.
  4. run python main.py --config {USER_CONFIG_FILE}

Using only metrics

Use get_all_EERs in metrics.py to calculate all three EERs.

  • prediction scores and keys should be passed on using
    • protocols/ASVspoof2019.LA.asv.dev.gi.trl.txt or
    • protocols/ASVspoof2019.LA.asv.eval.gi.trl.txt

References

[1] ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

@article{wang2020asvspoof,
  title={ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech},
  author={Wang, Xin and Yamagishi, Junichi and Todisco, Massimiliano and Delgado, H{\'e}ctor and Nautsch, Andreas and Evans, Nicholas and Sahidullah, Md and Vestman, Ville and Kinnunen, Tomi and Lee, Kong Aik and others},
  journal={Computer Speech \& Language},
  volume={64},
  pages={101114},
  year={2020},
  publisher={Elsevier}
}

[2] AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks

@inproceedings{Jung2022AASIST,
  author={Jung, Jee-weon and Heo, Hee-Soo and Tak, Hemlata and Shim, Hye-jin and Chung, Joon Son and Lee, Bong-Jin and Yu, Ha-Jin and Evans, Nicholas},
  booktitle={Proc. ICASSP}, 
  title={AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks}, 
  year={2022}

[3] ECAPA-TDNN: Emphasized Channel Attention, propagation and aggregation in TDNN based speaker verification

@inproceedings{desplanques2020ecapa,
  title={{ECAPA-TDNN: Emphasized Channel Attention, propagation and aggregation in TDNN based speaker verification}},
  author={Desplanques, Brecht and Thienpondt, Jenthe and Demuynck, Kris},
  booktitle={Proc. Interspeech 2020},
  pages={3830--3834},
  year={2020}
}
You might also like...
Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"

AASIST This repository provides the overall framework for training and evaluating audio anti-spoofing systems proposed in 'AASIST: Audio Anti-Spoofing

Using LSTM to detect spoofing attacks in an Air-Ground network
Using LSTM to detect spoofing attacks in an Air-Ground network

Using LSTM to detect spoofing attacks in an Air-Ground network Specifications IDE: Spider Packages: Tensorflow 2.1.0 Keras NumPy Scikit-learn Matplotl

Flexible-Modal Face Anti-Spoofing: A Benchmark

Flexible-Modal FAS This is the official repository of "Flexible-Modal Face Anti-

Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector
Imposter-detector-2022 - HackED 2022 Team 3IQ - 2022 Imposter Detector

HackED 2022 Team 3IQ - 2022 Imposter Detector By Aneeljyot Alagh, Curtis Kan, Jo

ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge (ManiSkill Challenge), a large-scale learning-from-demonstrations benchmark for object manipulation.

ManiSkill-Learn ManiSkill-Learn is a framework for training agents on SAPIEN Open-Source Manipulation Skill Challenge, a large-scale learning-from-dem

Contrastive Fact Verification

VitaminC This repository contains the dataset and models for the NAACL 2021 paper: Get Your Vitamin C! Robust Fact Verification with Contrastive Evide

Codes for ACL-IJCNLP 2021 Paper
Codes for ACL-IJCNLP 2021 Paper "Zero-shot Fact Verification by Claim Generation"

Zero-shot-Fact-Verification-by-Claim-Generation This repository contains code and models for the paper: Zero-shot Fact Verification by Claim Generatio

The VeriNet toolkit for verification of neural networks

VeriNet The VeriNet toolkit is a state-of-the-art sound and complete symbolic interval propagation based toolkit for verification of neural networks.

Pocsploit is a lightweight, flexible and novel open source poc verification framework
Pocsploit is a lightweight, flexible and novel open source poc verification framework

Pocsploit is a lightweight, flexible and novel open source poc verification framework

Comments
  • About the extracted embeddings.

    About the extracted embeddings.

    When we installed the git-lfs and step to pull the embeddings data, an error like:

    batch response: This repository is over its data quota. Account responsible for LFS bandwidth should purchase more data packs to restore access.
    error: failed to fetch some objects from 'https://github.com/sasv-challenge/SASVC2022_Baseline.git/info/lfs
    

    was appeared.

    What should I do? How can I download the embeddings data?

    opened by ikou-austin 3
  • Reproducing baseline1

    Reproducing baseline1

    Thanks for providing the code for pre-trained models and baseline2. I am reproducing baseline1 based on your description in the evaluation plan, but I got very different results on the development set. I am also curious why the SPF-EER on the development set is much worse than that on the evaluation set in your results. Could you please provide the code for reproducing your baseline1 result? Thank you so much!

    opened by yzyouzhang 3
  • omegaconf.errors.ConfigAttributeError: Missing key

    omegaconf.errors.ConfigAttributeError: Missing key

    I encounter the following error when I run main.py with the Baseline2 configuration.

    omegaconf.errors.ConfigAttributeError: Missing key

    There are in total three keys missing. min_req_mem gradient_clip reload_every_n_epoch

    I fixed these missing keys one by one by setting them to 0 or None. I am curious what are the default values for these. Thank you very much.

    opened by yzyouzhang 3
  • speaker_loss.weight is not in the model.

    speaker_loss.weight is not in the model.

    Thanks for your repo. I have successfully replicated the baseline2 performance. I encounter the following messages when I run python save_embeddings.py. It did not crash the program but I wonder where is the second line printed from since I did not find it. I am also not sure if it will cause potential issues.

    Device: cuda speaker_loss.weight is not in the model. Getting embedgins from set trn...

    Thanks.

    opened by yzyouzhang 1
Releases(v0.0.2)
Official implementation for the paper: Generating Smooth Pose Sequences for Diverse Human Motion Prediction

Generating Smooth Pose Sequences for Diverse Human Motion Prediction This is official implementation for the paper Generating Smooth Pose Sequences fo

Wei Mao 28 Dec 10, 2022
Hack Camera, Microphone, Location, Clipboard With Just a Link. Also, Get Many Details About Victim's Device. And So On...

An Automated Tool to Hack Victim's Camera, Microphone, Location, Clipboard. Has 2 Extra Features. Version 1.1 Update Fixed Some Major Bugs Data Saving

ToxicNoob 36 Jan 07, 2023
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021
Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your personal computer!

Reproducible research and reusable acyclic workflows in Python. Execute code on HPC systems as if you executed them on your machine! Motivation Would

Joeri Hermans 15 Sep 11, 2022
WRENCH: Weak supeRvision bENCHmark

🔧 What is it? Wrench is a benchmark platform containing diverse weak supervision tasks. It also provides a common and easy framework for development

Jieyu Zhang 176 Dec 28, 2022
NeurIPS 2021 Datasets and Benchmarks Track

AP-10K: A Benchmark for Animal Pose Estimation in the Wild Introduction | Updates | Overview | Download | Training Code | Key Questions | License Intr

AP-10K 82 Dec 11, 2022
QI-Q RoboMaster2022 CV Algorithm

QI-Q RoboMaster2022 CV Algorithm

2 Jan 10, 2022
【ACMMM 2021】DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning

DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning (ACMMM 2021) Overview We release the code of the DSANet (Dynamic S

Wenhao Wu 46 Dec 27, 2022
Code release for ICCV 2021 paper "Anticipative Video Transformer"

Anticipative Video Transformer Ranked first in the Action Anticipation task of the CVPR 2021 EPIC-Kitchens Challenge! (entry: AVT-FB-UT) [project page

Facebook Research 123 Dec 13, 2022
Graph parsing approach to structured sentiment analysis.

Fine-grained Sentiment Analysis as Dependency Graph Parsing This repository contains the code and datasets described in following paper: Fine-grained

Jeremy Barnes 36 Dec 12, 2022
Nicely is a real-time Feedback and Intervention Program Depression is a prevalent issue across all age groups, socioeconomic classes, and cultural identities.

Nicely is a real-time Feedback and Intervention Program Depression is a prevalent issue across all age groups, socioeconomic classes, and cultural identities.

1 Jan 16, 2022
PG2Net: Personalized and Group PreferenceGuided Network for Next Place Prediction

PG2Net PG2Net:Personalized and Group Preference Guided Network for Next Place Prediction Datasets Experiment results on two Foursquare check-in datase

Urban Mobility 5 Dec 20, 2022
A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

A tutorial showing how to train, convert, and run TensorFlow Lite object detection models on Android devices, the Raspberry Pi, and more!

Evan 1.3k Jan 02, 2023
The implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021

DynamicNeuralGarments Introduction This repository contains the implemetation of Dynamic Nerual Garments proposed in Siggraph Asia 2021. ./GarmentMoti

42 Dec 27, 2022
Light-SERNet: A lightweight fully convolutional neural network for speech emotion recognition

Light-SERNet This is the Tensorflow 2.x implementation of our paper "Light-SERNet: A lightweight fully convolutional neural network for speech emotion

Arya Aftab 29 Nov 12, 2022
MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition

MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition Paper: MinkLoc++: Lidar and Monocular Image Fusion for Place Recognition accepted fo

64 Dec 18, 2022
Implementation of SwinTransformerV2 in TensorFlow.

SwinTransformerV2-TensorFlow A TensorFlow implementation of SwinTransformerV2 by Microsoft Research Asia, based on their official implementation of Sw

Phan Nguyen 2 May 30, 2022
Implementation of the CVPR 2021 paper "Online Multiple Object Tracking with Cross-Task Synergy"

Online Multiple Object Tracking with Cross-Task Synergy This repository is the implementation of the CVPR 2021 paper "Online Multiple Object Tracking

54 Oct 15, 2022
Efficient Multi Collection Style Transfer Using GAN

Proposed a new model that can make style transfer from single style image, and allow to transfer into multiple different styles in a single model.

Zhaozheng Shen 2 Jan 15, 2022
External Attention Network

Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks paper : https://arxiv.org/abs/2105.02358 Jittor code will come soon

MenghaoGuo 357 Dec 11, 2022