Code for csig audio deepfake detection

Related tags

AudioCSIG_audio
Overview

FMFCC Audio Deepfake Detection Solution

This repo provides an solution for the 多媒体伪造取证大赛. Our solution achieve the 1st in the Audio Deepfake Detection track . The ranking can be seen here

Authors

Institution: Shenzhen Key Laboratory of Media Information Content Security(MICS)

Team name: Forensics_SZU

Username:

Pipeline

EfficientNet-B2 + mel features (不同mel参数的训练两个模型ensemble)

Requirements:

  • 激活已经配置好的环境[建议使用这个]
conda activate audio
  • 或者执行这个重新配置
pip install -r requirements.txt
  • 可能存在部分包漏了安装,可以通过pip install 方式继续安装完整

Submission

  • cd到代码位置:/home/audio5/py_project/CSIG_audio
cd /home/audio5/py_project/CSIG_audio
  • 接着,修改inference.py 文件的第98行中的root_path路劲为自己语音测试集的路劲

1. 预测第一个模型

  • 然后执行
python inference.py
  • 运行结束后,可以在 "./output/results/efficientnet-b2_96_k2e6_hop48_t' 看到results.json文件

2. 预测第二个模型

  • 2.1 修改json_root的路劲:把inference.py的105行的代码注释掉, 然后解除106行的注释,更新了第二个模型预测结果保存的路劲,修改后的代码如下
105        # json_root = f'./output/results/{model_name}_96_k2e6_hop48_t'
106        json_root = f'./output/results/{model_name}_96_k2e9_t'
  • 2.2 修改model_path的路劲(即改变模型权值):把inference.py的114行的代码注释掉, 然后解除115行的注释,更新了模型权值的路劲,修改后的代码如下
114        # model_path = './output/weights/efficientnet-b2_96_hop48_k2/audio6_acc0.9832.pth'
115        model_path = './output/weights/efficientnet-b2_96_k2/audio9_acc0.9752.pth'
  • 2.3 修改mel feature的配置参数:把dataset/preprocess/vggish_params.py中的 35行代码"EXAMPLE_HOP_SECONDS = 0.48" 改为"EXAMPLE_HOP_SECONDS = 0.96" 特别要注意EXAMPLE_HOP_SECONDS这个参数,验证结束后,需要重新修改为0.48, 方便最终决赛运行
  • 修改上述三点后,然后执行, 另外前面的两个修改也需要恢复。
python inference.py
  • 运行结束后,可以在 "./output/results/efficientnet-b2_96_k2e9_t' 看到results.json文件

Ensemble

  • 集成刚刚生成的两个json文件:把inference.py 第88行的"json_ensemble = False"改为"json_ensemble = True"
  • 修改后执行:
python inference.py
  • 运行结束后,可以在 "./output/results/ensemble_01_t' 看到results.json文件,该文件就是我们最终提交的结果json文件

注意:验证结束后,需要恢复上述修改到原始状态,方便最终的决赛

Owner
BokingChen
BokingChen
MelGAN test on audio decoding

Official repository for the paper MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis The original work URL: https://github.com

Jurio 1 Apr 29, 2022
Code for paper 'Audio-Driven Emotional Video Portraits'.

Audio-Driven Emotional Video Portraits [CVPR2021] Xinya Ji, Zhou Hang, Kaisiyuan Wang, Wayne Wu, Chen Change Loy, Xun Cao, Feng Xu [Project] [Paper] G

197 Dec 31, 2022
TwitterMusicBot - A Twitter bot with Spotify integration.

A Twitter Music Bot 🤖 🎵 🎶 I created this project to learn more about APIs, so it only works for student purposes. Initially, delving into the Spoti

Gustavo Oliveira 2 Jan 02, 2022
PianoPlayer - Automatic fingering generator for piano scores

PianoPlayer - Automatic fingering generator for piano scores

Marco Musy 571 Jan 02, 2023
Code for "Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose"

Audio-driven Talking Face Video Generation with Learning-based Personalized Head Pose We provide PyTorch implementations for our arxiv paper "Audio-dr

Ran Yi 497 Jan 09, 2023
Enhanced Audio Player for Discord

Discodo is an enhanced audio player for discord

Mary 42 Oct 05, 2022
Pythonic bindings for FFmpeg's libraries.

PyAV PyAV is a Pythonic binding for the FFmpeg libraries. We aim to provide all of the power and control of the underlying library, but manage the gri

PyAV 1.8k Jan 03, 2023
Graphical interface to control granular sound synthesis.

Granular sound synthesis interface SoundGrain is a graphical interface where users can draw and edit trajectories to control granular sound synthesis

Olivier Bélanger 122 Dec 10, 2022
Royal Music You can play music and video at a time in vc

Royals-Music Royal Music You can play music and video at a time in vc Commands SOON String STRING_SESSION Deployment 🎖 Credits • 🇸ᴏᴍʏᴀ⃝🇯ᴇᴇᴛ • 🇴ғғɪ

2 Nov 23, 2021
Conferencing Speech Challenge

ConferencingSpeech 2021 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech challenge. For more detai

73 Nov 29, 2022
Voice to Text using Raspberry Pi

This module will help to convert your voice (speech) into text using Speech Recognition Library. You can control the devices or you can perform the desired tasks by the word recognition

Raspberry_Pi Pakistan 2 Dec 15, 2021
Python CD-DA ripper preferring accuracy over speed

Whipper Whipper is a Python 3 (3.6+) CD-DA ripper based on the morituri project (CDDA ripper for *nix systems aiming for accuracy over speed). It star

671 Jan 04, 2023
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack

Audiovisual Communications Laboratory 1k Jan 09, 2023
A voice assistant which can be used to interact with your computer and controls your pc operations

Introduction 👨‍💻 It is a voice assistant which can be used to interact with your computer and also you have been seeing it in Iron man movies, but t

Sujith 84 Dec 22, 2022
SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats

SU Music Player — The first open-source PyTgCalls based Pyrogram bot to play music in voice chats Note Neither this, or PyTgCalls are fully

SU Projects 58 Jan 02, 2023
digital audio workstation, instrument and effect plugins, wave editor

digital audio workstation, instrument and effect plugins, wave editor

306 Jan 05, 2023
Converting UGG files from Rode Wireless Go II transmitters (unsompressed recordings) to WAV format

Rode_WirelessGoII_UGG2wav Converting UGG files from Rode Wireless Go II transmitters (uncompressed recordings) to WAV format Story I backuped the .ugg

Ján Mazanec 31 Dec 22, 2022
Synthesia but open source, made in python and free

PyPiano Synthesia but open source, made in python and free Requirements are in requirements.txt If you struggle with installation of pyaudio, run : pi

DaCapo 11 Nov 06, 2022
Python implementation of the Short Term Objective Intelligibility measure

Python implementation of STOI Implementation of the classical and extended Short Term Objective Intelligibility measures Intelligibility measure which

Pariente Manuel 250 Dec 21, 2022
Python Audio Analysis Library: Feature Extraction, Classification, Segmentation and Applications

A Python library for audio feature extraction, classification, segmentation and applications This doc contains general info. Click here for the comple

Theodoros Giannakopoulos 5.1k Jan 02, 2023