L3DAS22 challenge supporting API

Overview

L3DAS22 challenge supporting API

This repository supports the L3DAS22 IEEE ICASSP Grand Challenge and it is aimed at downloading the dataset, pre-processing the sound files and the metadata, training and evaluating the baseline models and validating the final results. We provide easy-to-use instruction to produce the results included in our paper. Moreover, we extensively commented our code for easy customization.

For further information please refer to the challenge website and to the challenge documentation.

Installation

Our code is based on Python 3.7.

To install all required dependencies run:

pip install -r requirements.txt

Follow these instructions to properly create and place the kaggle.json file.

Dataset download

It is possible to download the entire dataset through the script download_dataset.py. This script downloads the data, extracts the archives, merges the 2 parts of task1 train360 files and prepares all folders for the preprocessing stage.

To download run this command:

python download_dataset.py --output_path ./DATASETS --unzip True

This script may take long, especially the unzipping stage.

Alternatively, it is possible to manually download the dataset from Kaggle.

The train360 section of task 1 is split in 2 downloadable files. If you manually download the dataset, you should manually merge the content of the 2 folders. You can use the function download_dataset.merge_train360(). Example:

import download_dataset

train360_path = "path_that_contains_both_train360_parts"
download_dataset.merge_train360(train360_path)

Pre-processing

The file preprocessing.py provides automated routines that load the raw audio waveforms and their correspondent metadata, apply custom pre-processing functions and save numpy arrays (.pkl files) containing the separate predictors and target matrices.

Run these commands to obtain the matrices needed for our baseline models:

python preprocessing.py --task 1 --input_path DATASETS/Task1 --training_set train100 --num_mics 1
python preprocessing.py --task 2 --input_path DATASETS/Task2 --num_mics 1 --frame_len 100

The two tasks of the challenge require different pre-processing.

For Task1 the function returns 2 numpy arrays contatining:

  • Input multichannel audio waveforms (3d noise+speech scenarios) - Shape: [n_data, n_channels, n_samples].
  • Output monoaural audio waveforms (clean speech) - Shape [n_data, 1, n_samples].

For Task2 the function returns 2 numpy arrays contatining:

  • Input multichannel audio spectra (3d acoustic scenarios): Shape: [n_data, n_channels, n_fft_bins, n_time_frames].
  • Output seld matrices containing the class ids of all sounds present in each 100-milliseconds frame alongside with their location coordinates - Shape: [n_data, n_frames, ((n_classes * n_class_overlaps) + (n_classes * n_class_overlaps * n_coordinates))], where n_class_overlaps is the maximum amount of possible simultaneous sounds of the same class (3) and n_coordinates refers to the spatial dimensions (3).

Baseline models

We provide baseline models for both tasks, implemented in PyTorch. For task 1 we use a Beamforming U-Net and for task 2 an augmented variant of the SELDNet architecture. Both models are based on the single-microphone dataset configuration. Moreover, for task 1 we used only Train100 as training set.

To train our baseline models with the default parameters run:

python train_baseline_task1.py
python train_baseline_task2.py

These models will produce the baseline results mentioned in the paper.

GPU is strongly recommended to avoid very long training times.

Alternatively, it is possible to download our pre-trained models with these commands:

python download_baseline_models.py --task 1 --output_path RESULTS/Task1/pretrained
python download_baseline_models.py --task 2 --output_path RESULTS/Task2/pretrained

These models are also available for manual download here.

We also provide a Replicate interactive demo of both baseline models.

Evaluaton metrics

Our evaluation metrics for both tasks are included in the metrics.py script. The functions task1_metric and location_sensitive_detection compute the evaluation metrics for task 1 and task 2, respectively. The default arguments reflect the challenge requirements. Please refer to the above-linked challenge paper for additional information about the metrics and how to format the prediction and target vectors.

Example:

import metrics

task1_metric = metrics.task1_metric(prediction_vector, target_vector)
_,_,_,task2_metric = metrics.location_sensitive_detection(prediction_vector, target_vector)

To compute the challenge metrics for our basiline models run:

python evaluate_baseline_task1.py
python evaluate_baseline_task2.py

In case you want to evaluate our pre-trained models, please add --model_path path/to/model to the above commands.

Submission shape validation

The script validate_submission.py can be used to assess the validity of the submission files shape. Instructions about how to format the submission can be found in the L3das website Use these commands to validate your submissions:

python validate_submission.py --task 1 --submission_path path/to/task1_submission_folder --test_path path/to/task1_test_dataset_folder

python validate_submission.py --task 2 --submission_path path/to/task2_submission_folder --test_path path/to/task2_test_dataset_folder

For each task, this script asserts if:

  • The number of submitted files is correct
  • The naming of the submitted files is correct
  • Only the files to be submitted are present in the folder
  • The shape of each submission file is as expected

Once you have valid submission folders, please follow the instructions on the link above to proceed with the submission.

Owner
L3DAS
The L3DAS project aims at providing new 3D audio datasets and encouraging the proliferation of new deep learning methods for 3D audio analysis.
L3DAS
A Telegram Bot to prevent Night Spams

NightModeBot A Telegram Bot to lock group in night to prevent night spam Setps To Use - Put Variables Correctly. - Add Bot to your group and make admi

ReeshuXD 10 Oct 21, 2022
A telegram bot to interact with a Minecraft Server

telegram-mc-bot A telegram bot to interact with a Minecraft Server It has the following commands: /status - Returns the server status (Online/Offline)

KleynArt 1 Dec 09, 2021
Check and write all account info + Check nitro on account

Discord-Token-Checker Check and write all account info + Check nitro on account Also check https://github.com/GuFFy12/Discord-Token-Parser (Parse disc

36 Jan 01, 2023
Asynchronous RDP/VNC client for Python (GUI)

🚩 This is the public repository of aardwolf, for latest version and updates please consider supporting us through https://porchetta.industries/ AARDW

29 Dec 15, 2022
M3U Playlist for free TV channels

Free TV This is an M3U playlist for free TV channels around the World. Either free locally (over the air): Or free on the Internet: Plex TV Pluto TV P

Free TV 964 Jan 08, 2023
🐍 The official Python client library for Google's discovery based APIs.

Google API Client This is the Python client library for Google's discovery based APIs. To get started, please see the docs folder. These client librar

Google APIs 6.2k Jan 08, 2023
A repository for 8G server's discord bot

8G Discord-Bot A general-purpose discord bot for the 8G Discord-Server To setup: Create a new file called secrets.py and make it look like this TOKEN=

1 Jan 12, 2022
a Disqus alternative

Isso – a commenting server similar to Disqus Isso – Ich schrei sonst – is a lightweight commenting server written in Python and JavaScript. It aims to

Martin Zimmermann 4.7k Jan 02, 2023
A simple but useful Discord Selfbot with essential features, made with discord.py-self.

Discord Selfbot Xyno Discord Selfbot Xyno is a simple but useful selfbot for Discord. It has currently limited useful features but it will be updated

Amit Pathak 7 Apr 24, 2022
A delightful and complete interface to GitHub's amazing API

ghapi A delightful and complete interface to GitHub's amazing API ghapi provides 100% always-updated coverage of the entire GitHub API. Because we aut

fast.ai 428 Jan 08, 2023
A mood based crypto tracking application.

Crypto Bud - API A mood based crypto tracking application. The main repository is private. I am creating the API before I connect everything to the ma

Krishnasis Mandal 1 Oct 23, 2021
LavaAPI - A simple library for accepting payments and using the LAVA Wallet

This library was created to simplify the LAVA api provided on the official websi

Vlad Baccara 8 Dec 18, 2022
Advance Anonymous Sender bot with Caption Editor

AnonyMous Sender 👨‍💻 Advanced Anonymous Sender with Caption Editor Join @DaisySupport_Official 🎵 for help Features Get forwarded messages without f

Inuka Asith 13 Oct 09, 2022
A simple script that loads and hot-reloads cogs when you save any changes

DiscordBot-HotReload A simple script that loads and hot-reloads cogs when you save any changes Usage @bot.event async def on_ready(): from HotRelo

2 Jan 14, 2022
A discord bot that will help you browse/download nhentai sources.

Risa Introduction Risa is an nHentai discord bot that will help you browse and download your favorite doujin inside your own discord server. Hosting M

markee7 14 Oct 25, 2021
Ever wanted a dashboard for making your antispam? This is it.

Ever wanted a dashboard for making your antispam? This is it.

Skelmis 1 Oct 27, 2021
Discord bot that displays the current Swatch Internet Time (.beat) as a status.

Internet-Time-Display Discord bot that displays the current Swatch Internet Time (.beat) as a status. Visit the website! Add the bot to your server! A

2 Mar 15, 2022
AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications

AWS Serverless Application Model (AWS SAM) The AWS Serverless Application Model (SAM) is an open-source framework for building serverless applications

Amazon Web Services 8.9k Dec 31, 2022
Analog clock that shows the weather instead of the actual numerical hour it points to.

Eli's weatherClock An digital analog clock but instead of showing the hours, the clock shows the weather at that hour of the day. So instead of showin

Kovin 154 Dec 01, 2022