Reading List for topics in Sound Event Detection

Introduction

Sound event detection (SED) processes a continuous acoustic signal and converts it into symbolic descriptions of the sound events present in the auditory scene. It is used in a variety of applications, including context-based indexing and retrieval in multimedia databases, unobtrusive monitoring in health care, and surveillance. Since 2017, learning from weak annotations (labels that state which events occur in a recording, without their onset and offset times) has been formulated as a way to exploit the large amount of multimedia data available. This reading list consists of papers that use weak annotations to learn symbolic descriptions of the sound events in audio.
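To make the idea of learning from weak annotations concrete, here is a minimal sketch in plain Python. It assumes a model that outputs one probability per frame and per class, pools those frame-level outputs into a single clip-level prediction, and compares that prediction with the clip-level (weak) label. The function names, the toy numbers, and the choice of max and mean pooling are illustrative assumptions, not the method of any particular paper in this list.

```python
import math

def max_pool(frame_probs):
    """Clip-level probability per class: the maximum over all frames."""
    return [max(col) for col in zip(*frame_probs)]

def mean_pool(frame_probs):
    """Clip-level probability per class: the average over all frames."""
    return [sum(col) / len(col) for col in zip(*frame_probs)]

def binary_cross_entropy(clip_probs, weak_labels, eps=1e-7):
    """Mean binary cross-entropy between clip-level predictions and weak labels."""
    total = 0.0
    for p, y in zip(clip_probs, weak_labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(weak_labels)

# Hypothetical frame-level outputs: 4 frames x 2 classes.
frame_probs = [
    [0.1, 0.2],
    [0.9, 0.1],
    [0.8, 0.3],
    [0.2, 0.2],
]
# Weak label for the whole clip: class 0 is present, class 1 is absent.
weak_labels = [1, 0]

clip_max = max_pool(frame_probs)
clip_mean = mean_pool(frame_probs)
loss_max = binary_cross_entropy(clip_max, weak_labels)
loss_mean = binary_cross_entropy(clip_mean, weak_labels)
```

Only the clip-level label enters the loss, yet the frame-level probabilities remain available for localising events in time. How the pooling step is chosen affects which frames receive a training signal.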

Papers covering multiple sub-areas are listed in each relevant section. If I missed any areas, papers, or datasets, please let me know or feel free to make a pull request.

Task	Dataset	Source	Num. Files
Sound Event Classification	ESC-50	freesound.org	2k files
Sound Event Classification	DCASE17 Task 4	YT videos	2k files
Sound Event Classification	US8K	freesound.org	8k files
Sound Event Classification	FSD50K	freesound.org	50k files
Sound Event Classification	AudioSet	YT videos	2M files
COVID-19 Detection using Coughs	DiCOVA	Volunteers recording audio via a website	1k files
Few-shot Bioacoustic Event Detection	DCASE21 Task 5	audio	4k+ files
Acoustic Scene Classification	DCASE18 Task 1	Recorded by TUT	1.5k files
Various	VGG-Sound	Web videos	200k files
Audio Captioning	Clotho	freesound.org	5k files
Audio Captioning	AudioCaps	YT videos	51k files
Action Recognition	UCF101	Web videos	13k files
Unlabeled	YFCC100M	Yahoo videos	1M files

Venue	Link
Machine Learning for Audio Signal Processing, NIPS 2017 workshop	https://nips.cc/Conferences/2017/Schedule?showEvent=8790
MLSP: Machine Learning for Signal Processing	https://ieeemlsp.cc/
WASPAA: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics	https://www.waspaa.com
ICASSP: IEEE International Conference on Acoustics Speech and Signal Processing	https://2021.ieeeicassp.org/
INTERSPEECH	https://www.interspeech2021.org/
IEEE/ACM Transactions on Audio, Speech and Language Processing	https://dl.acm.org/journal/taslp
DCASE	http://dcase.community/

Table of Contents

Research papers

Survey papers

Areas

Learning formulation

Network Architecture

Pooling functions

Missing or noisy audio

Data Augmentation

Generative Learning

Representation Learning

Multi-Task Learning

Few-Shot Learning

Knowledge Transfer

Polyphonic SED

Joint task

Loss function

Audio and Visual

Audio and Text [Audio Captioning]

Strongly and Weakly labelled data

Others

Dataset

Workshops/Conferences/Journals

Tutorials

Resources

More

Owner

Soham
