Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Last update: Jul 06, 2022

Overview

Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning

Reference

 Abeßer, J. & Müller, M. Towards Audio Domain Adaptation for Acoustic Scene Classification using Disentanglement Learning, submitted to: ICASSP 2022

Related Work

We use the pre-computed features and the model architecture from three previous papers, all of which propose unsupervised domain adaptation methods:

    Mezza, A. I., Habets, E. A. P., Müller, M., & Sarti, A. (2021).
    Unsupervised domain adaptation for acoustic scene classification
    using band-wise statistics matching. Proceedings of the European
    Signal Processing Conference (EUSIPCO), 11–15.
    https://doi.org/10.23919/Eusipco47968.2020.9287533

    Drossos, K., Magron, P., & Virtanen, T. (2019). Unsupervised Adversarial Domain Adaptation based
    on the Wasserstein Distance for Acoustic Scene Classification. Proceedings of the IEEE Workshop
    on Applications of Signal Processing to Audio and Acoustics (WASPAA), 259–263. New Paltz, NY, USA.

    Gharib, S., Drossos, K., Emre, C., Serdyuk, D., & Virtanen, T. (2018). Unsupervised Adversarial Domain
    Adaptation for Acoustic Scene Classification. Proceedings of the Detection and Classification of
    Acoustic Scenes and Events (DCASE). Surrey, UK.

Files

configs.py - Training configurations (C0 ... C3M)
generator.py - Data generator
losses.py - Loss implementations
model.py - Function to create dual-input / dual-output model
model_kaggle.py - Reference CNN model from related work for acoustic scene classification (ASC)
normalization.py - Normalization methods (see Mezza et al. above)
params.py - General parameters
prediction.py - Prediction script to evaluate models on test data
training.py - Script to run the model training for 6 different configurations (see Fig. 2 in the paper)
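
The band-wise statistics matching that normalization.py refers to (Mezza et al.) can be illustrated with a minimal NumPy sketch. This is not the code from normalization.py: the function names, the array layout (examples, frequency bands, time frames), and the synthetic data are assumptions made for illustration only. The idea shown is to standardize each frequency band of the target-domain spectrograms with the target statistics and then rescale it to the source-domain mean and standard deviation.

```python
import numpy as np


def band_stats(specs):
    """Per-band mean and std over all examples and time frames.

    specs: array of shape (n_examples, n_bands, n_frames) (assumed layout).
    """
    mean = specs.mean(axis=(0, 2), keepdims=True)
    std = specs.std(axis=(0, 2), keepdims=True)
    return mean, std


def match_band_stats(target, source_mean, source_std, eps=1e-8):
    """Align the per-band statistics of `target` to the given source statistics."""
    target_mean, target_std = band_stats(target)
    # Standardize each band with the target-domain statistics ...
    standardized = (target - target_mean) / (target_std + eps)
    # ... then rescale to the source-domain statistics.
    return standardized * source_std + source_mean


# Synthetic stand-ins for source- and target-domain spectrograms (hypothetical data).
rng = np.random.default_rng(0)
source = rng.normal(loc=0.0, scale=1.0, size=(8, 40, 100))
target = rng.normal(loc=3.0, scale=2.0, size=(8, 40, 100))

source_mean, source_std = band_stats(source)
adapted = match_band_stats(target, source_mean, source_std)
```

After this step, every frequency band of `adapted` has the same mean and standard deviation as the corresponding band of `source`, and no target-domain labels were needed to get there.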

How to run

Create a Python environment (e.g. with conda). The following versions were used while preparing the paper:
- librosa==0.8.0
- matplotlib==3.3.2
- numpy=1.19.2
- python=3.7.0
- scikit-learn==0.23.2
- tensorflow==2.3.0
- torch==1.9.0
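
To check an existing environment against the versions listed above, a small helper along the following lines may be useful. It is a sketch and not part of the repository; the names `PINNED`, `installed_version`, and `compare_versions` are hypothetical. It uses `importlib.metadata` where available (Python 3.8+) and falls back to `pkg_resources` on Python 3.7.

```python
# Versions listed above (the Python interpreter itself is checked separately).
PINNED = {
    "librosa": "0.8.0",
    "matplotlib": "3.3.2",
    "numpy": "1.19.2",
    "scikit-learn": "0.23.2",
    "tensorflow": "2.3.0",
    "torch": "1.9.0",
}


def installed_version(name):
    """Return the installed version of a package, or None if it is missing."""
    try:
        from importlib.metadata import PackageNotFoundError, version
    except ImportError:  # Python 3.7: fall back to setuptools' pkg_resources
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def compare_versions(pinned, lookup=installed_version):
    """Map each package name to (wanted, found, matches)."""
    report = {}
    for name, wanted in pinned.items():
        found = lookup(name)
        report[name] = (wanted, found, found == wanted)
    return report


for package, (wanted, found, ok) in compare_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print(f"{package}: wanted {wanted}, found {found} ({status})")
```

A mismatch does not necessarily mean the code will fail; the list only documents the versions that were used for the paper.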
In params.py, set the following variables:
- dir_feat: path to your local copy of the .p files from https://zenodo.org/record/1401995
- dir_target: path to your local output folder
Run python training.py && python prediction.py on a GPU device to train and evaluate the models.
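
Before starting a long training run, it can help to verify that the two directories are set up as expected. The sketch below is not part of the repository and rests on two assumptions: that dir_feat and dir_target are plain directory paths, and that the .p files are Python pickle files. The helpers `check_dirs` and `load_feature_file` are hypothetical names, and the demo uses a temporary directory with a dummy file instead of the real features.

```python
import pickle
import tempfile
from pathlib import Path


def check_dirs(dir_feat, dir_target):
    """Return the .p files in dir_feat and make sure dir_target exists."""
    feat_files = sorted(Path(dir_feat).glob("*.p"))
    if not feat_files:
        raise FileNotFoundError(f"No .p files found in {dir_feat}")
    Path(dir_target).mkdir(parents=True, exist_ok=True)
    return feat_files


def load_feature_file(path):
    """Load one feature file, assuming it is a pickle (only load trusted files)."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Demo with placeholder paths and a dummy pickle file (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    dir_feat = Path(tmp) / "features"
    dir_target = Path(tmp) / "results"
    dir_feat.mkdir()
    with open(dir_feat / "dummy.p", "wb") as handle:
        pickle.dump({"placeholder": [1, 2, 3]}, handle)

    files = check_dirs(dir_feat, dir_target)
    demo_names = [f.name for f in files]
    demo_content = load_feature_file(files[0])
    target_created = dir_target.is_dir()
```

With the real data, `dir_feat` and `dir_target` would be the values set in params.py.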

Owner

Jakob Abeßer
