Official implementation of A cappella: Audio-visual Singing Voice Separation, from BMVC21

Overview

Y-Net

Official implementation of A cappella: Audio-visual Singing Voice Separation, British Machine Vision Conference 2021

Project page: ipcv.github.io/Acappella/
Paper: arXiv, Supplementary Material, BMVC (not available yet)

Running a demo / Y-Net Inference

We provide simple functions to load models with pre-trained weights. Steps:

  1. Clone the repo or download y-net>VnBSS>models (models can run as a standalone package)
  2. Load a model:
from VnBSS import y_net_gr # or from models import y_net_gr 
model = y_net_gr()
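
The two import locations mentioned in the comment above can be tried in turn. The following is only a sketch: the helper name load_y_net and its parameters are hypothetical and not part of the repository; it relies solely on the fact, stated above, that calling the factory function (for example y_net_gr) returns a model.

```python
import importlib


def load_y_net(factory_name="y_net_gr", packages=("VnBSS", "models")):
    """Return a model built by `factory_name`, looked up in the first usable package.

    Hypothetical helper: the package names mirror the two import styles shown
    in the README (full repo vs. standalone `models` folder).
    """
    last_error = None
    for package in packages:
        try:
            module = importlib.import_module(package)
            factory = getattr(module, factory_name)
        except (ImportError, AttributeError) as err:
            # Package missing, or present without the requested factory.
            last_error = err
            continue
        return factory()
    raise ImportError(
        f"could not find {factory_name!r} in any of {packages}"
    ) from last_error
```

With the repository (or the standalone models folder) on the Python path, model = load_y_net() would then do the same as the two lines above. Since the wrappers are nn.Module instances, calling model.eval() before inference is the usual PyTorch practice.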

Examples can be found at y_net>examples. You can also have a look at tcol.py or example.py, the files which compute the demos shown on the website.
Check out a fully working demo:
Open In Colab

Citation

@inproceedings{acappella,
    author    = {Juan F. Montesinos and
                 Venkatesh S. Kadandale and
                 Gloria Haro},
    title     = {A cappella: Audio-visual Singing Voice Separation},
    booktitle = {British Machine Vision Conference (BMVC)},
    year      = {2021},
}


Training / Using DEV code

Training

The most difficult part is preparing the dataset, as everything is built upon a very specific format.
To run training:
python run.py -m model_name --workname experiment_name --arxiv_path directory_of_experiments --pretrained_from path_pret_weights
You can inspect the argparse at default.py>argparse_default.
Possible model names are: y_net_g, y_net_gr, y_net_m, y_net_r, u_net, llcp
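
If you launch several experiments from a script, the command above can be assembled programmatically. This is a minimal sketch, not part of the repository: build_train_command is a hypothetical helper, and treating --pretrained_from as optional is an assumption (check default.py>argparse_default for the real defaults). The flags and model names are the ones listed above.

```python
import shlex

# Model names as listed in this README.
MODEL_NAMES = ("y_net_g", "y_net_gr", "y_net_m", "y_net_r", "u_net", "llcp")


def build_train_command(model_name, workname, arxiv_path, pretrained_from=None):
    """Build the argument list for `python run.py ...` (hypothetical helper)."""
    if model_name not in MODEL_NAMES:
        raise ValueError(f"unknown model {model_name!r}; expected one of {MODEL_NAMES}")
    command = [
        "python", "run.py",
        "-m", model_name,
        "--workname", workname,
        "--arxiv_path", arxiv_path,
    ]
    # Assumption: the pretrained-weights flag can be left out.
    if pretrained_from is not None:
        command += ["--pretrained_from", pretrained_from]
    return command


command = build_train_command("y_net_gr", "my_experiment", "./experiments")
print(shlex.join(command))
```

The returned list can be passed directly to subprocess.run(command, check=True) from the repository root.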

Testing

  1. Go to manuscript_scripts and replace checkpoint paths by yours in the testing scripts.
  2. Run: bash manuscript_scripts/test_gr_r.sh
  3. Replace the paths of manuscript_scripts/auto_metrics.py by your experiment_directory path.
  4. Run: python manuscript_scripts/auto_metrics.py to visualise results.

It's a complicated framework. HELP!

The best way to get to know the framework is to debug it! Having runnable code helps you see input shapes and dataflow, and lets you run line by line. Download The circle of life demo with the files already processed. It will act as a dataset of 6 samples. You can download it from Google Drive (1.1 GB).

  1. Unzip the file
  2. Run python run.py -m y_net_gr (for example)

Everything has been configured to run by default this way.

The model

Each effective model is wrapped by an nn.Module which takes care of computing the STFT and the mask, returning the waveform, and so on. This wrapper can be found at VnBSS>models>y_net.py>YNet. To get rid of it, you can simply inherit from the class, keep only the layers you need, and keep the core_forward method, which is the inference step without the miscellanea.
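
To make the split between the wrapper and core_forward concrete, here is a toy, dependency-free sketch of the pattern. It is not the repository's code: ToyWrapper, CoreOnly and keep_bins are hypothetical, a plain DFT over a single frame stands in for the STFT, and a fixed binary mask stands in for the network's prediction.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame (stand-in for the STFT)."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
            for k in range(n)]


def idft(spectrum):
    """Inverse of `dft`, returning the real waveform."""
    n = len(spectrum)
    return [(sum(c * cmath.exp(2j * math.pi * k * t / n) for k, c in enumerate(spectrum)) / n).real
            for t in range(n)]


class ToyWrapper:
    """Hypothetical wrapper: transform, predict a mask, return the waveform."""

    def __init__(self, keep_bins):
        self.keep_bins = set(keep_bins)

    def core_forward(self, spectrum):
        # Stand-in for the network: a binary mask over the chosen bins
        # (and their mirrored bins, so the output stays real-valued).
        n = len(spectrum)
        return [1.0 if (k in self.keep_bins or (n - k) % n in self.keep_bins) else 0.0
                for k in range(n)]

    def forward(self, waveform):
        spectrum = dft(waveform)
        mask = self.core_forward(spectrum)
        return idft([m * c for m, c in zip(mask, spectrum)])


class CoreOnly(ToyWrapper):
    """Inherit the wrapper but expose only the core step, skipping the extras."""

    def forward(self, spectrum):
        return self.core_forward(spectrum)


n = 16
voice = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
accompaniment = [0.5 * math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
mixture = [v + a for v, a in zip(voice, accompaniment)]

separated = ToyWrapper(keep_bins=[2]).forward(mixture)      # waveform in, waveform out
mask_only = CoreOnly(keep_bins=[2]).forward(dft(mixture))   # spectrum in, mask out
```

In the real model the subclass would inherit from YNet instead, and core_forward would run the network layers rather than return a fixed mask.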

Downloading the datasets

To download the Acappella Dataset run the script at preproc>preprocess.py
To download the demos used in the website run preproc>demo_preprocessor.py
AudioSet can be downloaded via a web app: streamlit run audioset.py

Computing the demos

Demos shown in the website can be computed:

  • The circle of life demo is obtained by running tcol.py. First set the flag COMPUTE=True. To visualize the results, set the flag COMPUTE=False and run streamlit run tcol.py.

FAQs

  1. How to change the optimizer's hyperparameters?
    Go to config>optimizer.json
  2. How to change clip duration, video framerate, STFT parameters or audio samplerate?
    Go to config>__init__.py
  3. How to change the batch size or the number of epochs?
    Go to config>hyptrs.json
  4. How to dump predictions from the training and test sets?
    Go to default.py. Modify DUMP_FILES (can be controlled at a subset level). The force argument skips the iteration-wise conditions and dumps every single network prediction.
  5. Is tensorboard enabled?
    Yes, you will find tensorboard records at your_experiment_directory/used_workname/tensorboard
  6. Can I resume an experiment?
    Yes, if you set exactly the same experiment folder and workname, the system will detect it and will resume from there.
  7. I'm trying to resume but found AssertionError
    If there is an exception before running the model
  8. How to change the number of layers of U-Net?
    U-Net is built dynamically given a list of layers per block, as shown in models>__init__.py, from outer to inner blocks.
  9. How to modify the default network values?
    The json file config>net_cfg.json overwrites any default configuration from the model.
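
The directory layout mentioned in items 5 and 6 can be expressed as a small helper. This is a sketch under stated assumptions: experiment_paths and would_resume are hypothetical names, and checking that the experiment folder exists is only a guess at how the framework detects a resumable run.

```python
from pathlib import Path


def experiment_paths(experiment_directory, workname):
    """Locations described in the FAQ: <experiment_directory>/<workname>/tensorboard."""
    root = Path(experiment_directory) / workname
    return {"root": root, "tensorboard": root / "tensorboard"}


def would_resume(experiment_directory, workname):
    """Assumption: a run is resumed when its experiment folder already exists."""
    return experiment_paths(experiment_directory, workname)["root"].is_dir()
```

The tensorboard path returned here is the one to pass to tensorboard --logdir.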

Owner

Juan F. Montesinos
PhD student at Pompeu Fabra University, Barcelona