Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Last update: Jan 01, 2023

Overview

RAVE: Realtime Audio Variational autoEncoder

Official implementation of RAVE: A variational autoencoder for fast and high-quality neural audio synthesis (article link) by Antoine Caillon and Philippe Esling.

If you use RAVE as part of a music performance or installation, be sure to cite either this repository or the article!

Installation

RAVE requires Python 3.9. Install the dependencies with

pip install -r requirements.txt

Detailed instructions for setting up a training station for this project are available here.

Preprocessing

RAVE comes with two command-line utilities, resample and duration. resample pre-processes (silence removal, loudness normalization) and augments (compression) an entire directory of audio files (.mp3, .aiff, .opus, .wav, .aac). duration prints the total duration of a folder of .wav files.
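To make the idea behind duration concrete, here is a minimal sketch of how the total length of a folder of .wav files could be computed with Python's standard wave module. This is not the code of the duration utility itself: the function name total_wav_duration is hypothetical, and the sketch assumes uncompressed PCM files, which is what the wave module reads.

```python
import wave
from pathlib import Path


def total_wav_duration(folder):
    """Return the summed duration, in seconds, of every .wav file under `folder`.

    Hypothetical helper for illustration; not part of the RAVE repository.
    """
    total = 0.0
    for path in sorted(Path(folder).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav_file:
            # duration of one file = number of frames / frames per second
            total += wav_file.getnframes() / wav_file.getframerate()
    return total
```

Calling total_wav_duration("/path/to/wav/folder") returns a number of seconds, which can give a rough sense of dataset size before training.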

Training

Both RAVE and the prior model are available in this repo. For most users we recommend the cli_helper.py script, which generates a set of instructions for training and exporting both RAVE and the prior model on a specific dataset.

python cli_helper.py

However, if you want to customize your training further, you can run the provided train_{rave, prior}.py and export_{rave, prior}.py scripts manually.

Reconstructing audio

Once the model is trained, you can reconstruct an entire folder of wav files using

python reconstruct.py --ckpt /path/to/checkpoint --wav-folder /path/to/wav/folder

You can also export RAVE to a TorchScript file using export_rave.py and call its encode and decode methods on tensors.
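As a rough illustration of the encode and decode methods mentioned above, the following sketch loads an exported file with torch.jit.load and runs a round trip. The helper names roundtrip and load_and_roundtrip are hypothetical, and the (batch, channel, samples) input layout and the default buffer length are assumptions that are not confirmed by this README; check them against your exported model.

```python
def roundtrip(model, audio):
    """Encode audio into latents, then decode the latents back to audio.

    `model` is any object exposing encode() and decode(), such as the
    exported TorchScript module.
    """
    latent = model.encode(audio)
    return model.decode(latent)


def load_and_roundtrip(script_path, num_samples=65536):
    """Load an exported model and reconstruct a silent test buffer."""
    import torch  # imported here so roundtrip() itself has no dependencies

    model = torch.jit.load(script_path).eval()
    # Assumption: the model expects a (batch, channel, samples) float tensor.
    audio = torch.zeros(1, 1, num_samples)
    with torch.no_grad():
        return roundtrip(model, audio)
```

In practice you would replace the silent buffer with real audio loaded at the sampling rate the model was trained on.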

Realtime usage

UPDATE

If you want to use the realtime mode, you should update your dependencies!

pip install -r requirements.txt

RAVE and the prior model can be used in realtime on live audio streams, allowing creative interactions with both models.

nn~

RAVE is compatible with the nn~ external for Max/MSP and PureData.

An audio example of the prior sampling patch is available in the docs/ folder.

RAVE vst

You can also use RAVE as a VST audio plugin with RAVE vst!

Discussion

If you have questions, want to share your experience with RAVE, or want to share musical pieces made with the model, you can use the Discussion tab!

Owner

ACIDS
