An AI for Music Generation

Overview

MuseGAN

MuseGAN is a project on music generation. In a nutshell, we aim to generate polyphonic music of multiple tracks (instruments). The proposed models are able to generate music either from scratch, or by accompanying a track given a priori by the user.

We train the model with training data collected from Lakh Pianoroll Dataset to generate pop song phrases consisting of bass, drums, guitar, piano and strings tracks.

Sample results are available here.

Looking for a PyTorch version? Check out this repository.

Prerequisites

Below we assume the working directory is the repository root.

Install dependencies

  • Using pipenv (recommended)

    Make sure pipenv is installed. (If not, simply run pip install pipenv.)

    # Install the dependencies
    pipenv install
    # Activate the virtual environment
    pipenv shell
  • Using pip

    # Install the dependencies
    pip install -r requirements.txt

Prepare training data

The training data is collected from Lakh Pianoroll Dataset (LPD), a new multitrack pianoroll dataset.

# Download the training data
./scripts/download_data.sh
# Store the training data to shared memory
./scripts/process_data.sh

You can also download the training data manually (train_x_lpd_5_phr.npz).

As pianoroll matrices are generally sparse, we store only the indices of nonzero elements and the array shape into a npz file to save space, and later restore the original array. To save some training data data into this format, simply run np.savez_compressed("data.npz", shape=data.shape, nonzero=data.nonzero())

Scripts

We provide several shell scripts for easy managing the experiments. (See here for a detailed documentation.)

Below we assume the working directory is the repository root.

Train a new model

  1. Run the following command to set up a new experiment with default settings.

    # Set up a new experiment
    ./scripts/setup_exp.sh "./exp/my_experiment/" "Some notes on my experiment"
  2. Modify the configuration and model parameter files for experimental settings.

  3. You can either train the model:

    # Train the model
    ./scripts/run_train.sh "./exp/my_experiment/" "0"

    or run the experiment (training + inference + interpolation):

    # Run the experiment
    ./scripts/run_exp.sh "./exp/my_experiment/" "0"

Collect training data

Run the following command to collect training data from MIDI files.

# Collect training data
./scripts/collect_data.sh "./midi_dir/" "data/train.npy"

Use pretrained models

  1. Download pretrained models

    # Download the pretrained models
    ./scripts/download_models.sh

    You can also download the pretrained models manually (pretrained_models.tar.gz).

  2. You can either perform inference from a trained model:

    # Run inference from a pretrained model
    ./scripts/run_inference.sh "./exp/default/" "0"

    or perform interpolation from a trained model:

    # Run interpolation from a pretrained model
    ./scripts/run_interpolation.sh "./exp/default/" "0"

Outputs

By default, samples will be generated alongside the training. You can disable this behavior by setting save_samples_steps to zero in the configuration file (config.yaml). The generated will be stored in the following three formats by default.

  • .npy: raw numpy arrays
  • .png: image files
  • .npz: multitrack pianoroll files that can be loaded by the Pypianoroll package

You can disable saving in a specific format by setting save_array_samples, save_image_samples and save_pianoroll_samples to False in the configuration file.

The generated pianorolls are stored in .npz format to save space and processing time. You can use the following code to write them into MIDI files.

from pypianoroll import Multitrack

m = Multitrack('./test.npz')
m.write('./test.mid')

Sample Results

Some sample results can be found in ./exp/ directory. More samples can be downloaded from the following links.

Papers

Convolutional Generative Adversarial Networks with Binary Neurons for Polyphonic Music Generation
Hao-Wen Dong and Yi-Hsuan Yang
in Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), 2018.
[website] [arxiv] [paper] [slides(long)] [slides(short)] [poster] [code]

MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang and Yi-Hsuan Yang, (*equal contribution)
in Proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI), 2018.
[website] [arxiv] [paper] [slides] [code]

MuseGAN: Demonstration of a Convolutional GAN Based Model for Generating Multi-track Piano-rolls
Hao-Wen Dong,* Wen-Yi Hsiao,* Li-Chia Yang and Yi-Hsuan Yang (*equal contribution)
in Late-Breaking Demos of the 18th International Society for Music Information Retrieval Conference (ISMIR), 2017. (two-page extended abstract)
[paper] [poster]

Owner
Hao-Wen Dong
PhD Candidate in Computer Science at UC San Diego | Previous Intern at Dolby and Yamaha | Music x AI
Hao-Wen Dong
ianZiPu is a way to write notation for Guqin (古琴) music.

PyBetween Wrapper for Between - 비트윈을 위한 파이썬 라이브러리 Legal Disclaimer 오직 교육적 목적으로만 사용할수 있으며, 비트윈은 VCNC의 자산입니다. 악의적 공격에 이용할시 처벌 받을수 있습니다. 사용에 따른 책임은 사용자가

Nancy Yi Liang 8 Nov 25, 2022
Use python MIDI to write some simple music

Use Python MIDI to write songs

小宝 1 Nov 19, 2021
An AI for Music Generation

An AI for Music Generation

Hao-Wen Dong 1.3k Dec 31, 2022
Pyroomacoustics is a package for audio signal processing for indoor applications. It was developed as a fast prototyping platform for beamforming algorithms in indoor scenarios.

Summary Pyroomacoustics is a software package aimed at the rapid development and testing of audio array processing algorithms. The content of the pack

Audiovisual Communications Laboratory 1k Jan 09, 2023
Bot duniya Music Player

Bot duniya Music Player Requirements 📝 FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) 2nd Telegram Account (ne

Aman Vishwakarma 16 Oct 21, 2022
GiantMIDI-Piano is a classical piano MIDI dataset contains 10,854 MIDI files of 2,786 composers

GiantMIDI-Piano is a classical piano MIDI dataset contains 10,854 MIDI files of 2,786 composers

Bytedance Inc. 1.3k Jan 04, 2023
Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm.

LPC_for_TTS Linear Prediction Coefficients estimation from mel-spectrogram implemented in Python based on Levinson-Durbin algorithm. 基于Levinson-Durbin

Zewang ZHANG 58 Nov 17, 2022
Pyrogram bot to automate streaming music in voice chats

Pyrogram bot to automate streaming music in voice chats Help If you face an error, want to discuss this project or get support for it, join it's group

Roj 124 Oct 21, 2022
Basically Play Pauses the song when it is safe to do so. when you die in a round

Basically Play Pauses the song when it is safe to do so. when you die in a round

AG_1436 1 Feb 13, 2022
A music player designed for a University Project.

A music player designed for a University Project. Very flexibe and easy to use, a real life working application with user friendly controls. Hope u enjoy!!

Aditya Johorey 1 Nov 19, 2021
Gateware for the Terasic/Arrow DECA board, to become a USB2 high speed audio interface

DECA USB Audio Interface DECA based USB 2.0 High Speed audio interface Status / current limitations enumerates as class compliant audio device on Linu

Hans Baier 16 Mar 21, 2022
A Python wrapper around the Soundcloud API

soundcloud-python A friendly wrapper around the Soundcloud API. Installation To install soundcloud-python, simply: pip install soundcloud Or if you'r

SoundCloud 84 Dec 31, 2022
A Youtube audio player for your terminal

AudioLine A lightweight Youtube audio player for your terminal Explore the docs » View Demo · Report Bug · Request Feature · Send a Pull Request About

Haseeb Khalid 26 Jan 04, 2023
[Singing Log] Let your program learn to sing!

[Singing Log] Let your program learn to sing! You must have thought this was changelog when you saw the English title, but it's not, it's chànggēlog. What it does is allow your program to print logs

黄巍 22 Sep 03, 2022
LibXtract is a simple, portable, lightweight library of audio feature extraction functions.

LibXtract LibXtract is a simple, portable, lightweight library of audio feature extraction functions. The purpose of the library is to provide a relat

Jamie Bullock 215 Nov 16, 2022
Welcome to Nexus. Your personal virtual assistant

AI Voice Assistant Welcome to Nexus voice assistant Description Have you ever heard of voice assistants like Cortana, Siri, Google assistant, and Alex

Mustafah Zacs 1 Jan 10, 2022
A Simple Script that will help you to Play / Change Songs with just your Voice

Auto-Spotify using Voice Recognition A Simple Script that will help you to Play / Change Songs with just your Voice Explore the docs » Table of Conten

Mehul Shah 1 Nov 21, 2021
Algorithmic Multi-Instrumental MIDI Continuation Implementation

Matchmaker Algorithmic Multi-Instrumental MIDI Continuation Implementation Taming large-scale MIDI datasets with algorithms This is a WIP so please ch

Alex 2 Mar 11, 2022
A voice assistant which can be used to interact with your computer and controls your pc operations

Introduction 👨‍💻 It is a voice assistant which can be used to interact with your computer and also you have been seeing it in Iron man movies, but t

Sujith 84 Dec 22, 2022
Library for working with sound files of the format: .ogg, .mp3, .wav

Library for working with sound files of the format: .ogg, .mp3, .wav. By work is meant - playing sound files in a straight line and in the background, obtaining information about the sound file (auth

Romanin 2 Dec 15, 2022