Zalo AI Challenge 2021: Hum to Song task

Last update: Dec 16, 2022

Related tags

Deep Learning hum2song

Overview

Pipeline:

Prepare data for training:

Edit the paths in config/preprocess.yaml:
- raw_path: path to the raw data
- preprocessed_path: output path of the mel feature extraction
- temp_dir: path where the normalized mp3 data is stored

Run the following commands in order:

        python preprocessing.py

        python utils/split_train_val_by_id.py
   
        python utils/augment_mp3.py
   
        python utils/preprocess_augment.py
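
The four commands above depend on each other's output, so it can help to stop as soon as one of them fails. The following is only a minimal sketch of that idea, not part of the repository: the helper `run_in_order` and the list `PREPROCESS_STEPS` are hypothetical names, and the script paths are simply the ones listed above, assumed to be run from the repository root.

```python
import subprocess
import sys

# The four preprocessing scripts, in the order given above.
PREPROCESS_STEPS = [
    "preprocessing.py",
    "utils/split_train_val_by_id.py",
    "utils/augment_mp3.py",
    "utils/preprocess_augment.py",
]


def run_in_order(scripts, python=sys.executable):
    """Run each script with the given interpreter and stop at the first failure.

    Returns the list of scripts that finished with exit code 0.
    (Hypothetical helper, not part of the repository.)
    """
    done = []
    for script in scripts:
        result = subprocess.run([python, script])
        if result.returncode != 0:
            print(f"{script} failed with exit code {result.returncode}; stopping.")
            break
        done.append(script)
    return done
```

Calling `run_in_order(PREPROCESS_STEPS)` from the repository root would then run the steps one after another; if the returned list is shorter than `PREPROCESS_STEPS`, the first script missing from it is the one that failed.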

Train model:

Edit the paths in config/config.py:
- meta_train: path to the train_meta.csv file inside preprocessed_path
- train_root: path to the preprocessed mel data
- train_list = 'full_data_train.txt'
- val_list = 'full_data_val.txt'

Run the following commands in order:

        python convert_data.py

        python train.py
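
To illustrate the four settings listed for config/config.py, here is a hedged sketch of how they might be grouped and sanity-checked before training. The class `TrainPaths`, its method `missing_inputs`, and the two default paths are assumptions made for illustration; the actual layout of config/config.py may differ. Only `train_list` and `val_list` use values taken from the list above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class TrainPaths:
    """Hypothetical container for the settings named above."""

    meta_train: str = "preprocessed/train_meta.csv"  # placeholder path, adjust to your setup
    train_root: str = "preprocessed"                 # placeholder path, adjust to your setup
    train_list: str = "full_data_train.txt"          # value given in the list above
    val_list: str = "full_data_val.txt"              # value given in the list above

    def missing_inputs(self):
        """Return the names of the preprocessing outputs that do not exist on disk.

        Only meta_train and train_root are checked, because those are
        described above as paths to already preprocessed data.
        """
        checks = {"meta_train": self.meta_train, "train_root": self.train_root}
        return [name for name, value in checks.items() if not Path(value).exists()]
```

An empty list from `TrainPaths(...).missing_inputs()` suggests the preprocessing outputs are where the configuration expects them.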

Infer public test:

Place the raw mp3 data at /data/public_test (it must contain two folders, full_song and hum).

Run the following command:

        ./predict.sh
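
Since the prediction script expects the full_song and hum folders under the test directory, a quick layout check before running it can save a failed run. This is a minimal sketch under that assumption; `missing_subdirs` and `REQUIRED_SUBDIRS` are hypothetical names, not part of the repository. The same check applies to /data/private_test.

```python
from pathlib import Path

# Folder names the test directory is expected to contain.
REQUIRED_SUBDIRS = ("full_song", "hum")


def missing_subdirs(root, required=REQUIRED_SUBDIRS):
    """Return the required folder names that are absent under root.

    (Hypothetical helper, not part of the repository.)
    """
    root = Path(root)
    return [name for name in required if not (root / name).is_dir()]
```

For example, `missing_subdirs("/data/public_test")` returns an empty list when both folders are present, and otherwise the names of the folders that still need to be created.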

Infer private test:

Place the raw mp3 data at /data/private_test (it must contain two folders, full_song and hum).

Run the following command:

        ./predict_private_test.sh

Team:

Võ Văn Phúc

Nguyễn Văn Thiều

Lâm Bá Thịnh

Owner

Vo Van Phuc
