ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Last update: Jan 03, 2023

Overview

Voice2Series-Reprogramming

Voice2Series: Reprogramming Acoustic Models for Time Series Classification

International Conference on Machine Learning (ICML), 2021 | Paper | Colab Demo

Environment

Tensorflow 2.2 (CUDA=10.0) and Kapre 0.2.0.

Noted: Echo to many interests from the community, we will also provide Pytorch V2S layers and frameworks around this September, incoperating the new torch audio layers. Feel free to email the authors for further collaboration.
option 1 (from yml)

conda env create -f V2S.yml

option 2 (from clean python 3.6)

pip install tensorflow-gpu==2.1.0
pip install kapre==0.2.0
pip install h5py==2.10.0

Training

This is tengible Version. Please also check the paper for actual validation details. Many Thanks!

python v2s_main.py --dataset 0 --eps 100 --mapping 3

Result

seg idx: 0 --> start: 0, end: 500
seg idx: 1 --> start: 5000, end: 5500
seg idx: 2 --> start: 10000, end: 10500
Tensor("AddV2_2:0", shape=(None, 16000, 1), dtype=float32)
--- Preparing Masking Matrix
Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            [(None, 500, 1)]     0                                            
__________________________________________________________________________________________________
zero_padding1d (ZeroPadding1D)  (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2 (TensorFlowOp [(None, 16000, 1)]   0           zero_padding1d[0][0]             
__________________________________________________________________________________________________
zero_padding1d_1 (ZeroPadding1D (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2_1 (TensorFlow [(None, 16000, 1)]   0           tf_op_layer_AddV2[0][0]          
                                                                 zero_padding1d_1[0][0]           
__________________________________________________________________________________________________
zero_padding1d_2 (ZeroPadding1D (None, 16000, 1)     0           input_1[0][0]                    
__________________________________________________________________________________________________
tf_op_layer_AddV2_2 (TensorFlow [(None, 16000, 1)]   0           tf_op_layer_AddV2_1[0][0]        
                                                                 zero_padding1d_2[0][0]           
__________________________________________________________________________________________________
art_layer (ARTLayer)            (None, 16000, 1)     16000       tf_op_layer_AddV2_2[0][0]        
__________________________________________________________________________________________________
reshape_1 (Reshape)             (None, 16000)        0           art_layer[0][0]                  
__________________________________________________________________________________________________
model (Model)                   (None, 36)           1292911     reshape_1[0][0]                  
__________________________________________________________________________________________________
tf_op_layer_MatMul (TensorFlowO [(None, 6)]          0           model[1][0]                      
__________________________________________________________________________________________________
tf_op_layer_Shape (TensorFlowOp [(2,)]               0           tf_op_layer_MatMul[0][0]         
__________________________________________________________________________________________________
tf_op_layer_strided_slice (Tens [()]                 0           tf_op_layer_Shape[0][0]          
__________________________________________________________________________________________________
tf_op_layer_Reshape_2/shape (Te [(3,)]               0           tf_op_layer_strided_slice[0][0]  
__________________________________________________________________________________________________
tf_op_layer_Reshape_2 (TensorFl [(None, 2, 3)]       0           tf_op_layer_MatMul[0][0]         
                                                                 tf_op_layer_Reshape_2/shape[0][0]
__________________________________________________________________________________________________
tf_op_layer_Mean (TensorFlowOpL [(None, 2)]          0           tf_op_layer_Reshape_2[0][0]      
==================================================================================================
Total params: 1,308,911
Trainable params: 217,225
Non-trainable params: 1,091,686
__________________________________________________________________________________________________
Epoch 1/100
2021-07-19 01:43:32.690913: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-07-19 01:43:32.919343: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
113/113 [==============================] - 6s 50ms/step - loss: 0.0811 - accuracy: 1.0000 - val_loss: 1.5589e-04 - val_accuracy: 1.0000
Epoch 2/100
113/113 [==============================] - 5s 41ms/step - loss: 5.0098e-05 - accuracy: 1.0000 - val_loss: 1.0906e-05 - val_accuracy: 1.0000

Class Activation Mapping

python cam_v2s.py --dataset 5 --weight wNo5_map6-88-0.7662.h5 --mapping 6 --layer conv2d_1

Reference

Voice2Series: Reprogramming Acoustic Models for Time Series Classification

@InProceedings{pmlr-v139-yang21j,
  title = 	 {Voice2Series: Reprogramming Acoustic Models for Time Series Classification},
  author =       {Yang, Chao-Han Huck and Tsai, Yun-Yun and Chen, Pin-Yu},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {11808--11819},
  year = 	 {2021},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
}

ICML 21 - Voice2Series: Reprogramming Acoustic Models for Time Series Classification

Related tags

Overview

Voice2Series-Reprogramming

Environment

Training

Class Activation Mapping

Reference

Owner

Visyerres sgdf woob - Modules Woob pour l'intranet et autres sites Scouts et Guides de France

The official codes for the ICCV2021 presentation "Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting"

Lab Materials for MIT 6.S191: Introduction to Deep Learning

Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Code repository for Self-supervised Structure-sensitive Learning, CVPR'17

DeepConsensus uses gap-aware sequence transformers to correct errors in Pacific Biosciences (PacBio) Circular Consensus Sequencing (CCS) data.

SpeechBrain is an open-source and all-in-one speech toolkit based on PyTorch.

PyTorch Implementation of SSTNs for hyperspectral image classifications from the IEEE T-GRS paper "Spectral-Spatial Transformer Network for Hyperspectral Image Classification: A FAS Framework."

Live Hand Tracking Using Python

Implementation of "Fast and Flexible Temporal Point Processes with Triangular Maps" (Oral @ NeurIPS 2020)

Official implementation of "Can You Spot the Chameleon? Adversarially Camouflaging Images from Co-Salient Object Detection" in CVPR 2022.

Swapping face using Face Mesh with TensorFlow Lite

Spherical Confidence Learning for Face Recognition, accepted to CVPR2021.

ParmeSan: Sanitizer-guided Greybox Fuzzing

Material related to the Principles of Cloud Computing course.

Scalable Multi-Agent Reinforcement Learning

Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm

Official code for "EagerMOT: 3D Multi-Object Tracking via Sensor Fusion" [ICRA 2021]

Shitty gaze mouse controller

[SIGGRAPH'22] StyleGAN-XL: Scaling StyleGAN to Large Diverse Datasets