PyTorch implementation of our paper under review -- 1xN Pattern for Pruning Convolutional Neural Networks

Last update: Nov 29, 2022

Overview

1xN Pattern for Pruning Convolutional Neural Networks (paper).

This is a PyTorch re-implementation of "1xN Pattern for Pruning Convolutional Neural Networks". A more formal project will be released as soon as Alibaba Group authorizes us to do so.

1) 1×N Block Pruning

Requirements

Python 3.7
PyTorch >= 1.0.1
CUDA = 10.0.0

Code Running

To reproduce our experiments, please use the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 \
--job_dir ./experiment/ \
--data_path [DATA_PATH] \
--pretrained_model [PRETRAIN_MODEL_PATH] \
--pr_target 0.5 \
--N 4 \
--conv_type BlockL1Conv \
--train_batch_size 256 \
--eval_batch_size 256 \
--rearrange

Here --arch can also be mobilenet_v2, mobilenet_v3_large or mobilenet_v3_small, and --N can also be 2, 8, 16 or 32.
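
To cover several block sizes from Table 1 in one go, the training command can be generated per value of N. The sketch below only assembles the argument lists from the flags shown above and prints them; the helper name build_command and the per-run job directory layout are assumptions made for illustration and are not part of the repository.

```python
import shlex


def build_command(arch, n, data_path, pretrained_model, job_root="./experiment"):
    """Return the training command as an argument list for one (arch, N) pair."""
    return [
        "python", "imagenet.py",
        "--gpus", "0",
        "--arch", arch,
        # Hypothetical layout: one job directory per architecture and block size.
        "--job_dir", f"{job_root}/{arch}_1x{n}",
        "--data_path", data_path,
        "--pretrained_model", pretrained_model,
        "--pr_target", "0.5",
        "--N", str(n),
        "--conv_type", "BlockL1Conv",
        "--train_batch_size", "256",
        "--eval_batch_size", "256",
        "--rearrange",
    ]


if __name__ == "__main__":
    for n in (2, 4, 8, 16, 32):
        cmd = build_command("mobilenet_v1", n, "[DATA_PATH]", "[PRETRAIN_MODEL_PATH]")
        # Print instead of executing; pass cmd to subprocess.run to launch a run.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the commands first makes it easy to check paths before committing GPU time to a full sweep.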

Accuracy Performance

Table 1: Performance comparison of our 1×N block sparsity against weight pruning and filter pruning (p = 50%).

MobileNet-V1	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	70.764	89.592	Pruned Model
Filter Pruning	65.348	86.264	Pruned Model
1 x 2 Block	70.281	89.370	Pruned Model
1 x 4 Block	70.052	89.056	Pruned Model
1 x 8 Block	69.908	89.027	Pruned Model
1 x 16 Block	69.559	88.933	Pruned Model
1 x 32 Block	69.541	88.801	Pruned Model

MobileNet-V2	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	71.146	89.872	Pruned Model
Filter Pruning	66.730	87.190	Pruned Model
1 x 2 Block	70.233	89.417	Pruned Model
1 x 4 Block	60.706	89.165	Pruned Model
1 x 8 Block	69.372	88.862	Pruned Model
1 x 16 Block	69.352	88.708	Pruned Model
1 x 32 Block	68.762	88.425	Pruned Model

MobileNet-V3-small	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	66.376	86.868	Pruned Model
Filter Pruning	59.054	81.713	Pruned Model
1 x 2 Block	65.380	86.060	Pruned Model
1 x 4 Block	64.465	85.495	Pruned Model
1 x 8 Block	64.101	85.274	Pruned Model
1 x 16 Block	63.126	84.203	Pruned Model
1 x 32 Block	62.881	83.982	Pruned Model

MobileNet-V3-large	Top-1 Acc.	Top-5 Acc.	Model Link
Weight Pruning	72.897	91.093	Pruned Model
Filter Pruning	69.137	89.097	Pruned Model
1 x 2 Block	72.120	90.677	Pruned Model
1 x 4 Block	71.935	90.458	Pruned Model
1 x 8 Block	71.478	90.163	Pruned Model
1 x 16 Block	71.112	90.129	Pruned Model
1 x 32 Block	70.769	89.696	Pruned Model

More links for pruned models under different pruning rates and their training logs can be found in MobileNet-V2 and ResNet-50.

Evaluate our models

To verify the performance of our pruned models, download our pruned models from the links provided above and run the following command:

python imagenet.py \
--gpus 0 \
--arch mobilenet_v1 \
--data_path [DATA_PATH] \
--conv_type DenseConv \
--evaluate [PRUNED_MODEL_PATH] \
--eval_batch_size 256

Here --arch can also be mobilenet_v2, mobilenet_v3_large or mobilenet_v3_small, matching the downloaded model.

Arguments

optional arguments:
  -h, --help            show this help message and exit
  --gpus                Select gpu_id to use. default:[0]
  --data_path           The directory where the data is stored.
  --job_dir             The directory where the summaries will be stored.
  --resume              Load the model from the specified checkpoint.
  --pretrain_model      Path of the pre-trained model.
  --pruned_model        Path of the pruned model to evaluate.
  --arch                Architecture of model. For ImageNet: mobilenet_v1, mobilenet_v2, mobilenet_v3_small, mobilenet_v3_large
  --num_epochs          The num of epochs to train. default:180
  --train_batch_size    Batch size for training. default:256
  --eval_batch_size     Batch size for validation. default:100
  --momentum            Momentum for Momentum Optimizer. default:0.9
  --lr LR               Learning rate. default:1e-2
  --lr_decay_step       The interval of learning rate decay for CIFAR. default:100 150
  --lr_decay_freq       The frequency of learning rate decay for ImageNet. default:30
  --weight_decay        The weight decay of loss. default:4e-5
  --lr_type             lr scheduler. default: cos. optional:exp/cos/step/fixed
  --use_dali            If this parameter exists, use dali module to load ImageNet data (benefit in training acceleration).
  --conv_type           Importance criterion of filters. Default: BlockL1Conv. optional: BlockRandomConv, DenseConv
  --pr_target           Pruning rate. default:0.5
  --full                If this parameter exists, prune fully-connected layer.
  --N                   Consecutive N kernels for removal (see paper for details).
  --rearrange           If this parameter exists, filters will be rearranged (see paper for details).
  --export_onnx         If this parameter exists, export onnx model.
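
How --N, --pr_target and the BlockL1Conv criterion interact can be sketched as follows. A block is taken to be N consecutive output kernels that share one input channel, and the blocks with the smallest L1 norm are removed. This is a dependency-free illustration and not the repository's BlockL1Conv code: it assumes the pruning rate is applied per layer and works on a precomputed matrix of per-kernel L1 norms, whereas a PyTorch version would operate on the 4-D weight tensor. The function name block_l1_mask is hypothetical.

```python
def block_l1_mask(kernel_norms, n, pr_target):
    """Return a 0/1 mask with the same shape as kernel_norms.

    kernel_norms[o][i] is the L1 norm of the kernel connecting input channel i
    to output channel o. A block is n consecutive output channels at one input
    channel, and the blocks with the smallest summed L1 norm are masked out.
    """
    c_out, c_in = len(kernel_norms), len(kernel_norms[0])
    if c_out % n != 0:
        raise ValueError("number of output channels must be divisible by n")

    # Score every (block row, input channel) pair by its summed L1 norm.
    scores = []
    for b in range(c_out // n):
        for i in range(c_in):
            s = sum(kernel_norms[b * n + r][i] for r in range(n))
            scores.append((s, b, i))

    # Drop the lowest-scoring fraction of blocks (ties broken by position).
    num_pruned = int(len(scores) * pr_target)
    pruned = {(b, i) for _, b, i in sorted(scores)[:num_pruned]}

    return [
        [0 if (o // n, i) in pruned else 1 for i in range(c_in)]
        for o in range(c_out)
    ]


if __name__ == "__main__":
    norms = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.1, 0.6]]
    for row in block_l1_mask(norms, n=2, pr_target=0.5):
        print(row)
```

With N = 2 and a pruning rate of 0.5, two of the four blocks in the example are zeroed, and the zeros always appear in vertical runs of length N.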

2) Filter Rearrangement

Table 2: Performance studies of our 1×N block sparsity with and without filter rearrangement (p=50%).

N = 2	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearrange	69.900	89.296	Pruned Model
Rearrange	70.233	89.417	Pruned Model

N = 4	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearrange	69.521	88.920	Pruned Model
Rearrange	69.579	88.944	Pruned Model

N = 8	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearrange	69.206	88.608	Pruned Model
Rearrange	69.372	88.862	Pruned Model

N = 16	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearrange	68.971	88.399	Pruned Model
Rearrange	69.352	88.708	Pruned Model

N = 32	Top-1 Acc.	Top-5 Acc.	Model Link
w/o Rearrange	68.431	88.315	Pruned Model
Rearrange	68.762	88.425	Pruned Model
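
The idea behind --rearrange can be pictured as a permutation of output channels. The sketch below assumes that filters are reordered by descending L1 norm, so that filters of similar magnitude end up in the same block, and that the next layer's input channels follow the same permutation so the network output is unchanged. Both points are assumptions made for illustration; the paper describes the exact procedure, and the function names here are hypothetical. Two consecutive 1x1 layers are modeled as plain matrices.

```python
def rearrange_filters(w1, w2):
    """Reorder the output channels of w1 by descending L1 norm.

    w1[o][i] and w2[k][o] are the weights of two consecutive 1x1 layers, so
    the input channels of w2 must follow the same permutation as the output
    channels of w1.
    """
    order = sorted(range(len(w1)), key=lambda o: -sum(abs(v) for v in w1[o]))
    w1_new = [w1[o] for o in order]
    w2_new = [[row[o] for o in order] for row in w2]
    return w1_new, w2_new, order


def forward(w1, w2, x):
    """Apply layer 1, ReLU, then layer 2 to the input vector x."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


if __name__ == "__main__":
    w1 = [[0.1, -0.2], [1.0, 2.0], [-0.5, 0.5], [3.0, -1.0]]
    w2 = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0]]
    x = [1.0, -2.0]
    r1, r2, order = rearrange_filters(w1, w2)
    print(order)
    # The two outputs agree up to floating-point rounding.
    print(forward(w1, w2, x), forward(r1, r2, x))
```

Because the permutation is applied consistently to both layers, it changes only which filters sit next to each other, and therefore which kernels fall into the same 1×N block.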

3) Encoding and Decoding Efficiency

Performance and latency comparison

Our sparse convolution implementation has been released to the TVM community.

To measure the latency of our pruned models, export the model to ONNX (see --export_onnx) and run the following command:

python model_tune.py \
--onnx_path [ONNX_MODEL_PATH] \
--bsr 4 \
--bsc 1 \
--sparsity 0.5
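
Assuming --bsr and --bsc give the block height and block width of a block sparse row (BSR) layout, a 1×N pruned weight matrix with N = 4 (rows as output channels, columns as input channels) maps onto 4×1 blocks. The sketch below encodes a dense matrix into the usual BSR triplet (data, indices, indptr) and decodes it back. It is independent of TVM and of model_tune.py, the function names are illustrative, and it only shows the storage scheme.

```python
def dense_to_bsr(dense, bs_r, bs_c):
    """Encode a dense matrix as BSR with blocks of shape (bs_r, bs_c)."""
    rows, cols = len(dense), len(dense[0])
    if rows % bs_r or cols % bs_c:
        raise ValueError("matrix shape must be divisible by the block shape")
    data, indices, indptr = [], [], [0]
    for br in range(rows // bs_r):
        for bc in range(cols // bs_c):
            block = [
                dense[br * bs_r + r][bc * bs_c:(bc + 1) * bs_c]
                for r in range(bs_r)
            ]
            # Only blocks with at least one nonzero entry are stored.
            if any(v != 0 for row in block for v in row):
                data.append(block)
                indices.append(bc)
        # indptr[br + 1] marks where the blocks of block row br end.
        indptr.append(len(indices))
    return data, indices, indptr


def bsr_to_dense(data, indices, indptr, shape, bs_r, bs_c):
    """Decode a BSR triplet back into a dense matrix of the given shape."""
    rows, cols = shape
    dense = [[0] * cols for _ in range(rows)]
    for br in range(rows // bs_r):
        for k in range(indptr[br], indptr[br + 1]):
            bc = indices[k]
            for r in range(bs_r):
                for c in range(bs_c):
                    dense[br * bs_r + r][bc * bs_c + c] = data[k][r][c]
    return dense
```

At 50% sparsity, half of the blocks are dropped from data, and each stored block needs only one column index, which is what keeps the index overhead low compared with unstructured weight pruning.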

For the detailed tuning settings, refer to TVM.

4) Contact

For any problem regarding this code re-implementation, please contact the first author: [email protected] or the third author: [email protected].

For any problem regarding the sparse convolution implementation, please contact the second author: [email protected].

Owner

Mingbao Lin (林明宝)
