Code and models for the NeurIPS 2021 paper - DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Overview

DominoSearch

This is the repository for the code and models of the NeurIPS 2021 paper - DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks

Instructions and other materials will be released soon.

Search:

git clone https://github.com/NM-sparsity/DominoSearch.git
cd DominoSearch/DominoSearch/search/script_resnet_ImageNet

We provide several search scripts for different target sparsity ratios. You can specify your own target and change the parameters accordingly. Note that you must first set the path to your ImageNet dataset.

The search phase can take 2-3 hours. The searched schemes are then stored in a txt file, which is needed as input for mixed-sparsity training.

Below is an example of the output format.

{'SparseConv0_3-64-(7, 7)': [16, 16], 'SparseConv1_64-64-(1, 1)': [16, 16], 'SparseConv2_64-64-(3, 3)': [4, 16], 'SparseConv3_64-256-(1, 1)': [8, 16], 'SparseConv4_64-256-(1, 1)': [8, 16], 'SparseConv5_256-64-(1, 1)': [8, 16], 'SparseConv6_64-64-(3, 3)': [4, 16], 'SparseConv7_64-256-(1, 1)': [8, 16], 'SparseConv8_256-64-(1, 1)': [8, 16], 'SparseConv9_64-64-(3, 3)': [4, 16], 'SparseConv10_64-256-(1, 1)': [8, 16], 'SparseConv11_256-128-(1, 1)': [8, 16], 'SparseConv12_128-128-(3, 3)': [2, 16], 'SparseConv13_128-512-(1, 1)': [8, 16], 'SparseConv14_256-512-(1, 1)': [4, 16], 'SparseConv15_512-128-(1, 1)': [8, 16], 'SparseConv16_128-128-(3, 3)': [4, 16], 'SparseConv17_128-512-(1, 1)': [8, 16], 'SparseConv18_512-128-(1, 1)': [8, 16], 'SparseConv19_128-128-(3, 3)': [4, 16], 'SparseConv20_128-512-(1, 1)': [8, 16], 'SparseConv21_512-128-(1, 1)': [8, 16], 'SparseConv22_128-128-(3, 3)': [2, 16], 'SparseConv23_128-512-(1, 1)': [8, 16], 'SparseConv24_512-256-(1, 1)': [4, 16], 'SparseConv25_256-256-(3, 3)': [2, 16], 'SparseConv26_256-1024-(1, 1)': [4, 16], 'SparseConv27_512-1024-(1, 1)': [4, 16], 'SparseConv28_1024-256-(1, 1)': [4, 16], 'SparseConv29_256-256-(3, 3)': [2, 16], 'SparseConv30_256-1024-(1, 1)': [4, 16], 'SparseConv31_1024-256-(1, 1)': [4, 16], 'SparseConv32_256-256-(3, 3)': [2, 16], 'SparseConv33_256-1024-(1, 1)': [4, 16], 'SparseConv34_1024-256-(1, 1)': [4, 16], 'SparseConv35_256-256-(3, 3)': [2, 16], 'SparseConv36_256-1024-(1, 1)': [4, 16], 'SparseConv37_1024-256-(1, 1)': [4, 16], 'SparseConv38_256-256-(3, 3)': [2, 16], 'SparseConv39_256-1024-(1, 1)': [4, 16], 'SparseConv40_1024-256-(1, 1)': [4, 16], 'SparseConv41_256-256-(3, 3)': [2, 16], 'SparseConv42_256-1024-(1, 1)': [4, 16], 'SparseConv43_1024-512-(1, 1)': [4, 16], 'SparseConv44_512-512-(3, 3)': [2, 16], 'SparseConv45_512-2048-(1, 1)': [4, 16], 'SparseConv46_1024-2048-(1, 1)': [2, 16], 'SparseConv47_2048-512-(1, 1)': [4, 16], 'SparseConv48_512-512-(3, 3)': [2, 16], 'SparseConv49_512-2048-(1, 1)': [4, 16], 
'SparseConv50_2048-512-(1, 1)': [4, 16], 'SparseConv51_512-512-(3, 3)': [2, 16], 'SparseConv52_512-2048-(1, 1)': [4, 16], 'Linear0_2048-1000': [4, 16]}
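The output above is a Python dict literal, so it can be inspected with the standard library. The sketch below is not part of the repository: it assumes each value is an [N, M] pair (N weights kept out of every M), that layer names follow the pattern `<type><index>_<in>-<out>` with an optional `-(kh, kw)` kernel suffix, and that no layer uses grouped convolution. The helper names `layer_weight_count` and `summarize_schemes` are hypothetical. Under those assumptions it estimates the overall weight sparsity implied by a schemes file.

```python
import ast
import re

# Assumed naming convention: '<type><index>_<in>-<out>' plus optional '-(kh, kw)'.
LAYER_NAME = re.compile(r"^[A-Za-z]+\d+_(\d+)-(\d+)(?:-\((\d+), (\d+)\))?$")


def layer_weight_count(name):
    """Number of weights implied by a layer name (assumes groups == 1)."""
    match = LAYER_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised layer name: {name!r}")
    c_in, c_out = int(match.group(1)), int(match.group(2))
    # Linear layers carry no kernel suffix, so treat them as 1x1.
    k_h = int(match.group(3) or 1)
    k_w = int(match.group(4) or 1)
    return c_in * c_out * k_h * k_w


def summarize_schemes(text):
    """Parse a schemes dict literal and return (schemes, overall sparsity)."""
    schemes = ast.literal_eval(text.strip())
    total = 0
    kept = 0.0
    for name, (n, m) in schemes.items():
        count = layer_weight_count(name)
        total += count
        kept += count * n / m  # N of every M weights stay non-zero
    return schemes, 1.0 - kept / total


# A three-layer excerpt of the example output, used only for illustration.
example = (
    "{'SparseConv0_3-64-(7, 7)': [16, 16], "
    "'SparseConv2_64-64-(3, 3)': [4, 16], "
    "'Linear0_2048-1000': [4, 16]}"
)
schemes, sparsity = summarize_schemes(example)
print(f"{len(schemes)} layers, estimated weight sparsity {sparsity:.3f}")
```

To check a full search result, read the txt file into a string and pass it to `summarize_schemes`. The estimate counts weights only and ignores biases and batch-norm parameters.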

Train:

After getting the layer-wise sparse schemes, we fine-tune the network with them to recover accuracy. The training code is based on NM-sparsity, with changes to support flexible N:M schemes.
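For readers new to N:M sparsity, the following sketch shows what an N:M scheme means for a single row of weights: in every consecutive group of M weights, only N are kept. It uses magnitude-based selection, which is the common formulation, and is not taken from this repository's training code; the function names `nm_mask` and `apply_mask` and the grouping along a flat list are illustrative assumptions.

```python
def nm_mask(weights, n, m):
    """Binary mask keeping the n largest-magnitude entries in each group of m."""
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")
    mask = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Rank positions inside the group by absolute value, largest first.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        mask.extend(1 if i in keep else 0 for i in range(m))
    return mask


def apply_mask(weights, mask):
    """Zero out the weights that the mask drops."""
    return [w * k for w, k in zip(weights, mask)]


row = [0.1, -0.9, 0.3, 0.05, 0.7, -0.2, 0.0, 0.4]
mask = nm_mask(row, n=2, m=4)  # a 2:4 scheme
print(mask)
print(apply_mask(row, mask))
```

A layer-wise scheme such as [4, 16] corresponds to calling `nm_mask` with n=4 and m=16 for that layer, while [16, 16] leaves the layer dense.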

Below is an example of training a layer-wise sparse ResNet50 with 80% overall sparsity.

cd DominoSearch/DominoSearch/train/classification_sparsity_level/train_imagenet
python -m torch.distributed.launch --nproc_per_node=8 ../train_imagenet.py --config ./configs/config_resnet50.yaml --base_lr 0.01 --decay 0.0005 --epochs 120 --schemes_file ./schemes/resnet50_M16_0.80.txt --model_dir ./resnet50/resnet50_0.80_M16

Experiments

We provide the trained models from the experiments. Please check our paper for details and interpretations of the experiments.

ResNet50 experiments in section 4.1

| Model Name | TOP1 Accuracy | Trained Model | Searched schemes |
|---|---|---|---|
| resnet50 - 0.80 model size | 76.7 | google drive | google drive |
| resnet50 - 0.875 model size | 75.7 | google drive | google drive |
| resnet50 - 0.9375 model size | 73.5 | google drive | google drive |
| resnet50 - 8x FLOPs | 75.4 | google drive | google drive |
| resnet50 - 16x FLOPs | 73.4 | google drive | google drive |

Ablation experiments of ResNet50 in section 5.3

| Model Name | TOP1 Accuracy | Trained Model | Train log |
|---|---|---|---|
| Ablation E3 | 76.1 | google drive | google drive |
| Ablation E4 | 76.4 | google drive | google drive |
| Ablation E6 | 76.6 | google drive | google drive |
| Ablation E7 | 75.6 | google drive | google drive |

Citation

@inproceedings{
sun2021dominosearch,
title={DominoSearch: Find layer-wise fine-grained N:M sparse schemes from dense neural networks},
author={Wei Sun and Aojun Zhou and Sander Stuijk and Rob G. J. Wijnhoven and Andrew Nelson and Hongsheng Li and Henk Corporaal},
booktitle={Thirty-Fifth Conference on Neural Information Processing Systems},
year={2021},
url={https://openreview.net/forum?id=IGrC6koW_g}
}