🔪 Elimination based Lightweight Neural Net with Pretrained Weights

Overview

ELimNet

ELimNet: Eliminating Layers in a Neural Network Pretrained with Large Dataset for Downstream Task

  • Removed top layers from pretrained EfficientNetB0 and ResNet18 to construct lightweight CNN model with less than 1M #params.
  • Assessed on Trash Annotations in Context(TACO) Dataset sampled for 6 classes with 20,851 images.
  • Compared performance with lightweight models generated with Optuna's Neural Architecture Search(NAS) constituted with same convolutional blocks.

Quickstart

Installation

# clone the repository
git clone https://github.com/snoop2head/elimnet

# fetch image dataset and unzip
!wget -cq https://aistages-prod-server-public.s3.amazonaws.com/app/Competitions/000081/data/data.zip
!unzip ./data.zip -d ./

Train

# finetune on the dataset with pretrained model
python train.py --model ./model/efficientnet_b0.yaml

# finetune on the dataset with ElimNet
python train.py --model ./model/efficientnet_b0_elim_3.yaml

Inference

# inference with the lastest ran model
python inference.py --model_dir ./exp/latest/

Performance

Performance is compared with (1) original pretrained model and (2) Optuna NAS constructed models with no pretrained weights.

  • Indicates that top convolutional layers eliminated pretrained CNN models outperforms empty Optuna NAS models generated with same convolutional blocks.
  • Suggests that eliminating top convolutional layers creates lightweight model that shows similar(or better) classifcation performance with original pretrained model.
  • Reduces parameters to 7%(or less) of its original parameters while maintaining(or improving) its performance. Saves inference time by 20% or more by eliminating top convolutional layters.

ELimNet vs Pretrained Models (Train)

[100 epochs] # of Parameters # of Layers Train Validation Test F1
Pretrained EfficientNet B0 4.0M 352 Loss: 0.43
Acc: 81.23%
F1: 0.84
Loss: 0.469
Acc: 82.17%
F1: 0.76
0.7493
EfficientNet B0 Elim 2 0.9M 245 Loss:0.652
Acc: 87.22%
F1: 0.84
Loss: 0.622
Acc: 87.22%
F1: 0.77
0.7603
EfficientNet B0 Elim 3 0.30M 181 Loss: 0.602
Acc: 78.17%
F1: 0.74
Loss: 0.661
Acc: 77.41%
F1: 0.74
0.7349
Resnet18 11.17M 69 Loss: 0.578
Acc: 78.90%
F1: 0.76
Loss: 0.700
Acc: 76.17%
F1: 0.719
-
Resnet18 Elim 2 0.68M 37 Loss: 0.447
Acc: 83.73%
F1: 0.71
Loss: 0.712
Acc: 75.42%
F1: 0.71
-

ELimNet vs Pretrained Models (Inference)

# of Parameters # of Layers CPU times (sec) CUDA time (sec) Test Inference Time (sec)
Pretrained EfficientNet B0 4.0M 352 3.9s 4.0s 105.7s
EfficientNet B0 Elim 2 0.9M 245 4.1s 13.0s 83.4s
EfficientNet B0 Elim 3 0.30M 181 3.0s 9.0s 73.5s
Resnet18 11.17M 69 - - -
Resnet18 Elim 2 0.68M 37 - - -

ELimNet vs Empty Optuna NAS Models (Train)

[100 epochs] # of Parameters # of Layers Train Valid Test F1
Empty MobileNet V3 4.2M 227 Loss 0.925
Acc: 65.18%
F1: 0.58
Loss 0.993
Acc: 62.83%
F1: 0.56
-
Empty EfficientNet B0 1.3M 352 Loss 0.867
Acc: 67.28%
F1: 0.61
Loss 0.898
Acc: 66.80%
F1: 0.61
0.6337
Empty DWConv & InvertedResidualv3 NAS 0.08M 66 - Loss: 0.766
Acc: 71.71%
F1: 0.68
0.6740
Empty MBConv NAS 0.33M 141 Loss: 0.786
Acc: 70.72%
F1: 0.66
Loss: 0.866
Acc: 68.09%
F1: 0.62
0.6245
Resnet18 Elim 2 0.68M 37 Loss: 0.447
Acc: 83.73%
F1: 0.71
Loss: 0.712
Acc: 75.42%
F1: 0.71
-
EfficientNet B0 Elim 3 0.30M 181 Loss: 0.602
Acc: 78.17%
F1: 0.74
Loss: 0.661
Acc: 77.41%
F1: 0.74
0.7603

ELimNet vs Empty Optuna NAS Models (Inference)

# of Parameters # of Layers CPU times (sec) CUDA time (sec) Test Inference Time (sec)
Empty MobileNet V3 4.2M 227 4 13 -
Empty EfficientNet B0 1.3M 352 3.780 3.782 68.4s
Empty DWConv &
InvertedResidualv3 NAS
0.08M 66 1 3.5 61.1s
Empty MBConv NAS 0.33M 141 2.14 7.201 67.1s
Resnet18 Elim 2 0.68M 37 - - -
EfficientNet B0 Elim 3 0.30M 181 3.0s 9s 73.5s

Background & WiP

Background

Work in Progress

  • Will test the performance of replacing convolutional blocks with pretrained weights with a single convolutional layer without pretrained weights.
  • Will add ResNet18's inference time data and compare with Optuna's NAS constructed lightweight model.
  • Will test on pretrained MobileNetV3, MnasNet on torchvision with elimination based lightweight model architecture search.
  • Will be applied on other small datasets such as Fashion MNIST dataset and Plant Village dataset.

Others

  • "Empty" stands for model with no pretrained weights.
  • "EfficientNet B0 Elim 2" means 2 convolutional blocks have been eliminated from pretrained EfficientNet B0. Number next to "Elim" annotates how many convolutional blocks have been removed.
  • Table's performance illustrates best performance out of 100 epochs of finetuning on TACO Dataset.

Authors

Owner
snoop2head
break, compose, display
snoop2head
Full Resolution Residual Networks for Semantic Image Segmentation

Full-Resolution Residual Networks (FRRN) This repository contains code to train and qualitatively evaluate Full-Resolution Residual Networks (FRRNs) a

Toby Pohlen 274 Oct 27, 2022
Generative Autoregressive, Normalized Flows, VAEs, Score-based models (GANVAS)

GANVAS-models This is an implementation of various generative models. It contains implementations of the following: Autoregressive Models: PixelCNN, G

MRSAIL (Mini Robotics, Software & AI Lab) 6 Nov 26, 2022
Classification models 1D Zoo - Keras and TF.Keras

Classification models 1D Zoo - Keras and TF.Keras This repository contains 1D variants of popular CNN models for classification like ResNets, DenseNet

Roman Solovyev 12 Jan 06, 2023
PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR)

This is a PyTorch implementation of EGVSR: Efficcient & Generic Video Super-Resolution (VSR), using subpixel convolution to optimize the inference speed of TecoGAN VSR model. Please refer to the offi

789 Jan 04, 2023
Hydra Lightning Template for Structured Configs

Hydra Lightning Template for Structured Configs Template for creating projects with pytorch-lightning and hydra. How to use this template? Create your

Model-driven Machine Learning 4 Jul 19, 2022
Stratified Transformer for 3D Point Cloud Segmentation (CVPR 2022)

Stratified Transformer for 3D Point Cloud Segmentation Xin Lai*, Jianhui Liu*, Li Jiang, Liwei Wang, Hengshuang Zhao, Shu Liu, Xiaojuan Qi, Jiaya Jia

DV Lab 195 Jan 01, 2023
Scrutinizing XAI with linear ground-truth data

This repository contains all the experiments presented in the corresponding paper: "Scrutinizing XAI using linear ground-truth data with suppressor va

braindata lab 2 Oct 04, 2022
Final term project for Bayesian Machine Learning Lecture (XAI-623)

Mixquality_AL Final Term Project For Bayesian Machine Learning Lecture (XAI-623) Youtube Link The presentation is given in YoutubeLink Problem Formula

JeongEun Park 3 Jan 18, 2022
Provide partial dates and retain the date precision through processing

Prefix date parser This is a helper class to parse dates with varied degrees of precision. For example, a data source might state a date as 2001, 2001

Friedrich Lindenberg 13 Dec 14, 2022
Source code and data from the RecSys 2020 article "Carousel Personalization in Music Streaming Apps with Contextual Bandits" by W. Bendada, G. Salha and T. Bontempelli

Carousel Personalization in Music Streaming Apps with Contextual Bandits - RecSys 2020 This repository provides Python code and data to reproduce expe

Deezer 48 Jan 02, 2023
The code of “Similarity Reasoning and Filtration for Image-Text Matching” [AAAI2021]

SGRAF PyTorch implementation for AAAI2021 paper of “Similarity Reasoning and Filtration for Image-Text Matching”. It is built on top of the SCAN and C

Ronnie_IIAU 149 Dec 22, 2022
discovering subdomains, hidden paths, extracting unique links

python-website-crawler discovering subdomains, hidden paths, extracting unique links pip install -r requirements.txt discover subdomain: You can give

merve 4 Sep 05, 2022
Prototype python implementation of the ome-ngff table spec

Prototype python implementation of the ome-ngff table spec

Kevin Yamauchi 8 Nov 20, 2022
Source code for NAACL 2021 paper "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference"

TR-BERT Source code and dataset for "TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference". The code is based on huggaface's transformers.

THUNLP 37 Oct 30, 2022
Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Fbone (Flask bone) is a Flask (Python microframework) starter/template/bootstrap/boilerplate application.

Wilson 1.7k Dec 30, 2022
Keras implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping

Keras implementation of Normalizer-Free Networks and SGD - Adaptive Gradient Clipping

Yam Peleg 63 Sep 21, 2022
Using this codebase as a tool for my own research. Making some modifications to the original repo for my own purposes.

For SwapNet Create a list.txt file containing all the images to process. This can be done with the GNU find command: find path/to/input/folder -name '

Andrew Jong 2 Nov 10, 2021
Official release of MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis of Pancreatic Cancer axriv: http://arxiv.org/abs/2112.13513

MSHT: Multi-stage Hybrid Transformer for the ROSE Image Analysis This is the official page of the MSHT with its experimental script and records. We de

Tianyi Zhang 53 Dec 27, 2022
Pytorch implementation of the paper: "A Unified Framework for Separating Superimposed Images", in CVPR 2020.

Deep Adversarial Decomposition PDF | Supp | 1min-DemoVideo Pytorch implementation of the paper: "Deep Adversarial Decomposition: A Unified Framework f

Zhengxia Zou 72 Dec 18, 2022
Official code for "InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization" (ICLR 2020, spotlight)

InfoGraph: Unsupervised and Semi-supervised Graph-Level Representation Learning via Mutual Information Maximization Authors: Fan-yun Sun, Jordan Hoffm

Fan-Yun Sun 232 Dec 28, 2022