Code implementation of the Data Efficient Stagewise Knowledge Distillation paper.

Overview

Data Efficient Stagewise Knowledge Distillation

Stagewise Training Procedure

This repository contains the code for Stagewise Knowledge Distillation, a technique for improving knowledge transfer from a teacher model to a student model.

Requirements

  • Install the dependencies using conda with the environment.yml file
    conda env create -f environment.yml
    
  • Set up the stagewise-knowledge-distillation package itself
    pip install -e .
    
  • Apart from the dependencies above, an Nvidia GPU (CUDA compatible) with at least 8 GB of video memory is recommended (most of the experiments also work with 6 GB). However, the code also works on CPU-only machines.

Image Classification

Introduction

This work uses ResNet architectures: ResNet10, 14, 18, 20 and 26 as student networks and ResNet34 as the teacher network. The datasets used are CIFAR10, Imagenette and Imagewoof. Note that Imagenette and Imagewoof are subsets of ImageNet.

Preparation

  • Before running any experiments, download the data and the saved teacher model weights to the appropriate locations.

  • The following script

    • downloads the datasets
    • saves 10%, 20%, 30% and 40% splits of each dataset separately
    • downloads teacher model weights for all 3 datasets
    # assuming you are in the root folder of the repository
    cd image_classification/scripts
    bash setup.sh
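
The percentage splits saved by the script can be pictured with a short sketch. It is only an illustration under stated assumptions, not the code in setup.sh: the helper name make_split, the seeded shuffle, and the list of file names are all hypothetical, and the real script may, for example, split class by class.

```python
import random


def make_split(items, percentage, seed=0):
    """Return a reproducible subset holding `percentage` percent of `items`.

    Hypothetical helper for illustration only; the repository's setup
    script may build its splits differently.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in (0, 100]")
    shuffled = list(items)                 # copy so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    keep = max(1, len(shuffled) * percentage // 100)
    return shuffled[:keep]


if __name__ == "__main__":
    files = [f"img_{i:04d}.png" for i in range(1000)]  # made-up file names
    for pct in (10, 20, 30, 40):
        print(pct, len(make_split(files, pct)))
```

Fixing the seed means the same subset is selected on every run, which is what lets experiments on a given percentage be compared fairly.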
    

Experiments

For detailed information on the various experiments, refer to the paper. All the image classification experiments share the following training arguments, listed with the values they can take:

  • dataset (-d) : imagenette, imagewoof, cifar10
  • model (-m) : resnet10, resnet14, resnet18, resnet20, resnet26, resnet34
  • number of epochs (-e) : Integer is required
  • percentage of dataset (-p) : 10, 20, 30, 40 (don't use this argument at all for full dataset experiments)
  • random seed (-s) : Give any random seed (for reproducibility purposes)
  • gpu (-g) : Don't use unless training on CPU (in which case, use -g 'cpu' as the argument). On multi-GPU systems, set CUDA_VISIBLE_DEVICES=id in the environment before running the experiment (for example, export CUDA_VISIBLE_DEVICES=id), where id is the ID of your GPU according to the nvidia-smi output.
  • Comet ML API key (-a) (optional) : To track your experiments with Comet ML, either pass your API key as the argument or make it the default value in the arguments.py file. Otherwise, omit this argument.
  • Comet ML workspace (-w) (optional) : To track your experiments with Comet ML, either pass your workspace name as the argument or make it the default value in the arguments.py file. Otherwise, omit this argument.
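
As a rough picture of how the flags above fit together, here is a minimal argparse sketch. It is not the contents of arguments.py: the long option names, the defaults (including "cuda" for -g) and the function name build_parser are assumptions made for illustration; only the short flags and the allowed values come from the list above.

```python
import argparse

DATASETS = ["imagenette", "imagewoof", "cifar10"]
MODELS = ["resnet10", "resnet14", "resnet18", "resnet20", "resnet26", "resnet34"]


def build_parser():
    """Build a parser exposing the short flags listed above.

    Long names and defaults are assumptions for this sketch; check
    arguments.py for the real definitions.
    """
    parser = argparse.ArgumentParser(description="Common training arguments (sketch)")
    parser.add_argument("-d", "--dataset", choices=DATASETS, required=True)
    parser.add_argument("-m", "--model", choices=MODELS, required=True)
    parser.add_argument("-e", "--epochs", type=int, required=True)
    # Leaving -p out means the full dataset is used.
    parser.add_argument("-p", "--percentage", type=int, choices=[10, 20, 30, 40], default=None)
    parser.add_argument("-s", "--seed", type=int, default=0)
    # Assumed default; pass -g cpu to train without a GPU.
    parser.add_argument("-g", "--gpu", default="cuda")
    parser.add_argument("-a", "--api-key", default=None)
    parser.add_argument("-w", "--workspace", default=None)
    return parser


if __name__ == "__main__":
    example = "-d imagewoof -m resnet18 -p 20 -e 100 -s 0".split()
    print(build_parser().parse_args(example))
```

With this layout, an omitted -p shows up as None, which a training script can treat as "use the full dataset".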

The following subsections give one example training command per experiment.

No Teacher

Full Imagenette dataset, ResNet10

python3 no_teacher.py -d imagenette -m resnet10 -e 100 -s 0

Traditional KD (FitNets)

20% Imagewoof dataset, ResNet18

python3 traditional_kd.py -d imagewoof -m resnet18 -p 20 -e 100 -s 0

FSP KD

30% CIFAR10 dataset, ResNet14

python3 fsp_kd.py -d cifar10 -m resnet14 -p 30 -e 100 -s 0

Attention Transfer KD

10% Imagewoof dataset, ResNet26

python3 attention_transfer_kd.py -d imagewoof -m resnet26 -p 10 -e 100 -s 0

Hinton KD

Full CIFAR10 dataset, ResNet14

python3 hinton_kd.py -d cifar10 -m resnet14 -e 100 -s 0
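
For readers unfamiliar with this baseline, the loss commonly associated with Hinton-style distillation can be sketched in plain Python. This is a generic illustration of the well-known formulation (temperature-softened teacher and student distributions plus a cross-entropy term), not the code in hinton_kd.py; the function names and the values temperature=4.0 and alpha=0.9 are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of `logits` divided by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def hinton_kd_loss(student_logits, teacher_logits, label, temperature=4.0, alpha=0.9):
    """Weighted sum of a softened KL term and a hard-label cross-entropy.

    Illustrative hyperparameters; the repository may use other values.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student) on the temperature-softened distributions.
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student) if pt > 0)
    # Standard cross-entropy of the student against the true label.
    ce = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft-target gradients on a comparable scale.
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


if __name__ == "__main__":
    teacher = [2.0, 0.5, -1.0]
    print(hinton_kd_loss([1.5, 0.7, -0.8], teacher, label=0))
    print(hinton_kd_loss([-1.0, 0.5, 2.0], teacher, label=0))
```

A student whose logits agree with the teacher gets a smaller loss than one that ranks the classes in the opposite order, which is the behaviour the soft targets are meant to encourage.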

Simultaneous KD (Proposed Baseline)

40% Imagenette dataset, ResNet20

python3 simultaneous_kd.py -d imagenette -m resnet20 -p 40 -e 100 -s 0

Stagewise KD (Proposed Method)

Full CIFAR10 dataset, ResNet10

python3 stagewise_kd.py -d cifar10 -m resnet10 -e 100 -s 0
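
When the same experiment has to be repeated for every data percentage, the commands above can be generated rather than typed by hand. The sketch below only assembles the command lines from the documented flags; the helper name build_command is hypothetical, and nothing is executed.

```python
import shlex


def build_command(script, dataset, model, epochs, seed, percentage=None):
    """Assemble one training command from the documented flags.

    `percentage=None` stands for a full-dataset run, so -p is left out.
    """
    cmd = ["python3", script, "-d", dataset, "-m", model]
    if percentage is not None:
        cmd += ["-p", str(percentage)]
    cmd += ["-e", str(epochs), "-s", str(seed)]
    return cmd


if __name__ == "__main__":
    # Full dataset first, then the four documented percentages.
    for pct in (None, 10, 20, 30, 40):
        cmd = build_command("stagewise_kd.py", "cifar10", "resnet10", 100, 0, pct)
        print(shlex.join(cmd))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the directory containing the training scripts to launch the run.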

Semantic Segmentation

Introduction

In this work, ResNet backbones are used to construct symmetric U-Nets for semantic segmentation. In particular, ResNet10, 14, 18, 20 and 26 serve as the backbones for the student networks and ResNet34 as the backbone for the teacher network. The dataset used is the Cambridge-driving Labeled Video Database (CamVid).

Preparation

  • The following script
    • downloads the data (and moves it to the appropriate folder)
    • saves 10%, 20%, 30% and 40% splits of the dataset separately
    • downloads the pretrained teacher weights to the appropriate folder
    # assuming you are in the root folder of the repository
    cd semantic_segmentation/scripts
    bash setup.sh
    

Experiments

For detailed information on the various experiments, refer to the paper. All the semantic segmentation experiments share the following training arguments, listed with the values they can take:

  • dataset (-d) : camvid
  • model (-m) : resnet10, resnet14, resnet18, resnet20, resnet26, resnet34
  • number of epochs (-e) : Integer is required
  • percentage of dataset (-p) : 10, 20, 30, 40 (don't use this argument at all for full dataset experiments)
  • random seed (-s) : Give any random seed (for reproducibility purposes)
  • gpu (-g) : Don't use unless training on CPU (in which case, use -g 'cpu' as the argument). On multi-GPU systems, set CUDA_VISIBLE_DEVICES=id in the environment before running the experiment (for example, export CUDA_VISIBLE_DEVICES=id), where id is the ID of your GPU according to the nvidia-smi output.
  • Comet ML API key (-a) (optional) : To track your experiments with Comet ML, either pass your API key as the argument or make it the default value in the arguments.py file. Otherwise, omit this argument.
  • Comet ML workspace (-w) (optional) : To track your experiments with Comet ML, either pass your workspace name as the argument or make it the default value in the arguments.py file. Otherwise, omit this argument.

Note: Currently, there are no plans for adding Attention Transfer KD and FSP KD experiments for semantic segmentation.

The following subsections give one example training command per experiment.

No Teacher

Full CamVid dataset, ResNet10

python3 pretrain.py -d camvid -m resnet10 -e 100 -s 0

Traditional KD (FitNets)

20% CamVid dataset, ResNet18

python3 traditional_kd.py -d camvid -m resnet18 -p 20 -e 100 -s 0

Simultaneous KD (Proposed Baseline)

40% CamVid dataset, ResNet20

python3 simultaneous_kd.py -d camvid -m resnet20 -p 40 -e 100 -s 0

Stagewise KD (Proposed Method)

10% CamVid dataset, ResNet10

python3 stagewise_kd.py -d camvid -m resnet10 -p 10 -e 100 -s 0

Citation

If you use this code or method in your work, please cite using

@misc{kulkarni2020data,
      title={Data Efficient Stagewise Knowledge Distillation}, 
      author={Akshay Kulkarni and Navid Panchi and Sharath Chandra Raparthy and Shital Chiddarwar},
      year={2020},
      eprint={1911.06786},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

Built by Akshay Kulkarni, Navid Panchi and Sharath Chandra Raparthy.
