Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

Overview

Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

This repository contains the source code for the paper (link will be posted).

Requirements

  • GPU
  • Python 3
  • PyTorch 1.9
    • Earlier version may work, but untested.
  • pip install -r requirements.txt
  • If running ResNet-21 or ImageNet experiments, first download and prepare the ImageNet 2012 dataset with bin/imagenet_prep.sh script.

Running

For non-ImageNet experiments, the main python file is main.py. To see its arguments:

python main.py --help

Running for the first time can take a little longer due to automatic downloading of the MNIST and Cifar-10 dataset from the Internet.

For ImageNet experiments, the main python files are main_imagenet_float.py and main_imagenet_binary.py. Too see their arguments:

python main_imagenet_float.py --help

and

python main_imagenet_binary.py --help

The ImageNet dataset must be already downloaded and prepared. Please see the requirements section for details.

Scripts

The main python file has many options. The following scripts runs training with hyper-parameters given in the paper. Output includes a run-log text file and tensorboard files. These files are saved to ./logs and reused for subsequent runs.

300-100-10

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/300/sensitivity/layer.sh sensitivity reverse 0.1 0

Output files and run-log are written to ./logs/mnist/val/sensitivity/.

Hyperparam search

For floating-point training:

# Learning rate 0.1.
./scripts/mnist/300/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/mnist/300/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/300/val/layer.sh hyperparam 312 0.1 0

Output files and run-log are written to ./logs/mnist/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/300/run/layer.sh full 312 0.1 316 0

Output files and run-log are written to ./logs/mnist/run/full/.

784-100-10

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity forward 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity 231 0.1 0
# Layer 3. Learning rate 0.1.
./scripts/mnist/784/sensitivity/layer.sh sensitivity reverse 0.1 0

Output files and run-log are written to ./logs/mnist/val/sensitivity/.

Hyperparam search

For floating-point training:

# Learning rate 0.1.
./scripts/mnist/784/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/mnist/784/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam forward 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam reverse 0.1 0
# 1, 3, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 132 0.1 0
# 2, 1, 3 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 213 0.1 0
# 2, 3, 1 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 231 0.1 0
# 3, 1, 2 order. Learning rate 0.1.
./scripts/mnist/784/val/layer.sh hyperparam 312 0.1 0

Output files and run-log are written to ./logs/mnist/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full forward 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full reverse 0.1 316 0
# 1, 3, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 132 0.1 316 0
# 2, 1, 3 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 213 0.1 316 0
# 2, 3, 1 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 231 0.1 316 0
# 3, 1, 2 order. Learning rate 0.1. Seed 316.
./scripts/mnist/784/run/layer.sh full 312 0.1 316 0

Output files and run-log are written to ./logs/mnist/run/full/.

Vgg-5

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg5/sensitivity/layer.sh sensitivity 5 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/cifar10/vgg5/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/cifar10/vgg5/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg5/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg5/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.

Vgg-9

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 2 0.1 0
# Layer 5. Learning rate 0.1.
./scripts/cifar10/vgg9/sensitivity/layer.sh sensitivity 5 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1.
./scripts/cifar10/vgg9/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1.
./scripts/cifar10/vgg9/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1.
./scripts/cifar10/vgg9/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/vgg9/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.

ResNet-20

Sensitivity Pre-training

# Layer 1. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 1 0.1 0
# Layer 2. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 2 0.1 0
# ...
# Layer 20. Learning rate 0.1.
./scripts/cifar10/resnet20/sensitivity/layer.sh sensitivity 20 0.1 0

Output files and run-log are written to ./logs/cifar10/val/sensitivity/.

Hyperparam Search

For floating-point training:

# Learning rate 0.1
./scripts/cifar10/resnet20/val/float.sh hyperparam 0.1 0

For full binary training:

# Learning rate 0.1
./scripts/cifar10/resnet20/val/binary.sh hyperparam 0.1 0

For iterative training:

# Forward order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam forward 0.1 0
# Ascend order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam ascend 0.1 0
# Reverse order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam reverse 0.1 0
# Descend order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam descend 0.1 0
# Random order. Learning rate 0.1
./scripts/cifar10/resnet20/val/layer.sh hyperparam random 0.1 0

Output files and run-log are written to ./logs/cifar10/val/hyperparam/.

Full Training

For floating-point training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/float.sh full 0.1 316 0

For full binary training:

# Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/binary.sh full 0.1 316 0

For iterative training:

# Forward order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full forward 0.1 316 0
# Ascend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full ascend 0.1 316 0
# Reverse order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full reverse 0.1 316 0
# Descend order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full descend 0.1 316 0
# Random order. Learning rate 0.1. Seed 316.
./scripts/cifar10/resnet20/run/layer.sh full random 0.1 316 0

Output files and run-log are written to ./logs/cifar10/run/full/.

ResNet-21

To run experiments for ResNet-21, first download and prepare the ImageNet dataset. See the requirements section at the beginning of this readme. We assume the dataset is prepared and is at ./imagenet.

Sensitivity Pre-training

# Layer 1. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 1 0.01
# Layer 2. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 2 0.01
# Layer 21. Learning rate 0.01.
./scripts/imagenet/layer.sh sensitivity ./imagenet 20 "[20]" 20 21 0.01

Output files and run-log are written to ./logs/imagenet/sensitivity/.

Full Training

For floating-point training:

# Learning rate 0.01.
./scripts/imagenet/float.sh full ./imagenet 67 "[42,57]" 0.01

For full binary training:

# Learning rate 0.01.
./scripts/imagenet/binary.sh full ./imagenet 67 "[42,57]" 0.01

For layer-by-layer training:

# Forward order
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 forward 0.01
# Ascending order
./scripts/imagenet/layer.sh full ./imagenet 67 "[42,57]" 2 ascend 0.01

For all scripts, output files and run-log are written to ./logs/imagenet/full/.

License

See LICENSE

Contributing

See the contributing guide for details of how to participate in development of the module.

Owner
Rakuten Group, Inc.
Rakuten Group, Inc.
MEDS: Enhancing Memory Error Detection for Large-Scale Applications

MEDS: Enhancing Memory Error Detection for Large-Scale Applications Prerequisites cmake and clang Build MEDS supporting compiler $ make Build Using Do

Secomp Lab at Purdue University 34 Dec 14, 2022
realsense d400 -> jpg + csv

Realsense-capture realsense d400 - jpg + csv Requirements RealSense sdk : Installation Python3 pyrealsense2 (RealSense SDK) Numpy OpenCV Tkinter Run

Ar-Ray 2 Mar 22, 2022
Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN", accepted to ACM MM 2021 BNI Track.

RecycleD Official PyTorch implementation of the paper "Recycling Discriminator: Towards Opinion-Unaware Image Quality Assessment Using Wasserstein GAN

Yunan Zhu 23 Nov 05, 2022
Automated image registration. Registrationimation was too much of a mouthful.

alignimation Automated image registration. Registrationimation was too much of a mouthful. This repo contains the code used for my blog post Alignimat

Ethan Rosenthal 9 Oct 13, 2022
MNIST, but with Bezier curves instead of pixels

bezier-mnist This is a work-in-progress vector version of the MNIST dataset. Samples Here are some samples from the training set. Note that, while the

Alex Nichol 15 Jan 16, 2022
Implementation of "Learning to Match Features with Seeded Graph Matching Network" ICCV2021

SGMNet Implementation PyTorch implementation of SGMNet for ICCV'21 paper "Learning to Match Features with Seeded Graph Matching Network", by Hongkai C

87 Dec 11, 2022
A simple interface for editing natural photos with generative neural networks.

Neural Photo Editor A simple interface for editing natural photos with generative neural networks. This repository contains code for the paper "Neural

Andy Brock 2.1k Dec 29, 2022
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private Server services

Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning

MaCan 4.2k Dec 29, 2022
A Python library for Deep Graph Networks

PyDGN Wiki Description This is a Python library to easily experiment with Deep Graph Networks (DGNs). It provides automatic management of data splitti

Federico Errica 194 Dec 22, 2022
Official implementation for Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder at NeurIPS 2020

Likelihood-Regret Official implementation of Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder at NeurIPS 2020. T

Xavier 33 Oct 12, 2022
Applying curriculum to meta-learning for few shot classification

Curriculum Meta-Learning for Few-shot Classification We propose an adaptation of the curriculum training framework, applicable to state-of-the-art met

Stergiadis Manos 3 Oct 25, 2022
Improving Object Detection by Label Assignment Distillation

Improving Object Detection by Label Assignment Distillation This is the official implementation of the WACV 2022 paper Improving Object Detection by L

Cybercore Co. Ltd 51 Dec 08, 2022
Modifications of the official PyTorch implementation of StyleGAN3. Let's easily generate images and videos with StyleGAN2/2-ADA/3!

Alias-Free Generative Adversarial Networks (StyleGAN3) Official PyTorch implementation of the NeurIPS 2021 paper Alias-Free Generative Adversarial Net

Diego Porres 185 Dec 24, 2022
A Python library for generating new text from existing samples.

ReMarkov is a Python library for generating text from existing samples using Markov chains. You can use it to customize all sorts of writing from birt

8 May 17, 2022
The offcial repository for 'CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos', SIGIR2022

CharacterBERT-DR The offcial repository for CharacterBERT and Self-Teaching for Improving the Robustness of Dense Retrievers on Queries with Typos, Sh

ielab 11 Nov 15, 2022
KakaoBrain KoGPT (Korean Generative Pre-trained Transformer)

KoGPT KoGPT (Korean Generative Pre-trained Transformer) https://github.com/kakaobrain/kogpt https://huggingface.co/kakaobrain/kogpt Model Descriptions

Kakao Brain 799 Dec 28, 2022
Randstad Artificial Intelligence Challenge (powered by VGEN). Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato

Randstad Artificial Intelligence Challenge (powered by VGEN) Soluzione proposta da Stefano Fiorucci (anakin87) - primo classificato Struttura director

Stefano Fiorucci 1 Nov 13, 2021
Boostcamp AI Tech 3rd / Basic Paper reading w.r.t Embedding

Boostcamp AI Tech 3rd : Basic Paper Reading w.r.t Embedding TL;DR 1992년부터 2018년도까지 이루어진 word/sentence embedding의 중요한 줄기를 이루는 기초 논문 스터디를 진행하고자 합니다. 논

Soyeon Kim 14 Nov 14, 2022
TRIQ implementation

TRIQ Implementation TF-Keras implementation of TRIQ as described in Transformer for Image Quality Assessment. Installation Clone this repository. Inst

Junyong You 115 Dec 30, 2022