Official PyTorch Implementation of Learning Architectures for Binary Networks

Learning Architectures for Binary Networks

A PyTorch implementation of the paper Learning Architectures for Binary Networks (BNAS) (ECCV 2020).
If you find any part of our code useful for your research, consider citing our paper.

@inproceedings{kimSC2020BNAS,
  title={Learning Architectures for Binary Networks},
  author={Dahyun Kim and Kunal Pratap Singh and Jonghyun Choi},
  booktitle={ECCV},
  year={2020}
}

Maintainer

Introduction

We present a method for searching architectures of networks with both binary weights and binary activations. Under the same binarization scheme, our searched architectures outperform binary networks whose architectures are taken from well-known floating-point networks.

Note: our searched architectures still achieve competitive results when compared to the state of the art without additional pretraining, new binarization schemes, or novel training methods.

Prerequisite - Docker Containers

We recommend using the Docker container below, as it provides a complete running environment. You will need to install nvidia-docker and its related packages following the instructions here.

Pull our image uploaded here using the following command.

$ docker pull killawhale/apex:latest

You can then create the container to run our code via the following command.

$ docker run --name [CONTAINER_NAME] --runtime=nvidia -it -v [HOST_FILE_DIR]:[CONTAINER_FILE_DIR] --shm-size 16g killawhale/apex:latest bash
  • [CONTAINER_NAME]: the name of the created container
  • [HOST_FILE_DIR]: the path of the directory on the host machine which you want to sync your container with
  • [CONTAINER_FILE_DIR]: the name in which the synced directory will appear inside the container

Note: If you prefer not to use the Docker container, our environment is PyTorch 1.2, torchvision 0.5, Python 3.6, CUDA 10.0, and Apex 0.1. You can also refer to the provided requirements.txt.
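
If you set up the environment by hand, a quick check against the versions listed in the note can catch mismatches early. The sketch below compares version strings by prefix. The EXPECTED table and the check_versions helper are illustrative names and are not part of the repository; Apex and CUDA are left out because this sketch makes no assumption about how to query them.

```python
import sys

# Versions quoted in the note; the dict itself is illustrative.
EXPECTED = {"python": "3.6", "torch": "1.2", "torchvision": "0.5"}


def matches(installed, expected):
    """True if `installed` equals `expected` or extends it (1.2.0 matches 1.2)."""
    return installed == expected or installed.startswith(expected + ".")


def check_versions(installed, expected=EXPECTED):
    """Return {name: (expected, found)} for every entry that does not match."""
    mismatches = {}
    for name, want in expected.items():
        found = installed.get(name)
        if found is None or not matches(found, want):
            mismatches[name] = (want, found)
    return mismatches


if __name__ == "__main__":
    found = {"python": "{}.{}".format(*sys.version_info[:2])}
    for module in ("torch", "torchvision"):
        try:
            found[module] = __import__(module).__version__
        except ImportError:
            pass  # a missing package is reported as found=None below
    for name, (want, got) in check_versions(found).items():
        print("{}: expected {}, found {}".format(name, want, got))
```

An empty result means the three checked versions agree with the note. Newer versions may still work, but the source only states the versions above.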

Dataset Preparation

CIFAR10

For CIFAR10, we're using CIFAR10 provided by torchvision. Run the following command to download it.

$ python src/download_cifar10.py

This will create a directory named data and download the dataset in it.

ImageNet

For ImageNet, please follow the instructions below.

  1. Download the training set for the ImageNet dataset.
$ wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar

This may take a long time depending on your internet connection.

  2. Download the validation set for the ImageNet dataset.
$ wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar

  3. Make a directory that will contain the ImageNet dataset and move the downloaded .tar files into it.

  4. Extract the training set and organize it into per-category directories.

$ mkdir train && mv ILSVRC2012_img_train.tar train/ && cd train
$ tar -xvf ILSVRC2012_img_train.tar && rm -f ILSVRC2012_img_train.tar
$ find . -name "*.tar" | while read NAME ; do mkdir -p "${NAME%.tar}"; tar -xvf "${NAME}" -C "${NAME%.tar}"; rm -f "${NAME}"; done
$ cd ..

  5. Do the same for the validation set.
$ mkdir val && mv ILSVRC2012_img_val.tar val/ && cd val && tar -xvf ILSVRC2012_img_val.tar
$ wget -qO- https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh | bash
$ rm -rf ILSVRC2012_img_val.tar

  6. Change the synset ids to integer labels.
$ python src/prepare_imgnet.py [PATH_TO_IMAGENET]
  • [PATH_TO_IMAGENET]: the path to the directory which has the ImageNet dataset

You can optionally run the following if you only prepared the validation set.

$ python src/prepare_imgnet.py [PATH_TO_IMAGENET] --val_only
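
The label conversion step can be pictured as follows. This is only a sketch of the general idea, assuming each class directory is named by its synset id and that labels are assigned in sorted order. It is not the code in src/prepare_imgnet.py, and build_label_map is a hypothetical helper.

```python
import os
import tempfile


def build_label_map(split_dir):
    """Map each class sub-directory (a synset id such as 'n01440764') to an integer.

    Assumption: labels follow the sorted order of the directory names.
    Plain files inside `split_dir` are ignored.
    """
    synsets = sorted(
        name for name in os.listdir(split_dir)
        if os.path.isdir(os.path.join(split_dir, name))
    )
    return {synset: index for index, synset in enumerate(synsets)}


if __name__ == "__main__":
    # Demonstrate on a throwaway directory with three example class folders.
    with tempfile.TemporaryDirectory() as root:
        for synset in ("n01443537", "n01440764", "n01484850"):
            os.mkdir(os.path.join(root, synset))
        for synset, label in build_label_map(root).items():
            print(synset, label)
```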

Inference with Pre-Trained Models

To reproduce the results reported in the paper, you can use the pretrained models provided here.

Note: For CIFAR10 we only share BNAS-A, as training other configurations for CIFAR10 does not take much time. For ImageNet, we currently provide all the models (BNAS-D,E,F,G,H).

For running validation on CIFAR10 using our pretrained models, use the following command.

$ CUDA_VISIBLE_DEVICES=0,1 python -W ignore src/test.py --path_to_weights [PATH_TO_WEIGHTS] --arch latest_cell_zeroise --parallel
  • [PATH_TO_WEIGHTS]: the path to the downloaded pretrained weights (for CIFAR10, it's BNAS-A)

For running validation on ImageNet using our pretrained models, use the following command.

$ CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 src/test_imagenet.py --data [PATH_TO_IMAGENET] --path_to_weights [PATH_TO_WEIGHTS] --model_config [MODEL_CONFIG]
  • [PATH_TO_IMAGENET]: the path to the directory which has the ImageNet dataset
  • [PATH_TO_WEIGHTS]: the path to the downloaded pretrained weights
  • [MODEL_CONFIG]: the model to run the validation with. Can be one of 'bnas-d', 'bnas-e', 'bnas-f', 'bnas-g', or 'bnas-h'

Expected result:

Model    Reported Top-1 (%)   Reported Top-5 (%)   Reproduced Top-1 (%)   Reproduced Top-5 (%)
BNAS-A   92.70                -                    ~92.39                 -
BNAS-D   57.69                79.89                ~57.60                 ~80.00
BNAS-E   58.76                80.61                ~58.15                 ~80.16
BNAS-F   58.99                80.85                ~58.89                 ~80.87
BNAS-G   59.81                81.61                ~59.39                 ~81.03
BNAS-H   63.51                83.91                ~63.70                 ~84.49

You can click on the link at each model name to download the corresponding model weights.

Note: the provided pretrained weights were trained with Apex and with batch sizes different from those reported in the paper (due to computational resource constraints), so the inference results may differ slightly from the results reported in the paper.

More comparisons with state-of-the-art binary networks are in the paper.
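
For reference, Top-1 and Top-5 in the table are the percentage of validation images whose true label is among the model's 1 or 5 highest-scoring classes. A minimal pure-Python sketch of that metric follows. The function name and the toy scores are illustrative, and the repository's own evaluation code may compute it differently.

```python
def topk_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest to lowest score, truncated to k.
        ranked = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in ranked
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    # Three samples, four classes; toy numbers for illustration only.
    scores = [
        [0.1, 0.7, 0.1, 0.1],  # top prediction is class 1
        [0.4, 0.3, 0.2, 0.1],  # top prediction is class 0, class 1 is second
        [0.1, 0.2, 0.3, 0.4],  # top prediction is class 3
    ]
    labels = [1, 1, 0]
    print("top-1: {:.2f}".format(topk_accuracy(scores, labels, k=1)))
    print("top-2: {:.2f}".format(topk_accuracy(scores, labels, k=2)))
```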

Searching Architectures

To search architectures, use the following command.

$ CUDA_VISIBLE_DEVICES=0 python -W ignore src/search.py --save [ARCH_NAME]
  • [ARCH_NAME]: the name of the searched architecture

This will automatically append the searched architecture to the genotypes.py file. Note that two genotypes will be appended, one for CIFAR10 and one for ImageNet. The validation accuracy at the end of the search is not indicative of the final performance of the searched architecture. To obtain the final performance, you must train the final architecture from scratch as described next.
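
The source does not show what an entry in genotypes.py looks like. As a rough picture only, the sketch below assumes a DARTS-style named tuple of (operation, input index) pairs and appends it to a file as one line of Python source. The Genotype fields, the append_genotype helper, and the operation names op_a and op_b are all hypothetical placeholders, so check the repository's genotypes.py for the real layout.

```python
import os
import tempfile
from collections import namedtuple

# Assumption: a DARTS-style record; the fields used by the repository may differ.
Genotype = namedtuple("Genotype", "normal normal_concat reduce reduce_concat")


def append_genotype(path, name, genotype):
    """Append `name = Genotype(...)` as one line of Python source to `path`."""
    with open(path, "a") as handle:
        handle.write("{} = {!r}\n".format(name, genotype))


if __name__ == "__main__":
    # Placeholder cell: each entry is (operation name, index of the input node).
    cell = Genotype(
        normal=[("op_a", 0), ("op_b", 1)],
        normal_concat=[2, 3],
        reduce=[("op_b", 0), ("op_a", 1)],
        reduce_concat=[2, 3],
    )
    path = os.path.join(tempfile.mkdtemp(), "genotypes_demo.py")
    append_genotype(path, "my_cell", cell)
    with open(path) as handle:
        print(handle.read())
```

Writing the entry as source text means a training script can later look it up by name, which is how the --arch flag is described as being used in the next section.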

Figure: An example of normal (left) and reduction (right) cells searched by BNAS.

Training Searched Architectures from scratch

To train our best searched cell on CIFAR10, use the following command.

$ CUDA_VISIBLE_DEVICES=0,1 python -W ignore src/train.py --learning_rate 0.05 --save [SAVE_NAME] --arch latest_cell_zeroise --parallel 
  • [SAVE_NAME]: experiment name

The validation accuracy is logged at every epoch, as shown below, so there is no need to run separate inference code.

2019-12-29 11:47:42,166 args = Namespace(arch='latest_cell_zeroise', batch_size=256, data='../data', drop_path_prob=0.2, epochs=600, init_channels=36, layers=20, learning_rate=0.05, model_path='saved_models', momentum=0.9, num_skip=1, parallel=True, report_freq=50, save='eval-latest_cell_zeroise_train_repro_0.05', seed=0, weight_decay=3e-06)
2019-12-29 11:47:46,893 param size = 5.578252MB
2019-12-29 11:47:48,654 epoch 0 lr 2.500000e-03
2019-12-29 11:47:55,462 train 000 2.623852e+00 9.375000 57.812500
2019-12-29 11:48:34,952 train 050 2.103856e+00 22.533701 74.180453
2019-12-29 11:49:14,118 train 100 1.943232e+00 27.440439 80.186417
2019-12-29 11:49:53,748 train 150 1.867823e+00 30.114342 82.512417
2019-12-29 11:50:29,680 train_acc 32.170000
2019-12-29 11:50:30,057 valid 000 1.792161e+00 30.859375 88.671875
2019-12-29 11:50:34,032 valid_acc 38.350000
2019-12-29 11:50:34,101 epoch 1 lr 2.675926e-03
2019-12-29 11:50:35,476 train 000 1.551331e+00 40.234375 92.187500
2019-12-29 11:51:15,773 train 050 1.572010e+00 42.256434 90.502451
2019-12-29 11:51:55,991 train 100 1.539024e+00 43.181467 90.976949
2019-12-29 11:52:36,345 train 150 1.515295e+00 44.264797 91.395902
2019-12-29 11:53:12,128 train_acc 45.016000
2019-12-29 11:53:12,487 valid 000 1.419507e+00 46.484375 93.359375
2019-12-29 11:53:16,366 valid_acc 48.640000

You should expect around 92% validation accuracy with our best searched cell once training finishes at 600 epochs. To train a custom architecture, add it to the genotypes.py file and pass its name to the --arch flag. You can also change the number of stacked cells and the number of initial channels with the --layers and --init_channels options, respectively.

Different architectures lead to different computational costs, and the default hyperparameters (such as the learning rate schedule) may not be optimal for them. Thus, you should expect the final validation accuracy on CIFAR10 to vary by around 1~1.5%.
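
To follow the per-epoch validation accuracy without reading the whole log, the lines can be parsed with a couple of regular expressions. This is a sketch based only on the log format shown in the excerpts in this README. The function name is illustrative, and where the training script stores its log file is not stated here, so the example takes the lines as input.

```python
import re

EPOCH_RE = re.compile(r"\bepoch (\d+)\b")
# Matches 'valid_acc 38.35' (CIFAR10 log) and 'valid_acc_top1 11.71' (ImageNet log).
VALID_RE = re.compile(r"\bvalid_acc(?:_top1)? ([0-9.]+)")


def validation_curve(lines):
    """Return [(epoch, top-1 validation accuracy)] in log order."""
    curve, epoch = [], None
    for line in lines:
        match = EPOCH_RE.search(line)
        if match:
            epoch = int(match.group(1))
            continue
        match = VALID_RE.search(line)
        if match and epoch is not None:
            curve.append((epoch, float(match.group(1))))
    return curve


if __name__ == "__main__":
    # Lines copied from the CIFAR10 log excerpt.
    sample = [
        "2019-12-29 11:47:48,654 epoch 0 lr 2.500000e-03",
        "2019-12-29 11:50:29,680 train_acc 32.170000",
        "2019-12-29 11:50:34,032 valid_acc 38.350000",
        "2019-12-29 11:50:34,101 epoch 1 lr 2.675926e-03",
        "2019-12-29 11:53:16,366 valid_acc 48.640000",
    ]
    for epoch, acc in validation_curve(sample):
        print(epoch, acc)
```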

To train our best searched cell on ImageNet, use the following command.

$ CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 src/train_imagenet.py --data [PATH_TO_IMAGENET] --arch latest_cell --model_config [MODEL_CONFIG] --save [SAVE_NAME]
  • [PATH_TO_IMAGENET]: the path to the directory which has the ImageNet dataset
  • [MODEL_CONFIG]: the model to train. Can be one of 'bnas-d', 'bnas-e', 'bnas-f', 'bnas-g', or 'bnas-h'
  • [SAVE_NAME]: experiment name

The validation accuracy is logged at every epoch, as shown below, so there is no need to run separate inference code.

2020-03-25 09:53:44,578 args = Namespace(arch='latest_cell', batch_size=256, class_size=1000, data='../../darts-norm/cnn/Imagenet/', distributed=True, drop_path_prob=0, epochs=250, gpu=0, grad_clip=5.0, init_channels=68, keep_batchnorm_fp32=None, label_smooth=0.1, layers=15, learning_rate=0.05, local_rank=0, loss_scale=None, momentum=0.9, num_skip=3, opt_level='O0', report_freq=100, resume=None, save='eval-bnas_f_retrain', seed=0, start_epoch=0, weight_decay=3e-05, world_size=4)
2020-03-25 09:53:50,926 no of parameters 39781442.000000
2020-03-25 09:53:56,444 epoch 0
2020-03-25 09:54:06,220 train 000 7.844889e+00 0.000000 0.097656
2020-03-25 10:00:01,717 train 100 7.059666e+00 0.315207 1.382658
2020-03-25 10:06:09,138 train 200 6.909059e+00 0.498484 2.070215
2020-03-25 10:12:21,804 train 300 6.750810e+00 0.838027 3.157119
2020-03-25 10:18:30,815 train 400 6.627304e+00 1.203534 4.369691
2020-03-25 10:24:37,901 train 500 6.526508e+00 1.601679 5.519625
2020-03-25 10:30:44,522 train 600 6.439230e+00 2.016983 6.666298
2020-03-25 10:36:50,776 train 700 6.360960e+00 2.424132 7.778648
2020-03-25 10:42:58,087 train 800 6.291507e+00 2.830446 8.824784
2020-03-25 10:49:04,204 train 900 6.228209e+00 3.251162 9.829681
2020-03-25 10:55:12,315 train 1000 6.167705e+00 3.673670 10.844819
2020-03-25 11:01:18,095 train 1100 6.112888e+00 4.080009 11.778710
2020-03-25 11:07:23,347 train 1200 6.060676e+00 4.500969 12.712388
2020-03-25 11:10:30,048 train_acc 4.697276
2020-03-25 11:10:33,504 valid 000 4.754593e+00 10.839844 27.832031
2020-03-25 11:11:03,920 valid_acc_top1 11.714000
2020-03-25 11:11:03,920 valid_acc_top5 28.784000

You should expect similar accuracy to our pretrained models once the training is finished at 250 epochs.
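
The timestamps in the log excerpt also give a rough idea of the total training time. The sketch below measures the first epoch from its start line to its last validation line and extrapolates to 250 epochs, under the assumption (not stated in the source) that every epoch takes about as long as the first. The helper names are illustrative.

```python
from datetime import datetime

STAMP = "%Y-%m-%d %H:%M:%S,%f"  # timestamp layout used in the log excerpt


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps."""
    delta = datetime.strptime(end, STAMP) - datetime.strptime(start, STAMP)
    return delta.total_seconds()


def estimate_days(epoch_seconds, epochs):
    """Total wall-clock days, assuming every epoch takes `epoch_seconds`."""
    return epoch_seconds * epochs / 86400.0


if __name__ == "__main__":
    # Start of epoch 0 and its final validation line, copied from the log excerpt.
    first_epoch = seconds_between("2020-03-25 09:53:56,444", "2020-03-25 11:11:03,920")
    print("epoch 0 took {:.0f} s".format(first_epoch))
    print("250 epochs: about {:.1f} days".format(estimate_days(first_epoch, 250)))
```

For the excerpt this works out to a little over 13 days for the four-process run shown in the log. The first epoch may include one-off startup costs, so treat the figure as a rough guide only.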

To train a custom architecture, add it to the genotypes.py file and pass its name to the --arch flag. You can also change the number of stacked cells and the number of initial channels with the --layers and --init_channels options, respectively.

Different architectures lead to different computational costs, and the default hyperparameters (such as the learning rate schedule) may not be optimal for them. Thus, you should expect the final validation accuracy on ImageNet to vary by around 0.2%.

Note: we ran our experiments with at least two NVIDIA V100s. For running on a single GPU, omit the --parallel flag and specify the GPU id using the CUDA_VISIBLE_DEVICES environment variable in the command line.

Comments
  • TypeError: forward() missing 1 required positional argument: 'x'

    Hi! Thanks for your excellent contribution to the community! I ran into a problem when I tried to simply run your code (search or inference). It raised "TypeError: forward() missing 1 required positional argument: 'x'". The problem is probably in model_search.py, line 136 and line 124. Can you take a look at it?

    opened by lawsonX 1
  • Something wrong about selecting which convs to be binarized

    In bin_utils_search.py, lines 24-30, there is something wrong with how the convolution kernels to be binarized are selected. The right way is to choose the 8 * 14 * 4 = 448 convolutions to be binarized (8 is the number of cells, 14 is the number of lines in a cell, 4 is the number of options for different convolutions on one line). However, the code chooses the first 448 convolutions to be binarized (including res, preprocess0, preprocess1). The cause is that n here does nothing to distinguish them. I look forward to your reply!

    opened by DamonAtSjtu 1
  • Calculating FLOPs for a Binary/XNOR-Net

    Hello, thanks for providing amazing support to the research community. Your code is very helpful. I was reading your paper and saw that you report FLOPs versus accuracy. However, when I looked at your code, I struggled to find the place where the FLOPs are calculated. Would you be happy to share how you calculated the FLOPs count for the binary/XNOR-Net? Any help with this would be much appreciated.

    Kind Regards, Mohaimen

    opened by mohaimenz 0
Releases(v1.0)
  • v1.0(Aug 12, 2020)

    First release of the official PyTorch implementation of our paper Learning Architectures for Binary Networks (https://arxiv.org/abs/2002.06963)

Owner
Computer Vision Lab. @ GIST
Some useful codes for computer vision and machine learning.