[ICLR 2021] HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

Overview

HW-NAS-Bench: Hardware-Aware Neural Architecture Search Benchmark

Accepted as a spotlight paper at ICLR 2021.

Table of content

File structure

.
├── hw_nas_bench_api # HW-NAS-Bench API
│   ├── fbnet_models # FBNet's space
│   └── nas_201_models # NAS-Bench-201's space
│       ├── cell_infers
│       ├── cell_searchs
│       ├── config_utils
│       ├── shape_infers
│       └── shape_searchs
└── nas_201_api # NAS-Bench-201 API

Prerequisites

The code has the following dependencies:

  • python >= 3.6.10
  • pytorch >= 1.2.0
  • numpy >= 1.18.5

Preparation and download

No addtional file needs to be downloaded, our HW-NAS-Bench dataset has been included in this repository.

[Optional] If you want to use NAS-Bench-201 to access information about the architectures' accuracy and loss, please follow the NAS-Bench-201 guide, and download the NAS-Bench-201-v1_1-096897.pth.

How to use HW-NAS-Bench

More usage can be found in our jupyter notebook example

  1. Create an API instance from a file:
from hw_nas_bench_api import HWNASBenchAPI as HWAPI
hw_api = HWAPI("HW-NAS-Bench-v1_0.pickle", search_space="nasbench201")
  1. Show the real measured/estimated hardware-cost in different datasets:
# Example to get all the hardware metrics in the No.0,1,2 architectures under NAS-Bench-201's Space
for idx in range(3):
    for dataset in ["cifar10", "cifar100", "ImageNet16-120"]:
        HW_metrics = hw_api.query_by_index(idx, dataset)
        print("The HW_metrics (type: {}) for No.{} @ {} under NAS-Bench-201: {}".format(type(HW_metrics),

Corresponding printed information:

===> Example to get all the hardware metrics in the No.0,1,2 architectures under NAS-Bench-201's Space
The HW_metrics (type: <class 'dict'>) for No.0 @ cifar10 under NAS-Bench-201: {'edgegpu_latency': 5.807418537139893, 'edgegpu_energy': 24.226614330768584, 'raspi4_latency': 10.481976820010459, 'edgetpu_latency': 0.9571811309997429, 'pixel3_latency': 3.6058499999999998, 'eyeriss_latency': 3.645620000000001, 'eyeriss_energy': 0.6872827644999999, 'fpga_latency': 2.57296, 'fpga_energy': 18.01072}
...
  1. Show the real measured/estimated hardware-cost for a single architecture:
# Example to get use the hardware metrics in the No.0 architectures in CIFAR-10 under NAS-Bench-201's Space
print("===> Example to get use the hardware metrics in the No.0 architectures in CIFAR-10 under NAS-Bench-201's Space")
HW_metrics = hw_api.query_by_index(0, "cifar10")
for k in HW_metrics:
    if "latency" in k:
        unit = "ms"
    else:
        unit = "mJ"
    print("{}: {} ({})".format(k, HW_metrics[k], unit))

Corresponding printed information:

===> Example to get use the hardware metrics in the No.0 architectures in CIFAR-10 under NAS-Bench-201's Space
edgegpu_latency: 5.807418537139893 (ms)
edgegpu_energy: 24.226614330768584 (mJ)
raspi4_latency: 10.481976820010459 (ms)
edgetpu_latency: 0.9571811309997429 (ms)
pixel3_latency: 3.6058499999999998 (ms)
eyeriss_latency: 3.645620000000001 (ms)
eyeriss_energy: 0.6872827644999999 (mJ)
fpga_latency: 2.57296 (ms)
fpga_energy: 18.01072 (mJ)
  1. Create the network from api:
# Create the network
config = hw_api.get_net_config(0, "cifar10")
print(config)
from hw_nas_bench_api.nas_201_models import get_cell_based_tiny_net
network = get_cell_based_tiny_net(config) # create the network from configurration
print(network) # show the structure of this architecture

Corresponding printed information:

{'name': 'infer.tiny', 'C': 16, 'N': 5, 'arch_str': '|avg_pool_3x3~0|+|nor_conv_1x1~0|skip_connect~1|+|nor_conv_1x1~0|skip_connect~1|skip_connect~2|', 'num_classes': 10}
TinyNetwork(
  TinyNetwork(C=16, N=5, L=17)
  (stem): Sequential(
    (0): Conv2d(3, 16, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  )
  (cells): ModuleList(
    (0): InferCell(
      info :: nodes=4, inC=16, outC=16, [1<-(I0-L0) | 2<-(I0-L1,I1-L2) | 3<-(I0-L3,I1-L4,I2-L5)], |avg_pool_3x3~0|+|nor_conv_1x1~0|skip_connect~1|+|nor_conv_1x1~0|skip_connect~1|skip_connect~2|
      (layers): ModuleList(
        (0): POOLING(
          (op): AvgPool2d(kernel_size=3, stride=1, padding=1)
        )
        (1): ReLUConvBN(
...

Misc

Part of the devices used in HW-NAS-Bench:

Part of the devices used in HW-NAS-Bench

Acknowledgment

Owner
Efficient and Intelligent Computing Lab
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning

Awesome production machine learning This repository contains a curated list of awesome open source libraries that will help you deploy, monitor, versi

The Institute for Ethical Machine Learning 12.9k Jan 04, 2023
High-resolution networks and Segmentation Transformer for Semantic Segmentation

High-resolution networks and Segmentation Transformer for Semantic Segmentation Branches This is the implementation for HRNet + OCR. The PyTroch 1.1 v

HRNet 2.8k Jan 07, 2023
A PyTorch implementation of SIN: Superpixel Interpolation Network

SIN: Superpixel Interpolation Network This is is a PyTorch implementation of the superpixel segmentation network introduced in our PRICAI-2021 paper:

6 Sep 28, 2022
A PyTorch Extension: Tools for easy mixed precision and distributed training in Pytorch

This repository holds NVIDIA-maintained utilities to streamline mixed precision and distributed training in Pytorch. Some of the code here will be included in upstream Pytorch eventually. The intenti

NVIDIA Corporation 6.9k Jan 03, 2023
The 2nd place solution of 2021 google landmark retrieval on kaggle.

Google_Landmark_Retrieval_2021_2nd_Place_Solution The 2nd place solution of 2021 google landmark retrieval on kaggle. Environment We use cuda 11.1/pyt

229 Dec 13, 2022
Official Pytorch implementation for 2021 ICCV paper "Learning Motion Priors for 4D Human Body Capture in 3D Scenes" and trained models / data

Learning Motion Priors for 4D Human Body Capture in 3D Scenes (LEMO) Official Pytorch implementation for 2021 ICCV (oral) paper "Learning Motion Prior

165 Dec 19, 2022
Fuse radar and camera for detection

SAF-FCOS: Spatial Attention Fusion for Obstacle Detection using MmWave Radar and Vision Sensor This project hosts the code for implementing the SAF-FC

ChangShuo 18 Jan 01, 2023
3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans.

3DMV 3DMV jointly combines RGB color and geometric information to perform 3D semantic segmentation of RGB-D scans. This work is based on our ECCV'18 p

Владислав Молодцов 0 Feb 06, 2022
Efficient 3D Backbone Network for Temporal Modeling

VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and

102 Dec 06, 2022
Non-stationary GP package written from scratch in PyTorch

NSGP-Torch Examples gpytorch model with skgpytorch # Import packages import torch from regdata import NonStat2D from gpytorch.kernels import RBFKernel

Zeel B Patel 1 Mar 06, 2022
Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection

Effect of Deep Transfer and Multi task Learning on Sperm Abnormality Detection Introduction This repository includes codes and models of "Effect of De

Amir Abbasi 5 Sep 05, 2022
Pytorch Implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension)

DiffSinger - PyTorch Implementation PyTorch implementation of DiffSinger: Diffusion Acoustic Model for Singing Voice Synthesis (TTS Extension). Status

Keon Lee 152 Jan 02, 2023
TensorLight - A high-level framework for TensorFlow

TensorLight is a high-level framework for TensorFlow-based machine intelligence applications. It reduces boilerplate code and enables advanced feature

Benjamin Kan 10 Jul 31, 2022
HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands

HALO: A Skeleton-Driven Neural Occupancy Representation for Articulated Hands Oral Presentation, 3DV 2021 Korrawe Karunratanakul, Adrian Spurr, Zicong

Korrawe Karunratanakul 43 Oct 07, 2022
This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters.

openmc-plasma-source This python-based package offers a way of creating a parametric OpenMC plasma source from plasma parameters. The OpenMC sources a

Fusion Energy 10 Oct 18, 2022
Just Randoms Cats with python

Random-Cat Just Randoms Cats with python.

OriCode 2 Dec 21, 2021
Lightweight plotting to the terminal. 4x resolution via Unicode.

Uniplot Lightweight plotting to the terminal. 4x resolution via Unicode. When working with production data science code it can be handy to have plotti

Olav Stetter 203 Dec 29, 2022
Pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model'

RTK-PAD This is an official pytorch implementation of 'Fingerprint Presentation Attack Detector Using Global-Local Model', which is accepted by IEEE T

6 Aug 01, 2022
A Keras implementation of YOLOv3 (Tensorflow backend)

keras-yolo3 Introduction A Keras implementation of YOLOv3 (Tensorflow backend) inspired by allanzelener/YAD2K. Quick Start Download YOLOv3 weights fro

7.1k Jan 03, 2023
DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe.

DeepLab Introduction DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe. It combines densely-compute

Ali 234 Nov 14, 2022