code for "AttentiveNAS Improving Neural Architecture Search via Attentive Sampling"

Overview

AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

This repository contains our PyTorch training code, evaluation code and pretrained models for AttentiveNAS.

[Update 06/21] Recenty, we have improved AttentiveNAS using an adaptive knowledge distillation training strategy, see our AlphaNet repo for more details of this work. AlphaNet has been accepted by ICML'21.

[Update 07/21] We provide an example code for searching the best models of FLOPs vs. accuracy trade-offs at here.

For more details, please see AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling by Dilin Wang, Meng Li, Chengyue Gong and Vikas Chandra.

If you find this repo useful in your research, please consider citing our work:

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Evaluation

To reproduce our results:

  • Please first download our pretrained AttentiveNAS models from a Google Drive path and put the pretrained models under your local folder ./attentive_nas_data

  • To evaluate our pre-trained AttentiveNAS models, from AttentiveNAS-A0 to A6, on ImageNet with a single GPU, please run:

    python test_attentive_nas.py --config-file ./configs/eval_attentive_nas_models.yml --model a[0-6]

    Expected results:

    Name MFLOPs Top-1 (%)
    AttentiveNAS-A0 203 77.3
    AttentiveNAS-A1 279 78.4
    AttentiveNAS-A2 317 78.8
    AttentiveNAS-A3 357 79.1
    AttentiveNAS-A4 444 79.8
    AttentiveNAS-A5 491 80.1
    AttentiveNAS-A6 709 80.7

Training

To train our AttentiveNAS models from scratch, please run

python train_attentive_nas.py --config-file configs/train_attentive_nas_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_attentive_nas_models.yml.

Additional data

  • A (sub-network config, FLOPs) lookup table could be used for constructing the architecture distribution under FLOPs-constraints.
  • A accuracy predictor trained via scikit-learn, which takes a subnetwork configuration as input, and outputs its predicted accuracy on ImageNet.
    • Convert a subnetwork configuration to our accuracy predictor compatibale inputs:
        res = [cfg['resolution']]
        for k in ['width', 'depth', 'kernel_size', 'expand_ratio']:
            res += cfg[k]
        input = np.asarray(res).reshape((1, -1))
    

License

The majority of AttentiveNAS is licensed under CC-BY-NC, however portions of the project are available under separate license terms: Once For All is licensed under the Apache 2.0 license.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Owner
Facebook Research
Facebook Research
ShapeGlot: Learning Language for Shape Differentiation

ShapeGlot: Learning Language for Shape Differentiation Created by Panos Achlioptas, Judy Fan, Robert X.D. Hawkins, Noah D. Goodman, Leonidas J. Guibas

Panos 32 Dec 23, 2022
Semantically Contrastive Learning for Low-light Image Enhancement

Semantically Contrastive Learning for Low-light Image Enhancement Here, we propose an effective semantically contrastive learning paradigm for Low-lig

48 Dec 16, 2022
GeneralOCR is open source Optical Character Recognition based on PyTorch.

Introduction GeneralOCR is open source Optical Character Recognition based on PyTorch. It makes a fidelity and useful tool to implement SOTA models on

57 Dec 29, 2022
Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[TensorFlow] Protein Interface Prediction using Graph Convolutional Networks Unofficial TensorFlow implementation of Protein Interface Prediction usin

YeongHyeon Park 9 Oct 25, 2022
Easy-to-use library to boost AI inference leveraging state-of-the-art optimization techniques.

NEW RELEASE How Nebullvm Works • Tutorials • Benchmarks • Installation • Get Started • Optimization Examples Discord | Website | LinkedIn | Twitter Ne

Nebuly 1.7k Dec 31, 2022
The code of "Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer".

Code data_preprocess.py: preprocess data for Dependent-T5. parameters.py: define parameters of Dependent-T5. train_tools.py: traning and evaluation co

1 Apr 21, 2022
Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring

Complete system for facial identity system. Include one-shot model, database operation, features visualization, monitoring

2 Dec 28, 2021
Chinese clinical named entity recognition using pre-trained BERT model

Chinese clinical named entity recognition (CNER) using pre-trained BERT model Introduction Code for paper Chinese clinical named entity recognition wi

Xiangyang Li 109 Dec 14, 2022
Implementation of Monocular Direct Sparse Localization in a Prior 3D Surfel Map (DSL)

DSL Project page: https://sites.google.com/view/dsl-ram-lab/ Monocular Direct Sparse Localization in a Prior 3D Surfel Map Authors: Haoyang Ye, Huaiya

Haoyang Ye 93 Nov 30, 2022
Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation

DistMIS Distributing Deep Learning Hyperparameter Tuning for 3D Medical Image Segmentation. DistriMIS Distributing Deep Learning Hyperparameter Tuning

HiEST 2 Sep 09, 2022
PyTorch code of my WACV 2022 paper Improving Model Generalization by Agreement of Learned Representations from Data Augmentation

Improving Model Generalization by Agreement of Learned Representations from Data Augmentation (WACV 2022) Paper ArXiv Why it matters? When data augmen

Rowel Atienza 5 Mar 04, 2022
[NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning

SoCo [NeurIPS 2021 Spotlight] Aligning Pretraining for Detection via Object-Level Contrastive Learning By Fangyun Wei*, Yue Gao*, Zhirong Wu, Han Hu,

Yue Gao 139 Dec 14, 2022
Ensembling Off-the-shelf Models for GAN Training

Data-Efficient GANs with DiffAugment project | paper | datasets | video | slides Generated using only 100 images of Obama, grumpy cats, pandas, the Br

MIT HAN Lab 1.2k Dec 26, 2022
Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

Recursive-NeRF: An Efficient and Dynamically Growing NeRF This is a Jittor implementation of Recursive-NeRF: An Efficient and Dynamically Growing NeRF

33 Nov 30, 2022
Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021)

L1-Refinement Paddle implementation for "Cross-Lingual Word Embedding Refinement by ℓ1 Norm Optimisation" (NAACL 2021) 🙈 A more detailed readme is co

Lincedo Lab 4 Jun 09, 2021
Breast-Cancer-Prediction

Breast-Cancer-Prediction Trying to predict whether the cancer is benign or malignant using REGRESSION MODELS in Python. Team Members NAME ROLL-NUMBER

Shyamdev Krishnan J 3 Feb 18, 2022
Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

CaFM-pytorch ICCV ACCEPT Introduction of dataset VSD4K Our dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

96 Jul 05, 2022
✔️ Visual, reactive testing library for Julia. Time machine included.

PlutoTest.jl (alpha release) Visual, reactive testing library for Julia A macro @test that you can use to verify your code's correctness. But instead

Pluto 68 Dec 20, 2022
An auto discord account and token generator. Automatically verifies the phone number. Works without proxy. Bypasses captcha.

JOIN DISCORD SERVER https://discord.gg/uAc3agBY FREE HCAPTCHA SOLVING API Discord-Token-Gen An auto discord token generator. Auto verifies phone numbe

3kp 271 Jan 01, 2023
Building a real-time environment using webcam frame division in OpenCV and classify cropped images using a fine-tuned vision transformers on hybryd datasets samples for facial emotion recognition.

Visual Transformer for Facial Emotion Recognition (FER) This project has the aim to build an efficient Visual Transformer for the Facial Emotion Recog

Mario Sessa 8 Dec 12, 2022