AlphaNet Improved Training of Supernet with Alpha-Divergence

Last update: Oct 10, 2022

AlphaNet: Improved Training of Supernet with Alpha-Divergence

This repository contains our PyTorch training code, evaluation code and pretrained models for AlphaNet.

Our implementation is largely based on AttentiveNAS. To reproduce our results, first download the AttentiveNAS repo, then use our train_alphanet.py for training and test_alphanet.py for testing.

For more details, please see AlphaNet: Improved Training of Supernet with Alpha-Divergence by Dilin Wang, Chengyue Gong, Meng Li, Qiang Liu, Vikas Chandra.

If you find this repo useful in your research, please consider citing our work and AttentiveNAS:

@article{wang2021alphanet,
  title={AlphaNet: Improved Training of Supernet with Alpha-Divergence},
  author={Wang, Dilin and Gong, Chengyue and Li, Meng and Liu, Qiang and Chandra, Vikas},
  journal={arXiv preprint arXiv:2102.07954},
  year={2021}
}

@article{wang2020attentivenas,
  title={AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling},
  author={Wang, Dilin and Li, Meng and Gong, Chengyue and Chandra, Vikas},
  journal={arXiv preprint arXiv:2011.09011},
  year={2020}
}

Evaluation

To reproduce our results:

First, download our pretrained AlphaNet models from a Google Drive path and place them in your local folder ./alphanet_data.

To evaluate our pretrained AlphaNet models, AlphaNet-A0 through A6, on ImageNet with a single GPU, run the following command, where a[0-6] denotes one model from a0 to a6:

python test_alphanet.py --config-file ./configs/eval_alphanet_models.yml --model a[0-6]

Expected results:

Name	MFLOPs	Top-1 (%)
AlphaNet-A0	203	77.87
AlphaNet-A1	279	78.94
AlphaNet-A2	317	79.20
AlphaNet-A3	357	79.41
AlphaNet-A4	444	80.01
AlphaNet-A5 (small)	491	80.29
AlphaNet-A5 (base)	596	80.62
AlphaNet-A6	709	80.78
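The evaluation command and the table above can be tied together with a small helper. The sketch below is not part of the repository: it only builds the command lines (a dry run, nothing is executed) and compares a measured Top-1 value with the expected one. It assumes that `--model` accepts the literal names `a0` through `a6`, which is how the `a[0-6]` pattern is read here. How the two A5 variants are selected is not stated in this README, so the comparison helper takes the table row name explicitly. The function names and the 0.1-point tolerance are illustrative choices.

```python
# Expected Top-1 accuracy (%) copied from the table above.
EXPECTED_TOP1 = {
    "AlphaNet-A0": 77.87,
    "AlphaNet-A1": 78.94,
    "AlphaNet-A2": 79.20,
    "AlphaNet-A3": 79.41,
    "AlphaNet-A4": 80.01,
    "AlphaNet-A5 (small)": 80.29,
    "AlphaNet-A5 (base)": 80.62,
    "AlphaNet-A6": 80.78,
}


def build_eval_command(model, config="./configs/eval_alphanet_models.yml"):
    """Return the evaluation command as an argument list (not executed)."""
    return ["python", "test_alphanet.py", "--config-file", config, "--model", model]


def matches_expected(name, measured_top1, tolerance=0.1):
    """True if a measured Top-1 (%) lies within `tolerance` points of the table value.

    `tolerance` is an assumed margin, not a number given by the authors.
    """
    return abs(measured_top1 - EXPECTED_TOP1[name]) <= tolerance


if __name__ == "__main__":
    # Print one command per model name; assumes names a0..a6 are valid.
    for i in range(7):
        print(" ".join(build_eval_command(f"a{i}")))
```

The argument lists can be passed to `subprocess.run` once the checkpoints are in ./alphanet_data and the ImageNet path is set in the config file.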

Additionally, we provide two pretrained supernets: one trained with KL-based inplace-KD and one trained without inplace-KD.

Training

To train our AlphaNet models from scratch, please run:

python train_alphanet.py --config-file configs/train_alphanet_models.yml --machine-rank ${machine_rank} --num-machines ${num_machines} --dist-url ${dist_url}

We adopt SGD training on 64 GPUs. The mini-batch size is 32 per GPU; all training hyper-parameters are specified in train_alphanet_models.yml.
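The setup described above implies a global mini-batch of 64 × 32 = 2048 images per step. The sketch below spells out that arithmetic and shows how the three launch flags might be filled in for each machine. The figure of 8 GPUs per machine and the `tcp://` address are illustrative assumptions, not values taken from this repository, and the helper names are hypothetical.

```python
def global_batch_size(num_gpus, per_gpu_batch):
    """Total number of images processed per optimizer step across all GPUs."""
    return num_gpus * per_gpu_batch


def launch_commands(num_gpus, gpus_per_machine, dist_url):
    """Build one training command string per machine (nothing is executed).

    `gpus_per_machine` and `dist_url` are assumptions supplied by the caller.
    """
    if num_gpus % gpus_per_machine != 0:
        raise ValueError("num_gpus must be a multiple of gpus_per_machine")
    num_machines = num_gpus // gpus_per_machine
    template = (
        "python train_alphanet.py "
        "--config-file configs/train_alphanet_models.yml "
        "--machine-rank {rank} --num-machines {n} --dist-url {url}"
    )
    return [
        template.format(rank=rank, n=num_machines, url=dist_url)
        for rank in range(num_machines)
    ]


if __name__ == "__main__":
    print(global_batch_size(64, 32))  # 64 GPUs x 32 images per GPU
    # Hypothetical cluster: 8 GPUs per machine, rank-0 host at 10.0.0.1.
    for cmd in launch_commands(64, 8, "tcp://10.0.0.1:23456"):
        print(cmd)
```

When training on fewer GPUs, the global batch size shrinks accordingly, so the learning-rate settings in the config file may need to be revisited.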

Evolutionary search

If you want to search for models suited to your own needs, we provide an example in parallel_supernet_evo_search.py that shows how to search for the Pareto models with the best FLOPs vs. accuracy trade-offs. To run this example:

python parallel_supernet_evo_search.py --config-file configs/parallel_supernet_evo_search.yml
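To illustrate what "Pareto models" means in the FLOPs vs. accuracy setting, here is a minimal sketch of a Pareto filter. It is not the algorithm used in parallel_supernet_evo_search.py; it only shows the dominance test that such a search is trying to satisfy. The candidate list reuses the numbers from the Evaluation table plus one made-up entry, and the function name `pareto_front` is hypothetical.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    Each candidate is (name, mflops, top1). A candidate is dominated when
    another one has FLOPs that are no higher and accuracy that is no lower,
    with at least one of the two strictly better.
    """
    front = []
    for name, flops, acc in candidates:
        dominated = any(
            (f <= flops and a >= acc) and (f < flops or a > acc)
            for _, f, a in candidates
        )
        if not dominated:
            front.append((name, flops, acc))
    return sorted(front, key=lambda c: c[1])  # order by increasing FLOPs


CANDIDATES = [
    ("AlphaNet-A0", 203, 77.87),
    ("AlphaNet-A1", 279, 78.94),
    ("AlphaNet-A2", 317, 79.20),
    ("AlphaNet-A3", 357, 79.41),
    ("AlphaNet-A4", 444, 80.01),
    ("AlphaNet-A5 (small)", 491, 80.29),
    ("AlphaNet-A5 (base)", 596, 80.62),
    ("AlphaNet-A6", 709, 80.78),
    ("made-up candidate", 400, 79.00),  # invented entry, dominated by A2 and A3
]

if __name__ == "__main__":
    for name, flops, acc in pareto_front(CANDIDATES):
        print(f"{name}\t{flops}\t{acc:.2f}")
```

All eight models from the table survive the filter because accuracy rises with every increase in FLOPs, while the invented entry is dropped.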

License

AlphaNet is licensed under CC-BY-NC.

Contributing

We actively welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Owner

Facebook Research
