How Do Adam and Training Strategies Help BNNs Optimization? In ICML 2021.

Last update: Sep 20, 2022

Overview

AdamBNN

This is the PyTorch implementation of our paper "How Do Adam and Training Strategies Help BNNs Optimization?", published in ICML 2021.

In this work, we explore the intrinsic reasons why Adam is superior to other optimizers such as SGD for BNN optimization, and we provide analytical explanations that support specific training strategies. By visualizing the optimization trajectory, we show that BNN optimization takes place on an extremely rugged loss landscape, and that the second-order momentum in Adam is crucial for revitalizing weights that are dead due to activation saturation in BNNs. Based on this analysis, we derive a specific training scheme and achieve 70.5% top-1 accuracy on the ImageNet dataset using the same architecture as ReActNet, which is 1.1% higher than ReActNet.
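The role of the second-order momentum can be illustrated with a toy, single-weight sketch. This is not the repository's training code: the function names `sgd_step` and `adam_steps`, the learning rate, and the constant tiny gradient are assumptions chosen only to show how Adam's normalization by the running squared gradient keeps the update size close to the learning rate even when the raw gradient is very small, which is roughly the situation of a "dead" weight.

```python
import math


def sgd_step(grad, lr=0.1):
    """Plain SGD update for one scalar weight: proportional to the gradient."""
    return -lr * grad


def adam_steps(grads, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Return the Adam update for each gradient in a sequence (scalar weight)."""
    m, v, updates = 0.0, 0.0, []
    for t, g in enumerate(grads, start=1):
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (squared gradient)
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        updates.append(-lr * m_hat / (math.sqrt(v_hat) + eps))
    return updates


# Hypothetical scenario: a weight that only ever receives a tiny gradient.
tiny_grads = [1e-4] * 10
sgd_total = sum(sgd_step(g) for g in tiny_grads)
adam_total = sum(adam_steps(tiny_grads))

print(f"total SGD movement:  {sgd_total:.6f}")
print(f"total Adam movement: {adam_total:.6f}")
```

In this toy setting the SGD weight barely moves, while each Adam update has a magnitude close to the learning rate because the gradient is divided by the square root of its own running second moment.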

Citation

If you find our code useful for your research, please consider citing:

@conference{liu2021how,
title = {How do adam and training strategies help bnns optimization?},
author = {Liu, Zechun and Shen, Zhiqiang and Li, Shichao and Helwegen, Koen and Huang, Dong and Cheng, Kwang-Ting},
booktitle = {International Conference on Machine Learning},
year = {2021},
organization={PMLR}
}

Run

1. Requirements:

Python 3, PyTorch 1.7.1, torchvision 0.8.2
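One way to confirm that an environment matches these versions is a small check script. The sketch below is an assumption rather than part of the repository: `version_tuple` and `check_environment` are hypothetical helper names, and the comparison ignores local build tags such as `+cu110`.

```python
import importlib
import importlib.util
import re

# Versions listed in the requirements above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def version_tuple(version):
    """Turn '1.7.1+cu110' into (1, 7, 1); the local build tag is ignored."""
    public = version.split("+")[0]
    return tuple(int(n) for n in re.findall(r"\d+", public)[:3])


def check_environment(expected=EXPECTED):
    """Return {package: status} and report missing packages instead of raising."""
    report = {}
    for name, wanted in expected.items():
        if importlib.util.find_spec(name) is None:
            report[name] = "not installed"
            continue
        found = importlib.import_module(name).__version__
        ok = version_tuple(found) == version_tuple(wanted)
        report[name] = f"{found} ({'matches' if ok else 'differs from'} {wanted})"
    return report
```

Calling `check_environment()` returns one status string per package, so a mismatch can be spotted before starting a long training run.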

2. Data:

Download the ImageNet dataset.

3. Steps to run:

(1) Step 1: binarizing activations

Change directory to ./step1/, then run: bash run.sh

(2) Step 2: binarizing weights + activations

Change directory to ./step2/, then run: bash run.sh
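The two steps above can also be driven from a single script. The sketch below assumes the steps are meant to run in order from the repository root and that each directory contains the `run.sh` mentioned above; `run_steps` and its `runner` parameter are hypothetical names introduced here, not part of the repository.

```python
import subprocess
from pathlib import Path

# Directory names taken from the steps above.
STEPS = ("step1", "step2")


def run_steps(repo_root=".", steps=STEPS, runner=subprocess.run):
    """Run `bash run.sh` inside each step directory, stopping on the first failure.

    `runner` defaults to subprocess.run; it is a parameter only so the
    sequencing can be exercised without launching a real training job.
    """
    executed = []
    for step in steps:
        step_dir = Path(repo_root) / step
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so step2 is not started if step1 fails.
        runner(["bash", "run.sh"], cwd=str(step_dir), check=True)
        executed.append(str(step_dir))
    return executed
```

Calling `run_steps()` from the repository root would launch both stages back to back. Any dataset paths or other settings still need to be configured inside each `run.sh` as the repository requires.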

Models

Methods	Backbone	Top1-Acc	FLOPs	Trained Model
ReActNet	ReActNet-A	69.4%	0.87 x 10^8	Model-ReAct
AdamBNN	ReActNet-A	70.5%	0.87 x 10^8	Model-ReAct-AdamBNN-Training

Contact

Zechun Liu, HKUST and CMU (zliubq at connect.ust.hk / zechunl at andrew.cmu.edu)

Zhiqiang Shen, CMU (zhiqians at andrew.cmu.edu)
