
CPT: Efficient Deep Neural Network Training via Cyclic Precision

Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Lin

Accepted at ICLR 2021 (Spotlight) [Paper Link].

Overview

Low-precision deep neural network (DNN) training has gained tremendous attention, as reducing precision is one of the most effective knobs for boosting DNNs' training time/energy efficiency. In this paper, we explore low-precision training from a new perspective inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision might have an effect similar to the learning rate during DNN training, and advocate dynamic precision along the training trajectory for further boosting the time/energy efficiency of DNN training. Specifically, we propose Cyclic Precision Training (CPT), which cyclically varies the precision between two boundary values to balance the coarse-grained exploration enabled by low precision and the fine-grained optimization enabled by high precision. Through experiments and visualizations, we show that CPT helps to (1) converge to wider minima with a lower generalization error and (2) reduce training variance, which opens up a new design knob for simultaneously improving the optimization and efficiency of DNN training.
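For intuition, the sketch below shows one way such a cyclic precision schedule could be implemented; the function name, the cosine-style ramp, and the defaults are illustrative assumptions, and the exact schedule CPT uses is specified in the paper.

import math

def cyclic_precision(step, total_steps, num_periods=32, bit_min=3, bit_max=8):
    # Position within the current cyclic period, normalized to [0, 1).
    period_len = total_steps / num_periods
    phase = (step % period_len) / period_len
    # Cosine-style ramp from bit_min up to bit_max within each period
    # (an assumption for illustration; see the paper for CPT's exact schedule).
    bits = bit_min + 0.5 * (bit_max - bit_min) * (1 - math.cos(math.pi * phase))
    return int(round(bits))

With num_periods=32, this mirrors the --num_cyclic_period 32 setting used in the training commands below.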

Experimental Results

We evaluate CPT on eleven models and five datasets (i.e., ResNet-38/74/110/152/164 and MobileNetV2 on CIFAR-10/100, ResNet-18/34/50 on ImageNet, Transformer on WikiText-103, and LSTM on PTB). Please refer to our paper for more results.

Results on CIFAR-100

  • Test accuracy vs. training computational cost

  • Loss landscape visualization

Results on ImageNet

  • Accuracy - training efficiency trade-off

  • Boosting optimality

Results on WikiText-103 and PTB

Code Usage

cpt_cifar and cpt_imagenet contain the code customized for CIFAR-10/100 and ImageNet, respectively; the two share a similar code structure.

Prerequisites

See env.yml for the complete conda environment. Create a new conda environment:

conda env create -f env.yml
conda activate pytorch

Training on CIFAR-10/100 with CPT

In addition to the commonly considered args (e.g., the target network, dataset, and data path via --arch, --dataset, and --datadir, respectively), you also need to: (1) enable cyclic precision training via --is_cyclic_precision; (2) specify the precision bounds for both forward (weights and activations) and backward (gradients and errors) via --cyclic_num_bits_schedule and --cyclic_num_grad_bits_schedule, respectively (note that CPT adopts a constant precision during backward for a more stable training process, as analyzed in our appendix); and (3) specify the number of cyclic periods via --num_cyclic_period, which is set to 32 in all our experiments; more ablation studies can be found in Sec. 4.3 of our paper.

  • Example: Training ResNet-74 on CIFAR-100 with CPT (3~8-bit forward, 8-bit backward, and 32 cyclic periods).
cd cpt_cifar
python train.py --save_folder ./logs --arch cifar100_resnet_74 --workers 4 --dataset cifar100 --datadir path-to-cifar100 --is_cyclic_precision --cyclic_num_bits_schedule 3 8 --cyclic_num_grad_bits_schedule 8 8 --num_cyclic_period 32
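For intuition about what the forward bit-widths control, below is a minimal sketch of a symmetric uniform quantizer with a straight-through estimator; it is illustrative only, and the repo's actual quantizers for weights/activations and gradients/errors differ in their details (ranges, rounding, etc.).

import torch

def quantize_uniform(x, num_bits):
    # Symmetric signed range, e.g. [-7, 7] for 4 bits.
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    q = torch.round(x / scale).clamp(-qmax, qmax) * scale
    # Straight-through estimator: the forward pass uses the quantized value,
    # while the backward pass treats quantization as the identity.
    return x + (q - x).detach()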

We also integrate stochastic weight averaging (SWA) into our code, although it is not used in the results reported in our paper.
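For reference, a minimal SWA training loop using PyTorch's built-in torch.optim.swa_utils might look as follows; this is a generic sketch rather than the repo's integration, and the names below are hypothetical placeholders.

import torch
from torch.optim.swa_utils import AveragedModel, SWALR, update_bn

# `model`, `optimizer`, `train_one_epoch`, `train_loader`, `num_epochs`,
# and `swa_start_epoch` are hypothetical placeholders.
swa_model = AveragedModel(model)               # running average of model weights
swa_scheduler = SWALR(optimizer, swa_lr=0.01)  # constant SWA learning rate

for epoch in range(num_epochs):
    train_one_epoch(model, optimizer)
    if epoch >= swa_start_epoch:
        swa_model.update_parameters(model)     # fold current weights into the average
        swa_scheduler.step()

update_bn(train_loader, swa_model)             # recompute BatchNorm statistics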

Training on ImageNet with CPT

The args for the ImageNet experiments are similar to those for CIFAR-10/100.

  • Example: Training ResNet-34 on ImageNet with CPT (3~8-bit forward, 8-bit backward, and 32 cyclic periods).
cd cpt_imagenet
python train.py --save_folder ./logs --arch resnet34 --warm_up --datadir PATH_TO_IMAGENET --is_cyclic_precision --cyclic_num_bits_schedule 3 8 --cyclic_num_grad_bits_schedule 8 8 --num_cyclic_period 32 --automatic_resume

Citation

@article{fu2021cpt,
  title={CPT: Efficient Deep Neural Network Training via Cyclic Precision},
  author={Fu, Yonggan and Guo, Han and Li, Meng and Yang, Xin and Ding, Yining and Chandra, Vikas and Lin, Yingyan},
  journal={arXiv preprint arXiv:2101.09868},
  year={2021}
}

Our Related Work

Please also check out our work on how to fractionally squeeze out more training cost savings from the most redundant bit level, both progressively along the training trajectory and dynamically per input:

Yonggan Fu, Haoran You, Yang Zhao, Yue Wang, Chaojian Li, Kailash Gopalakrishnan, Zhangyang Wang, Yingyan Lin. "FracTrain: Fractionally Squeezing Bit Savings Both Temporally and Spatially for Efficient DNN Training". NeurIPS, 2020. [Paper Link] [Code]
