Pytorch implementation of BRECQ, ICLR 2021

Related tags

Deep LearningBRECQ
Overview

BRECQ

Pytorch implementation of BRECQ, ICLR 2021

@inproceedings{
li&gong2021brecq,
title={BRECQ: Pushing the Limit of Post-Training Quantization by Block Reconstruction},
author={Yuhang Li and Ruihao Gong and Xu Tan and Yang Yang and Peng Hu and Qi Zhang and Fengwei Yu and Wei Wang and Shi Gu},
booktitle={International Conference on Learning Representations},
year={2021},
url={https://openreview.net/forum?id=POWv6hDd9XH}
}

Pretrained models

We provide all the pretrained models and they can be accessed via torch.hub

For example: use res18 = torch.hub.load('yhhhli/BRECQ', model='resnet18', pretrained=True) to get the pretrained ResNet-18 model.

If you encounter URLError when downloading the pretrained network, it's probably a network failure. An alternative way is to use wget to manually download the file, then move it to ~/.cache/torch/checkpoints, where the load_state_dict_from_url function will check before downloading it.

For example:

wget https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet50_imagenet.pth.tar 
mv resnet50_imagenet.pth.tar ~/.cache/torch/checkpoints

Usage

python main_imagenet.py --data_path PATN/TO/DATA --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration

You can get the following output:

Quantized accuracy before brecq: 0.13599999248981476
Weight quantization accuracy: 66.32799530029297
Full quantization (W2A4) accuracy: 65.21199798583984
Comments
  • how to reproduce zero data result?

    how to reproduce zero data result?

    as title.

    there is a bug: https://github.com/yhhhli/BRECQ/blob/da93abc4f7e3ef437b356a2df8a5ecd8c326556e/main_imagenet.py#L173

    args.batchsize should be args.workers

    opened by yyfcc17 6
  • why not quantize  the activation of  the last conv layer in a block

    why not quantize the activation of the last conv layer in a block

    Hi, Thanks for the release of your code. But I have one problem regarding the detail of the implementation. In quant_block.py, take the following code of ResNet-18 and ResNet-34 for example. The disable_act_quant is set True for conv2, which disables the quantization of the output of conv2.

    class QuantBasicBlock(BaseQuantBlock):
        """
        Implementation of Quantized BasicBlock used in ResNet-18 and ResNet-34.
        """
        def __init__(self, basic_block: BasicBlock, weight_quant_params: dict = {}, act_quant_params: dict = {}):
            super().__init__(act_quant_params)
            self.conv1 = QuantModule(basic_block.conv1, weight_quant_params, act_quant_params)
            self.conv1.activation_function = basic_block.relu1
            self.conv2 = QuantModule(basic_block.conv2, weight_quant_params, act_quant_params, disable_act_quant=True)
    
            # modify the activation function to ReLU
            self.activation_function = basic_block.relu2
    
            if basic_block.downsample is None:
                self.downsample = None
            else:
                self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params,
                                              disable_act_quant=True)
            # copying all attributes in original block
            self.stride = basic_block.stride
    

    It will cause a boost in accuracy, the following is the result I get use the your code and the same ImageNet dataset you used in the paper. [1] and [2] denotes the modification I did to the original code.

    image

    [1]: quant_block.py→QuantBasicBlock→__init__→self.conv2=QuantModule(... , disable_act_quant=True) self.downsample = QuantModule(basic_block.downsample[0], weight_quant_params, act_quant_params, disable_act_quant=True). Change from True to False; [2]: quant_block.py→QuantInvertedResidual→__init__→self.conv=nn.Sequential(..., QuantModule(... , disable_act_quant=True), change from True to False

    But I do not think it is applicable for most of NPUs, which do quantization of every output of conv layer. So why not quantize the activation of the last conv layer in a block? Is there any particular reason for this? Also, for the methods you compared with in your paper, have you checked whether they do the same thing as you do or not?

    opened by frankgt 3
  • disable act quantization is designed for convolution

    disable act quantization is designed for convolution

    Hi, Very impressive coding.

    There is a question about the quantization of activation values.

    In the code:

    disable act quantization is designed for convolution before elemental-wise operation,

    in that case, we apply activation function and quantization after ele-wise op.

    Why can it be replaced like this?

    Thanks

    opened by xiayizhan2017 2
  • How to deal with data parallel and distributed data parallel?

    How to deal with data parallel and distributed data parallel?

    On my eyes, your code is just running with single gpu while I need to test this code with multi-gpu for other implementations. I just want to check that you have ran your code using data parallel and distributed data parallel.

    opened by jang0977 2
  • What is the purpose for setting retain_graph=True?

    What is the purpose for setting retain_graph=True?

    https://github.com/yhhhli/BRECQ/blob/2888b29de0a88ece561ae2443defc758444e41c1/quant/block_recon.py#L91

    What is the purpose for setting retain_graph=True?

    opened by un-knight 2
  • Cannot reproduce the accuracy

    Cannot reproduce the accuracy

    Greetings,

    Really appreciate your open source contribution.

    However, it seems the accuracy mentioned in the paper cannot be reproduced applying the standard Imagenet. For instance, with the full precision model, I have tested Resnet 18 (70.186%), MobileNetv2(71.618%), which is slightly lower than the results from your paper (71.08, 72.49 respectively).

    Have you utilized any preprocessing techniques other than imagenet.build_imagenet_data?

    Thanks

    opened by mike-zyz 2
  • suggest replacing .view with .reshape in accuracy() function

    suggest replacing .view with .reshape in accuracy() function

    Got an error:

    Traceback (most recent call last):
      File "main_imagenet.py", line 198, in <module>
        print('Quantized accuracy before brecq: {}'.format(validate_model(test_loader, qnn)))
      File "/home/xxxx/anaconda3/envs/torch/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
        return func(*args, **kwargs)
      File "main_imagenet.py", line 108, in validate_model
        acc1, acc5 = accuracy(output, target, topk=(1, 5))
      File "main_imagenet.py", line 77, in accuracy
        correct_k = correct[:k].view(-1).float().sum(0, keepdim=True)
    RuntimeError: view size is not compatible with input tensor's size and stride (at least one dimension spans across two contiguous subspaces). Use .reshape(...) instead.
    

    So suggest replacing .view with .reshape in accuracy() function.

    opened by un-knight 1
  • channel_wise quantization

    channel_wise quantization

    Hi, nice idea for quantizaton But it seems that the paper(not include the appendix) did not point that it is channel-wise quantization. however, the code showed it is. As we know, it is of course that channel-wise quntization would outperform layer-wise quantization. So, maybe it's hard to say that the performance of your method is close to QAT

    opened by shiyuetianqiang 1
  • Some questions about implementation details

    Some questions about implementation details

    Hello, thank you for an interesting paper and nice code.

    I have two questions concerning implementation details.

    1. Does the "one-by-one" block reconstruction mentioned in the paper mean that input to each block comes from already quantized preceding blocks, i.e. each block may correct quantization errors coming from previous blocks? Or maybe input to each block is collected from the full-precision model?
    2. Am I correct in my understanding that in block-wise reconstruction objective you use gradients for each object in calibration sample independently (i.e. no gradient averaging or smth, like in Adam mentioned on the paper)? Besides, what is happening here in data_utils.py, why do you add 1.0 to the gradients?
    cached_grads = cached_grads.abs() + 1.0
    # scaling to make sure its mean is 1
    # cached_grads = cached_grads * torch.sqrt(cached_grads.numel() / cached_grads.pow(2).sum())
    

    Thank you for your time and consideration!

    opened by AndreevP 0
  • Quantization doesn't work?

    Quantization doesn't work?

    Hi,

    So I tried running your code on CIFAR-10 with a pre-trained ResNet50 model. I've attached the code below. My accuracy however does not come nearly as close to the float model which is around 93% but after quanitzation: I get:

    • Accuracy of the network on the 10000 test images: 10.0 % top5: 52.28 %

    Please help me with this. The code is inside the zip file.

    main_cifar.zip s

    opened by praneet195 0
  • 在使用论文中提出的Fisher-diag方式进行Hessian估计时会提示Trying to backward through the graph a second time

    在使用论文中提出的Fisher-diag方式进行Hessian估计时会提示Trying to backward through the graph a second time

    如文中所提出的Fisher-diag方式来估计Hessian矩阵,需要计算每一层pre-activation的梯度。但在实际代码运行时,save_grad_data中的cur_grad = get_grad(cali_data[i * batch_size:(i + 1) * batch_size])在执行到第二个batch的时候会报错Trying to backward through the graph a second time,第一个batch的数据并不会报错。不知道作者是否遇到过类似的情况?

    opened by ariescts 2
  • Cuda Error when launching example

    Cuda Error when launching example

    [email protected]:/path_to/BRECQ# python main_imagenet.py --data_path /path_to/IMAGENET_2012/ --arch resnet18 --n_bits_w 2 --channel_wise --n_bits_a 4 --act_quant --test_before_calibration You are using fake SyncBatchNorm2d who is actually the official BatchNorm2d ==> Using Pytorch Dataset Downloading: "https://github.com/yhhhli/BRECQ/releases/download/v1.0/resnet18_imagenet.pth.tar" to /root/.cache/torch/hub/checkpoints/resnet18_imagenet.pth.tar 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 44.6M/44.6M [00:27<00:00, 1.70MB/s] Traceback (most recent call last): File "main_imagenet.py", line 178, in cnn.cuda() File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in cuda return self._apply(lambda t: t.cuda(device)) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 570, in _apply module._apply(fn) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 593, in _apply param_applied = fn(param) File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 680, in return self._apply(lambda t: t.cuda(device)) RuntimeError: CUDA error: out of memory CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

    opened by L-ED 1
Owner
Yuhang Li
Research Intern at @SenseTime Group Limited
Yuhang Li
A Keras implementation of YOLOv4 (Tensorflow backend)

keras-yolo4 请使用更完善的版本: https://github.com/miemie2013/Keras-YOLOv4 Please visit here for more complete model: https://github.com/miemie2013/Keras-YOLOv

384 Nov 29, 2022
A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding his way.

GuidEye A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding h

Munal Jain 0 Aug 09, 2022
Code for the paper: On Pathologies in KL-Regularized Reinforcement Learning from Expert Demonstrations

Non-Parametric Prior Actor-Critic (N-PPAC) This repository contains the code for On Pathologies in KL-Regularized Reinforcement Learning from Expert D

Cong Lu 5 May 13, 2022
One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing".

Introduction One implementation of the paper "DMRST: A Joint Framework for Document-Level Multilingual RST Discourse Segmentation and Parsing". Users

seq-to-mind 18 Dec 11, 2022
Turning SymPy expressions into JAX functions

sympy2jax Turn SymPy expressions into parametrized, differentiable, vectorizable, JAX functions. All SymPy floats become trainable input parameters. S

Miles Cranmer 38 Dec 11, 2022
Self-describing JSON-RPC services made easy

ReflectRPC Self-describing JSON-RPC services made easy Contents What is ReflectRPC? Installation Features Datatypes Custom Datatypes Returning Errors

Andreas Heck 31 Jul 16, 2022
Code and hyperparameters for the paper "Generative Adversarial Networks"

Generative Adversarial Networks This repository contains the code and hyperparameters for the paper: "Generative Adversarial Networks." Ian J. Goodfel

Ian Goodfellow 3.5k Jan 08, 2023
Convert weight file.pth to weight file.blob

CONVERT YOUR MODEL TO IR FORMAT INSTALLATION OpenVino Toolkit Download openvinotoolkit 2021.3 version : Link Instruction of installation : Link Pytorc

Tran Anh Tuan 3 Nov 18, 2021
Radar-to-Lidar: Heterogeneous Place Recognition via Joint Learning

radar-to-lidar-place-recognition This page is the coder of a pre-print, implemented by PyTorch. If you have some questions on this project, please fee

Huan Yin 37 Oct 09, 2022
This is the winning solution of the Endocv-2021 grand challange.

Endocv2021-winner [Paper] This is the winning solution of the Endocv-2021 grand challange. Dependencies pytorch # tested with 1.7 and 1.8 torchvision

Vajira Thambawita 14 Dec 03, 2022
FedTorch is an open-source Python package for distributed and federated training of machine learning models using PyTorch distributed API

FedTorch is a generic repository for benchmarking different federated and distributed learning algorithms using PyTorch Distributed API.

Machine Learning and Optimization Lab @PennState 136 Dec 23, 2022
PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning".

A Simple Baseline for Low-Budget Active Learning This repository is the implementation of A Simple Baseline for Low-Budget Active Learning. In this pa

10 Nov 14, 2022
PyG (PyTorch Geometric) - A library built upon PyTorch to easily write and train Graph Neural Networks (GNNs)

PyG (PyTorch Geometric) is a library built upon PyTorch to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

PyG 16.5k Jan 08, 2023
codebase for "A Theory of the Inductive Bias and Generalization of Kernel Regression and Wide Neural Networks"

Eigenlearning This repo contains code for replicating the experiments of the paper A Theory of the Inductive Bias and Generalization of Kernel Regress

Jamie Simon 45 Dec 02, 2022
Meta Representation Transformation for Low-resource Cross-lingual Learning

MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning This repo hosts the code for MetaXL, published at NAACL 2021. [Meta

Microsoft 36 Aug 17, 2022
The fastest way to visualize GradCAM with your Keras models.

VizGradCAM VizGradCam is the fastest way to visualize GradCAM in Keras models. GradCAM helps with providing visual explainability of trained models an

58 Nov 19, 2022
Prml - Repository of notes, code and notebooks in Python for the book Pattern Recognition and Machine Learning by Christopher Bishop

Pattern Recognition and Machine Learning (PRML) This project contains Jupyter notebooks of many the algorithms presented in Christopher Bishop's Patte

Gerardo Durán-Martín 1k Jan 07, 2023
PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition The unofficial code of CDistNet. Now, we ha

25 Jul 20, 2022
Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers.

Contra-OOD Code for EMNLP 2021 paper Contrastive Out-of-Distribution Detection for Pretrained Transformers. Requirements PyTorch Transformers datasets

Wenxuan Zhou 27 Oct 28, 2022
Securetar - A streaming wrapper around python tarfile and allow secure handling files and support encryption

Secure Tar Secure Tarfile library It's a streaming wrapper around python tarfile

Pascal Vizeli 2 Dec 09, 2022