DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Last update: Dec 30, 2022

Related tags

Deep Learning diffq

Overview

Differentiable Model Compression via Pseudo Quantization Noise

Go read our paper for more details.

Requirements

DiffQ requires Python 3.7, and a reasonably recent version of PyTorch (1.7.1 ideally). To install DiffQ, you can run from the root of the repository:

pip install .

You can also install directly from PyPI with pip install diffq.

Usage

import torch
from torch.nn import functional as F
from diffq import DiffQuantizer

my_model = MyModel()
my_optim = ...  # The optimizer must be created before the quantizer
quantizer = DiffQuantizer(my_model)
quantizer.setup_optimizer(my_optim)

# Or, if you want to use a specific optimizer for DiffQ
quantizer.opt = torch.optim.Adam([{"params": []}])
quantizer.setup_optimizer(quantizer.opt)

# Distributed data parallel must be created after DiffQuantizer!
dmodel = torch.distributed.DistributedDataParallel(...)

# Then go on training as usual, just don't forget to call my_model.train() and my_model.eval().
penalty = 1e-3
for batch in loader:
    ...
    my_optim.zero_grad()
    # If you used a separate optimizer for DiffQ, call
    # quantizer.opt.zero_grad()

    # The `penalty` parameter here will control the tradeoff between model size and model accuracy.
    loss = F.mse_loss(x, y) + penalty * quantizer.model_size()
    my_optim.step()
    # If you used a separate optimizer for DiffQ, call
    # quantizer.opt.step()

# To get the true "naive" model size call
quantizer.true_model_size()

# To get the gzipped model size without actually dumping to disk
quantizer.compressed_model_size()

# When you want to dump your final model:
torch.save(quantizer.get_quantized_state(), "some_file.th")
# DiffQ will not optimally code integers. In order to actually get most
# of the gain in terms of size, you should call call `gzip some_file.th`.

# You can later load back the model with
quantizer.restore_quantized_state(torch.load("some_file.th"))

Documentation

See the API documentation.

Examples

We provide three examples in the examples/ folder. One is for CIFAR-10/100, using standard architecture such as Wide-ResNet, ResNet or MobileNet. The second is based on the DeiT visual transformer. The third is a language modeling task on Wikitext-103, using Fairseq

The DeiT and Fairseq examples are provided as a patch on the original codebase at a specific commit. You can initialize the git submodule and apply the patches by running

make examples

For more details on each example, go checkout their specific READMEs:

Installation for development

This will install the dependencies and a diffq in developer mode (changes to the files will directly reflect), along with the dependencies to run unit tests.

pip install -e '.[dev]'

Updating the patch based examples

In order to update the patches, first run make examples to properly initialize the sub repos. Then perform all the changes you want, commit them and run make patches. This will update the patches for each repo. Once this is done, and you checked that all the changes you did are properly included in the new patch files, you can run make reset (this will remove all your changes you did from the submodules, so do check the patch files before calling this) before calling git add -u .; git commit -m "my changes" and pushing.

Test

You can run the unit tests with

make tests

Citation

If you use this code or results in your paper, please cite our work as:

@article{defossez2021differentiable,
  title={Differentiable Model Compression via Pseudo Quantization Noise},
  author={D{\'e}fossez, Alexandre and Adi, Yossi and Synnaeve, Gabriel},
  journal={arXiv preprint arXiv:2104.09987},
  year={2021}
}

License

This repository is released under the CC-BY-NC 4.0. license as found in the LICENSE file, except for the following parts that is under the MIT license. The files examples/cifar/src/mobilenet.py and examples/cifar/src/src/resnet.py are taken from kuangliu/pytorch-cifar, released as MIT. The file examples/cifar/src/wide_resnet.py is taken from meliketoy/wide-resnet, released as MIT. See each file headers for the detailed license.

DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Related tags

Overview

Differentiable Model Compression via Pseudo Quantization Noise

Requirements

Usage

Documentation

Examples

Installation for development

Updating the patch based examples

Test

Citation

License

Owner

Facebook Research

A more easy-to-use implementation of KPConv

Parallel and High-Fidelity Text-to-Lip Generation; AAAI 2022 ; Official code

SSD: A Unified Framework for Self-Supervised Outlier Detection [ICLR 2021]

functorch is a prototype of JAX-like composable function transforms for PyTorch.

(CVPR 2021) Lifting 2D StyleGAN for 3D-Aware Face Generation

Official project repository for 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination'

Face recognition system using MTCNN, FACENET, SVM and FAST API to track participants of Big Brother Brasil in real time.

Differentiable architecture search for convolutional and recurrent networks

CVPR 2021 - Official code repository for the paper: On Self-Contact and Human Pose.

Simple-Neural-Network From Scratch in Python

A Partition Filter Network for Joint Entity and Relation Extraction EMNLP 2021

Flybirds - BDD-driven natural language automated testing framework, present by Trip Flight

Chinese license plate recognition

[AAAI 2022] Sparse Structure Learning via Graph Neural Networks for Inductive Document Classification

Deep learning PyTorch library for time series forecasting, classification, and anomaly detection

Simple PyTorch implementations of Badnets on MNIST and CIFAR10.

Deep and online learning with spiking neural networks in Python

Pytorch implementations of the paper Value Functions Factorization with Latent State Information Sharing in Decentralized Multi-Agent Policy Gradients

Self-supervised learning (SSL) is a method of machine learning

Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentation"