Model Zoo for AI Model Efficiency Toolkit

Overview

Qualcomm Innovation Center, Inc.

We provide a collection of popular neural network models and compare their floating-point and quantized performance. The results demonstrate that quantized models can achieve accuracy comparable to floating-point models. Along with the results, we provide recipes for users to quantize floating-point models using the AI Model Efficiency Toolkit (AIMET).

Introduction

Quantized inference is significantly faster than floating-point inference, and enables models to run in a power-efficient manner on mobile and edge devices. We use AIMET, a library that includes state-of-the-art techniques for quantization, to quantize various models available in TensorFlow and PyTorch frameworks. The list of models is provided in the sections below.

An original FP32 source model is quantized using either the post-training quantization (PTQ) or quantization-aware training (QAT) techniques available in AIMET. Example evaluation scripts are provided for each model. Where PTQ is needed, the evaluation script performs PTQ before evaluation; where QAT is used, the fine-tuned model checkpoint is also provided.
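Conceptually, PTQ chooses a quantization grid from calibration data and then simulates quantized inference on that grid. The following is a minimal, self-contained sketch of that idea; the helper names are illustrative and are not AIMET's API:

```python
# Conceptual sketch of post-training quantization (PTQ): calibrate a range
# from sample data, derive an 8-bit affine encoding (scale/offset), then
# quantize-dequantize values on that grid.

def compute_encoding(values, num_bits=8):
    """Derive scale/offset for an asymmetric affine encoding from observed values."""
    qmax = 2 ** num_bits - 1
    vmin, vmax = min(values), max(values)
    vmin, vmax = min(vmin, 0.0), max(vmax, 0.0)  # the encoding must cover zero
    scale = (vmax - vmin) / qmax
    offset = round(-vmin / scale)
    return scale, offset

def quantize_dequantize(x, scale, offset, num_bits=8):
    """Simulate quantization: map to the integer grid and back to float."""
    qmax = 2 ** num_bits - 1
    q = max(0, min(qmax, round(x / scale) + offset))
    return (q - offset) * scale

calibration_data = [-1.0, -0.5, 0.0, 0.7, 2.0]   # stand-in for real activations
scale, offset = compute_encoding(calibration_data)
errors = [abs(x - quantize_dequantize(x, scale, offset)) for x in calibration_data]
assert max(errors) <= scale  # round-trip error is bounded by one grid step
```

QAT starts from the same simulated quantization but fine-tunes the weights with the quantization noise present in the forward pass, which is why QAT models ship as fine-tuned checkpoints.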

TensorFlow Models

Model Zoo

| Network | Model Source [1] | Floating Pt (FP32) Model [2] | Quantized Model [3] | Results [4] | Documentation |
|---|---|---|---|---|---|
| ResNet-50 (v1) | GitHub Repo | Pretrained Model | See Documentation | (ImageNet) Top-1 Accuracy<br>FP32: 75.21%<br>INT8: 74.96% | ResNet50.md |
| MobileNet-v2-1.4 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy<br>FP32: 75%<br>INT8: 74.21% | MobileNetV2.md |
| EfficientNet Lite | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy<br>FP32: 74.93%<br>INT8: 74.99% | EfficientNetLite.md |
| SSD MobileNet-v2 | GitHub Repo | Pretrained Model | See Example | (COCO) Mean Avg. Precision (mAP)<br>FP32: 0.2469<br>INT8: 0.2456 | SSDMobileNetV2.md |
| RetinaNet | GitHub Repo | Pretrained Model | See Example | (COCO) mAP<br>FP32: 0.35<br>INT8: 0.349<br>(Detailed Results) | RetinaNet.md |
| Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | (COCO) mAP<br>FP32: 0.383<br>INT8: 0.379<br>Mean Avg. Recall (mAR)<br>FP32: 0.452<br>INT8: 0.446 | PoseEstimation.md |
| SRGAN | GitHub Repo | Pretrained Model | See Example | (BSD100) PSNR/SSIM<br>FP32: 25.45/0.668<br>INT8: 24.78/0.628<br>INT8W/INT16Act.: 25.41/0.666<br>(Detailed Results) | SRGAN.md |

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with a post-training technique, this refers to the FP32 model, which can then be quantized using AIMET. For models optimized with QAT, it refers to the model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used; for some models, 8-bit weights with 16-bit activations (INT8W/INT16Act.) are used to further improve post-training quantization performance.
[4] Results comparing floating-point and quantized performance
[5] Script for quantized evaluation using the model referenced in the "Quantized Model" column
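The INT8W/INT16Act. configuration in footnote [3] helps because each additional bit of activation precision halves the quantization step, so a 16-bit grid introduces far less round-trip error than an 8-bit one. A small sketch with made-up values (not taken from any model in the tables) illustrates the effect:

```python
# Why 8-bit weights with 16-bit activations can recover accuracy: going from
# an 8-bit to a 16-bit uniform grid shrinks the step size by a factor of 2^8.

def quant_error(x, lo, hi, num_bits):
    """Round-trip error of x on a num_bits uniform grid over [lo, hi]."""
    levels = 2 ** num_bits - 1
    scale = (hi - lo) / levels
    q = round((x - lo) / scale)
    return abs((q * scale + lo) - x)

x, lo, hi = 0.1234, -4.0, 4.0   # illustrative activation value and range
err8  = quant_error(x, lo, hi, 8)    # step size ~= 0.0314
err16 = quant_error(x, lo, hi, 16)   # step size ~= 0.000122
assert err16 < err8
```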

Detailed Results

RetinaNet

(COCO dataset)

| Metric | IoU | Area | maxDets | FP32 | INT8 |
|---|---|---|---|---|---|
| Average Precision | 0.50:0.95 | all | 100 | 0.350 | 0.349 |
| Average Precision | 0.50 | all | 100 | 0.537 | 0.536 |
| Average Precision | 0.75 | all | 100 | 0.374 | 0.372 |
| Average Precision | 0.50:0.95 | small | 100 | 0.191 | 0.187 |
| Average Precision | 0.50:0.95 | medium | 100 | 0.383 | 0.381 |
| Average Precision | 0.50:0.95 | large | 100 | 0.472 | 0.472 |
| Average Recall | 0.50:0.95 | all | 1 | 0.306 | 0.305 |
| Average Recall | 0.50:0.95 | all | 10 | 0.491 | 0.490 |
| Average Recall | 0.50:0.95 | all | 100 | 0.533 | 0.532 |
| Average Recall | 0.50:0.95 | small | 100 | 0.345 | 0.341 |
| Average Recall | 0.50:0.95 | medium | 100 | 0.577 | 0.577 |
| Average Recall | 0.50:0.95 | large | 100 | 0.681 | 0.679 |

SRGAN

| Model | Dataset | PSNR | SSIM |
|---|---|---|---|
| FP32 | Set5 / Set14 / BSD100 | 29.17 / 26.17 / 25.45 | 0.853 / 0.719 / 0.668 |
| INT8/ACT8 | Set5 / Set14 / BSD100 | 28.31 / 25.55 / 24.78 | 0.821 / 0.684 / 0.628 |
| INT8/ACT16 | Set5 / Set14 / BSD100 | 29.12 / 26.15 / 25.41 | 0.851 / 0.719 / 0.666 |
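PSNR values like those above follow from the per-image mean squared error via the standard formula PSNR = 10 * log10(MAX^2 / MSE), with MAX = 255 for 8-bit images. A minimal sketch (the example MSE is illustrative, not measured from SRGAN):

```python
import math

def psnr(mse, max_val=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

# Example: an MSE of ~78.6 on 8-bit images corresponds to roughly 29 dB,
# the ballpark of the Set5 results above.
assert 28.0 < psnr(78.6) < 30.0
```

Because the scale is logarithmic, the ~0.4 dB gap between FP32 and INT8/ACT8 on BSD100 reflects a roughly 10% increase in mean squared error.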

PyTorch Models

Model Zoo

| Network | Model Source [1] | Floating Pt (FP32) Model [2] | Quantized Model [3] | Results [4] | Documentation |
|---|---|---|---|---|---|
| MobileNetV2 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy<br>FP32: 71.67%<br>INT8: 71.14% | MobileNetV2.md |
| EfficientNet-lite0 | GitHub Repo | Pretrained Model | Quantized Model | (ImageNet) Top-1 Accuracy<br>FP32: 75.42%<br>INT8: 74.44% | EfficientNet-lite0.md |
| DeepLabV3+ | GitHub Repo | Pretrained Model | Quantized Model | (PascalVOC) mIOU<br>FP32: 72.62%<br>INT8: 72.22% | DeepLabV3.md |
| MobileNetV2-SSD-Lite | GitHub Repo | Pretrained Model | Quantized Model | (PascalVOC) mAP<br>FP32: 68.7%<br>INT8: 68.6% | MobileNetV2-SSD-lite.md |
| Pose Estimation | Based on Ref. | Based on Ref. | Quantized Model | (COCO) mAP<br>FP32: 0.364<br>INT8: 0.359<br>mAR<br>FP32: 0.436<br>INT8: 0.432 | PoseEstimation.md |
| SRGAN | GitHub Repo | Pretrained Model (older version from here) | See Example | (BSD100) PSNR/SSIM<br>FP32: 25.51/0.653<br>INT8: 25.5/0.648<br>(Detailed Results) | SRGAN.md |
| DeepSpeech2 | GitHub Repo | Pretrained Model | See Example | (Librispeech Test Clean) WER<br>FP32: 9.92%<br>INT8: 10.22% | DeepSpeech2.md |

[1] Original FP32 model source
[2] FP32 model checkpoint
[3] Quantized Model: For models quantized with a post-training technique, this refers to the FP32 model, which can then be quantized using AIMET. For models optimized with QAT, it refers to the model checkpoint with fine-tuned weights. 8-bit weights and activations are typically used; for some models, 8-bit weights with 16-bit activations are used to further improve post-training quantization performance.
[4] Results comparing floating-point and quantized performance
[5] Script for quantized evaluation using the model referenced in the "Quantized Model" column

Detailed Results

SRGAN (PyTorch)

| Model | Dataset | PSNR | SSIM |
|---|---|---|---|
| FP32 | Set5 / Set14 / BSD100 | 29.93 / 26.58 / 25.51 | 0.851 / 0.709 / 0.653 |
| INT8 | Set5 / Set14 / BSD100 | 29.86 / 26.59 / 25.55 | 0.845 / 0.705 / 0.648 |

Examples

Install AIMET

Before you can run the example script for a specific model, you need to install the AI Model Efficiency ToolKit (AIMET) software. Please see this Getting Started page for an overview. Then install AIMET and its dependencies using these Installation instructions.

NOTE: To obtain the exact version of AIMET software that was used to test this model zoo, please install release 1.13.0 when following the above instructions.

Running the scripts

Download the necessary datasets and code required to run the example for the model of interest. The examples run quantized evaluation and, if necessary, apply AIMET techniques to improve quantized model performance, generating the final accuracy results noted in the tables above. Refer to the TensorFlow or PyTorch Docs folder for the documentation and procedures for a specific model.

Team

AIMET Model Zoo is a project maintained by Qualcomm Innovation Center, Inc.

License

Please see the LICENSE file for details.

Comments
  • Added PyTorch FFNet model, added INT4 to several models

    Added the following new model: PyTorch FFNet. Added INT4 quantization support to the following models:

    • PyTorch Classification (regnet_x_3_2gf, resnet18, resnet50)
    • PyTorch HRNet Posenet
    • PyTorch HRNet
    • PyTorch EfficientNet Lite0
    • PyTorch DeepLabV3-MobileNetV2

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr
  • Added TensorFlow MobileDet-EdgeTPU and PyTorch InverseForm models

    Added two new models: TensorFlow MobileDet-EdgeTPU and PyTorch InverseForm. Fixed the TF version for two models in the README file. Minor updates to the TensorFlow EfficientNet Lite-0 doc and the PyTorch ssd_mobilenetv2 script.

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr
  • Updated pose estimation evaluation code and documentation for updated…

    Updated pose estimation evaluation code and documentation for the updated model .pth file with weights state-dict. Fixed a model-loading problem by including the model definition in pose_estimation_quanteval.py. Added Quantizer Op Assumptions to the Pose Estimation document.

    Signed-off-by: Bharath Ramaswamy [email protected]

    opened by quic-bharathr
  • Error when running the pose estimation example

    ```shell
    $ python3.6 pose_estimation_quanteval.py pe_weights.pth ./data/
    ```

    ```
    2022-05-24 22:37:22,500 - root - INFO - AIMET defining network with shared weights
    Traceback (most recent call last):
      File "pose_estimation_quanteval.py", line 700, in <module>
        pose_estimation_quanteval(args)
      File "pose_estimation_quanteval.py", line 687, in pose_estimation_quanteval
        sim = quantsim.QuantizationSimModel(model, dummy_input=(1, 3, 128, 128), quant_scheme=args.quant_scheme)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/quantsim.py", line 157, in __init__
        self.connected_graph = ConnectedGraph(self.model, dummy_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 132, in __init__
        self._construct_graph(model, model_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 254, in _construct_graph
        module_tensor_shapes_map = ConnectedGraph._generate_module_tensor_shapes_lookup_table(model, model_input)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/meta/connectedgraph.py", line 244, in _generate_module_tensor_shapes_lookup_table
        run_hook_for_layers_with_given_input(model, model_input, forward_hook, leaf_node_only=False)
      File "/home/jlchen/.local/lib/python3.6/site-packages/aimet_torch/utils.py", line 277, in run_hook_for_layers_with_given_input
        _ = model(*input_tensor)
      File "/home/jlchen/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1071, in _call_impl
        result = forward_call(*input, **kwargs)
    TypeError: forward() takes 2 positional arguments but 5 were given
    ```

    opened by sundyCoder
  • I tried to quantize the DeepSpeech demo, but an error happened

    ```
    ImportError: /home/mi/anaconda3/envs/aimet/lib/python3.7/site-packages/aimet_common/x86_64-linux-gnu/aimet_tensor_quantizer-0.0.0-py3.7-linux-x86_64.egg/AimetTensorQuantizer.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK2at6Tensor8data_ptrIfEEPT_v
    ```

    Platform: Ubuntu 18.04; GPU: NVIDIA 2070; CUDA: 11.1; PyTorch; Python: 3.7

    opened by fmbao
  • Request for the MobileNet-V1-1.0 quantized (INT8) model

    Thank you for sharing these valuable models. I'd like to evaluate and look into the MobileNet-v1-1.0 model quantized by DFQ. I'd appreciate it if you could provide the quantized MobileNet-v1-1.0 model in either TF or PyTorch.

    opened by yschoi-dev
  • What's the runtime and AI framework for DeepSpeech2?

    For DeepSpeech2, may I know the runtime for its quantized (INT8) model: Hexagon DSP, NPU, or others? And what is the AI framework: SNPE, Hexagon NN, or others? Thanks!

    opened by sunfangxun
  • Unable to replicate DeepLabV3 PyTorch tutorial numbers

    I've been working through the DeepLabV3 PyTorch tutorial, which can be found here: https://github.com/quic/aimet-model-zoo/blob/develop/zoo_torch/Docs/DeepLabV3.md.

    However, when running the evaluation script with the optimized checkpoint, I am unable to replicate the mIOU result listed in the table. The number I got was 0.67, while the number reported by Qualcomm was 0.72. I was wondering if anyone has had this issue before and how to resolve it?

    opened by LLNLanLeN