CVPR 2022 "Online Convolutional Re-parameterization"

OREPA: Online Convolutional Re-parameterization

This repo is the PyTorch implementation of our paper "Online Convolutional Re-parameterization", to appear in CVPR 2022, authored by Mu Hu, Junyi Feng, Jiashen Hua, Baisheng Lai, Jianqiang Huang, Xiaojin Gong and Xiansheng Hua from Zhejiang University and Alibaba Cloud.

What is Structural Re-parameterization?

  • Re-parameterization (Re-param) means that different architectures can be converted into one another through an equivalent transformation of their parameters. For example, a 1x1 convolution branch and a 3x3 convolution branch can be merged into a single 3x3 convolution for faster inference.
  • When the deployment model is fixed, re-param can be viewed as the task of finding a more complex training-time structure that can be transferred back to the original one, yielding performance improvements at no inference cost.
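
The 1x1 + 3x3 merge described above can be sketched in plain Python. This is only an illustration, not code from this repo: it assumes a single channel, stride 1, and zero padding, and the helper names `conv2d_same` and `merge_1x1_into_3x3` are hypothetical.

```python
def conv2d_same(image, kernel):
    """Single-channel 2D cross-correlation with zero padding (output size = input size)."""
    k = len(kernel)
    pad = k // 2
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(k):
                for dj in range(k):
                    y, x = i + di - pad, j + dj - pad
                    if 0 <= y < h and 0 <= x < w:  # outside the image counts as zero
                        acc += image[y][x] * kernel[di][dj]
            out[i][j] = acc
    return out


def merge_1x1_into_3x3(k3, k1):
    """Add the 1x1 weight to the centre tap of a copy of the 3x3 kernel."""
    merged = [row[:] for row in k3]
    merged[1][1] += k1[0][0]
    return merged


# Illustrative values only.
image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, -1.0, 0.25], [0.0, 2.0, -0.5], [1.0, 0.0, -0.25]]
k1 = [[0.75]]

# Training-time view: two branches, two convolutions over the feature map.
y3 = conv2d_same(image, k3)
y1 = conv2d_same(image, k1)
two_branch = [[a + b for a, b in zip(r3, r1)] for r3, r1 in zip(y3, y1)]

# Inference-time view: one merged kernel, one convolution.
merged_kernel = merge_1x1_into_3x3(k3, k1)
single_branch = conv2d_same(image, merged_kernel)

max_diff = max(
    abs(a - b)
    for row_a, row_b in zip(two_branch, single_branch)
    for a, b in zip(row_a, row_b)
)
```

The equivalence holds because convolution is linear in its kernel, so a 1x1 kernel behaves like a 3x3 kernel whose only non-zero tap is the centre.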

Why do we propose Online RE-PAram (OREPA)?

  • While current re-param blocks (ACNet, ExpandNet, ACNetv2, etc.) are still feasible for small models, more complicated designs that seek further performance gains on larger models could lead to unaffordable training budgets.
  • We observed that batch normalization (norm) layers play a significant role in re-param blocks, while their training-time non-linearity prevents us from optimizing computational costs during training.

What is OREPA?

OREPA is a two-step pipeline.

  • Linearization: Replace the branch-wise norm layers with scaling layers to enable linear squeezing of a multi-branch/multi-layer topology.
  • Squeezing: Squeeze the linearized block into a single layer, where the convolution upon feature maps is reduced from multiple times to one.
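
As a rough illustration of the two steps, the sketch below treats each branch as a (scale, kernel) pair and squeezes all branches into one kernel before touching the input. It is a toy 1D, single-channel example in plain Python, not the repo's implementation; `conv1d_same`, `squeeze`, and all numeric values are assumptions made for this example.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output has the input's length."""
    pad = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for t, w in enumerate(kernel):
            j = i + t - pad
            if 0 <= j < len(signal):
                acc += signal[j] * w
        out.append(acc)
    return out


def squeeze(branches, size=3):
    """Collapse linear (scale, kernel) branches into a single kernel of `size` taps."""
    merged = [0.0] * size
    for scale, kernel in branches:
        offset = (size - len(kernel)) // 2  # centre shorter kernels
        for t, w in enumerate(kernel):
            merged[offset + t] += scale * w
    return merged


# Each branch is a scaling factor (standing in for the norm layer) and a kernel.
branches = [
    (0.5, [1.0, -2.0, 1.0]),
    (2.0, [0.25]),
    (-1.0, [0.5, 0.0, -0.5]),
]
x = [1.0, 4.0, 2.0, 8.0, 5.0]

# Unsqueezed: one convolution over the input per branch.
multi_branch = [0.0] * len(x)
for scale, kernel in branches:
    y = conv1d_same(x, kernel)
    multi_branch = [m + scale * v for m, v in zip(multi_branch, y)]

# Squeezed: merge the small kernels first, then convolve the input once.
squeezed_kernel = squeeze(branches)
single_pass = conv1d_same(x, squeezed_kernel)

max_diff = max(abs(a - b) for a, b in zip(multi_branch, single_pass))
```

The merge operates on kernels, which are usually much smaller than feature maps. The squeeze is only valid because scaling is linear; a training-time batch norm in each branch would not allow it.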

[Figure: OREPA overview]

How does OREPA work?

  • Through OREPA we can reduce the training budget while keeping comparable performance. We then improve accuracy with additional components, which add only minor training cost because they are merged in an online scheme.
  • We show theoretically that removing the branch-wise norm layers risks degrading a multi-branch structure into a single-branch one, indicating that replacing norm layers with scaling layers is critical for protecting branch diversity.
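
The degradation argument can be illustrated with a simplified derivation. It is a sketch under assumptions not stated in this README: each branch i holds a kernel W_i of the same shape, training uses plain gradient descent with step size \eta, G denotes the gradient of the loss L with respect to the merged kernel W, and each scaling layer is reduced to a single learnable scalar s_i.

```latex
% Case 1: no norm or scaling layers. W = \sum_{i=1}^{M} W_i, hence \partial L / \partial W_i = G.
W' = \sum_{i=1}^{M} \left( W_i - \eta\, G \right) = W - M \eta\, G

% Case 2: scaling layers. W = \sum_{i=1}^{M} s_i W_i, hence
% \partial L / \partial W_i = s_i G  and  \partial L / \partial s_i = \langle G, W_i \rangle.
W' = \sum_{i=1}^{M} \left( s_i - \eta \langle G, W_i \rangle \right) \left( W_i - \eta\, s_i G \right)
   = W - \eta \sum_{i=1}^{M} s_i^{2}\, G
       - \eta \sum_{i=1}^{M} \langle G, W_i \rangle\, W_i
       + O(\eta^{2})
```

In the first case the branches behave like a single branch trained with step size M\eta. In the second, the last first-order term depends on each branch's own kernel, so the merged update is generally not parallel to G and the branches contribute differently.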

ImageNet Results

[Figure: ImageNet results]

Please create a new issue for any code-related questions. For paper-related questions, feel free to contact me at [email protected].

Contents

  1. Dependency
  2. Checkpoints
  3. Training
  4. Evaluation
  5. Transfer Learning on COCO and Cityscapes
  6. About Quantization and Gradient Tweaking
  7. Citation

Dependency

Models released in this work are trained and tested on:

  • CentOS Linux
  • Python 3.8.8 (Anaconda 4.9.1)
  • PyTorch 1.9.0 / torchvision 0.10.0
  • NVIDIA CUDA 10.2
  • 4x NVIDIA V100 GPUs
pip install torch torchvision
pip install numpy matplotlib Pillow
pip install scikit-image

Checkpoints

Download our pre-trained models with OREPA:

Note that the pre-trained models do not need to be decompressed. Just load the .pth.tar files directly.

Training

A complete list of options for each script is available with

python train.py -h
python test.py -h
python convert.py -h
  1. Train ResNets (ResNeXt and WideResNet included)
CUDA_VISIBLE_DEVICES="0,1,2,3" python train.py -a ResNet-18 -t OREPA --data [imagenet-path]
# -a for architecture (ResNet-18, ResNet-34, ResNet-50, ResNet-101, ResNet-18-2x, ResNeXt-50)
# -t for re-param method (base, DBB, OREPA)
  2. Train RepVGGs
CUDA_VISIBLE_DEVICES="0,1,2,3" python train.py -a RepVGG-A0 -t OREPA_VGG --data [imagenet-path]
# -a for architecture (RepVGG-A0, RepVGG-A1, RepVGG-A2)
# -t for re-param method (base, RepVGG, OREPA_VGG)

Evaluation

  1. Use your self-trained model or our pretrained model
CUDA_VISIBLE_DEVICES="0" python test.py train [trained-model-path] -a ResNet-18 -t OREPA
  2. Convert the training-time models into inference-time models
CUDA_VISIBLE_DEVICES="0" python convert.py [trained-model-path] [deploy-model-path-to-save] -a ResNet-18 -t OREPA
  3. Evaluate with the converted model
CUDA_VISIBLE_DEVICES="0" python test.py deploy [deploy-model-path] -a ResNet-18 -t OREPA

Transfer Learning on COCO and Cityscapes

We use the mmdetection and mmsegmentation tools on COCO and Cityscapes, respectively. If you decide to use our pretrained model for downstream tasks, we strongly suggest carefully tuning the learning rate of the first stem layer, since after ImageNet training the deep linear stem layer has a very different weight distribution from the vanilla one. Contact @Sixkplus (Junyi Feng) for more details on configurations and checkpoints of the reported ResNet-50-backbone models.
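
One generic way to give the stem its own learning rate is to split the parameters into groups. The sketch below is not taken from this repo or from the mmdetection/mmsegmentation configs: the helper `build_param_groups`, the `"stem."` name prefix, and the 0.1 multiplier are all hypothetical placeholders. It produces the list-of-dicts layout with `params` and `lr` keys that `torch.optim` optimizers accept as per-parameter options.

```python
def build_param_groups(named_params, base_lr, stem_prefix="stem.", stem_lr_scale=0.1):
    """Split (name, parameter) pairs into a stem group and a group for everything else.

    `stem_prefix` and `stem_lr_scale` are hypothetical; check the parameter
    names of your backbone and tune the multiplier for your task.
    """
    stem, rest = [], []
    for name, param in named_params:
        if name.startswith(stem_prefix):
            stem.append(param)
        else:
            rest.append(param)
    return [
        {"params": stem, "lr": base_lr * stem_lr_scale},
        {"params": rest, "lr": base_lr},
    ]


# Stand-in for model.named_parameters(); strings replace real tensors here.
named_params = [
    ("stem.conv.weight", "p0"),
    ("layer1.0.conv1.weight", "p1"),
    ("fc.weight", "p2"),
]
groups = build_param_groups(named_params, base_lr=0.01)
```

With a real model, `named_params` would come from `model.named_parameters()` and `groups` would be passed as the first argument of an optimizer such as `torch.optim.SGD`.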

About Quantization and Gradient Tweaking

For re-param models, special weight regularization strategies are required for further quantization. Meanwhile, dynamic gradient tweaking or differentiable search methods might greatly boost performance. We have not yet applied such techniques to OREPA, but they may be applied in our industrial usage in the future. To exchange and share experience on these topics, please contact @Sixkplus (Junyi Feng).

Citation

If you use our code or method in your work, please cite the following:

@inproceedings{hu22OREPA,
	title={Online Convolutional Re-parameterization},
	author={Mu Hu and Junyi Feng and Jiashen Hua and Baisheng Lai and Jianqiang Huang and Xiansheng Hua and Xiaojin Gong},
	booktitle={CVPR},
	year={2022}
}

Related Repositories

The code for this work is developed upon Xiaohan Ding's re-param repositories "Diverse Branch Block: Building a Convolution as an Inception-like Unit" and "RepVGG: Making VGG-style ConvNets Great Again" and follows similar protocols. Xiaohan Ding holds a Ph.D. from Tsinghua University and is an expert in structural re-parameterization.
