Official Repository for "Activate or Not: Learning Customized Activation." [CVPR 2021]

Last update: Dec 27, 2022

Overview

CVPR 2021 | Activate or Not: Learning Customized Activation.

This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.

ACON

We propose a novel activation function, termed ACON, that explicitly learns whether or not to activate the neurons. Below we show the ACON activation function and its first derivative. β controls how fast the first derivative asymptotes to its upper and lower bounds, which are determined by p1 and p2.
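To make the roles of p1, p2 and β concrete, here is a minimal scalar sketch in plain Python. It assumes the ACON-C form from the paper, f(x) = (p1 - p2)·x·σ(β·(p1 - p2)·x) + p2·x. The function names and the finite-difference helper are illustrative and are not taken from this repository, where p1, p2 and β are learnable per-channel parameters.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    # Assumed ACON-C form: (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def acon_c_grad(x, p1=1.0, p2=0.0, beta=1.0, eps=1e-6):
    # Central finite difference, used here only to inspect the slope.
    hi = acon_c(x + eps, p1, p2, beta)
    lo = acon_c(x - eps, p1, p2, beta)
    return (hi - lo) / (2.0 * eps)


if __name__ == "__main__":
    # With p1 = 1, p2 = 0 the function reduces to Swish: x * sigmoid(beta * x).
    print(acon_c(2.0), 2.0 * sigmoid(2.0))
    # For large |x| the slope tends to p1 (right side) and p2 (left side).
    print(acon_c_grad(50.0, p1=1.2, p2=-0.8), acon_c_grad(-50.0, p1=1.2, p2=-0.8))
    # A large beta approaches max(p1 * x, p2 * x); beta = 0 gives the linear mean.
    print(acon_c(3.0, beta=100.0), acon_c(-3.0, beta=100.0), acon_c(2.0, beta=0.0))
```

With p1 = 1 and p2 = 0 this is Swish. A large β pushes the function toward max(p1·x, p2·x), which is ReLU in that setting. β = 0 yields the linear function (p1 + p2)·x / 2, the "not activated" case.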

Training curves

We show the training curves of different activations here.

TFNet

To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolution and ACON-FReLU operators.

Main results

The following results show the relative improvements in ImageNet top-1 accuracy over the ReLU baselines. The relative improvements of Meta-ACON are about twice those of SENet.

Comparison between ReLU, Swish and ACON-C. ACON-C improves accuracy without any additional FLOPs or parameters:

Model	FLOPs	#Params.	top-1 err. (ReLU)	top-1 err. (Swish)	top-1 err. (ACON)
ShuffleNetV2 0.5x	41M	1.4M	39.4	38.3 (+1.1)	37.0 (+2.4)
ShuffleNetV2 1.5x	299M	3.5M	27.4	26.8 (+0.6)	26.5 (+0.9)
ResNet 50	3.9G	25.5M	24.0	23.5 (+0.5)	23.2 (+0.8)
ResNet 101	7.6G	44.4M	22.8	22.7 (+0.1)	21.8 (+1.0)
ResNet 152	11.3G	60.0M	22.3	22.2 (+0.1)	21.2 (+1.1)
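The values in parentheses in the table above are consistent with the ReLU top-1 error minus the error of the compared activation, in percentage points. The short sketch below recomputes them from the table rows. The names `results` and `error_reduction` are illustrative and do not come from this repository.

```python
# Top-1 errors (%) copied from the table above: (ReLU, Swish, ACON).
results = {
    "ShuffleNetV2 0.5x": (39.4, 38.3, 37.0),
    "ShuffleNetV2 1.5x": (27.4, 26.8, 26.5),
    "ResNet 50": (24.0, 23.5, 23.2),
    "ResNet 101": (22.8, 22.7, 21.8),
    "ResNet 152": (22.3, 22.2, 21.2),
}


def error_reduction(baseline_err, new_err):
    """Absolute drop in top-1 error, in percentage points (one decimal)."""
    return round(baseline_err - new_err, 1)


# Map each model to (gain of Swish over ReLU, gain of ACON over ReLU).
gains = {
    model: (error_reduction(relu, swish), error_reduction(relu, acon))
    for model, (relu, swish, acon) in results.items()
}

for model, (swish_gain, acon_gain) in gains.items():
    print(f"{model}: Swish +{swish_gain}, ACON +{acon_gain}")
```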

Next, by adding a negligible amount of FLOPs and parameters, meta-ACON shows significant improvements:

Model	FLOPs	#Params.	top-1 err.
ShuffleNetV2 0.5x (meta-acon)	41M	1.7M	34.8 (+4.6)
ShuffleNetV2 1.5x (meta-acon)	299M	3.9M	24.7 (+2.7)
ResNet 50 (meta-acon)	3.9G	25.7M	22.0 (+2.0)
ResNet 101 (meta-acon)	7.6G	44.8M	21.0 (+1.8)
ResNet 152 (meta-acon)	11.3G	60.5M	20.5 (+1.8)

The simple TFNet without the SE modules can outperform the state-of-the-art lightweight networks without the SE modules.

Model	FLOPs	#Params.	top-1 err.
MobileNetV2 0.17	42M	1.4M	52.6
ShuffleNetV2 0.5x	41M	1.4M	39.4
TFNet 0.5	43M	1.3M	36.6 (+2.8)
MobileNetV2 0.6	141M	2.2M	33.3
ShuffleNetV2 1.0x	146M	2.3M	30.6
TFNet 1.0	135M	1.9M	29.7 (+0.9)
MobileNetV2 1.0	300M	3.4M	28.0
ShuffleNetV2 1.5x	299M	3.5M	27.4
TFNet 1.5	279M	2.7M	26.0 (+1.4)
MobileNetV2 1.4	585M	5.5M	25.3
ShuffleNetV2 2.0x	591M	7.4M	25.0
TFNet 2.0	474M	3.8M	24.3 (+0.7)

Trained Models

OneDrive download: Link
BaiduYun download: Link (extract code: 13fu)

Usage

Requirements

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
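The script above does this with shell commands. As a rough Python equivalent, the sketch below moves each validation image into a subfolder named after its class. The function name `sort_val_images` and the `filename_to_class` mapping are hypothetical. The real filename-to-class assignments must be taken from the ImageNet annotations or from the script above.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, filename_to_class):
    """Move flat validation images into per-class subfolders.

    val_dir: directory that holds the validation images.
    filename_to_class: dict mapping an image filename to its class folder
        name (hypothetical input; build it from the official annotations).
    Returns the number of files that were moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for filename, class_id in filename_to_class.items():
        src = val_dir / filename
        if not src.is_file():
            # Skip files that are missing or were already moved.
            continue
        class_dir = val_dir / class_id
        class_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(class_dir / filename))
        moved += 1
    return moved


# Example call with placeholder names (not real ImageNet entries):
# sort_val_images("YOUR_VALDATASET_PATH", {"img_0001.JPEG": "class_a"})
```

Because files that are already in place are skipped, the function can be rerun safely after an interrupted pass.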

Train:

python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Eval:

python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Citation

If you use these models in your research, please cite:

@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2021}
}
