Rational Activation Functions - Replacing Padé Activation Units


Last update: Nov 22, 2022


Overview

Rational Activations - Learnable Rational Activation Functions

First introduced as PAU in Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks.

1. About Rational Activation Functions

Rational Activations are a novel class of learnable activation functions. Rationals encode activation functions as rational functions, are trainable end-to-end using backpropagation, and can be seamlessly integrated into any neural network in the same way as common activation functions (e.g. ReLU).
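
For orientation, the "safe" form of a rational activation described in the PAU paper can be written as follows. The symbols m, n, a_j and b_k are notation chosen here for illustration (numerator degree, denominator degree, and the learnable coefficients); the variants shipped in the library may differ in how the absolute value is applied.

```latex
F(x) = \frac{P(x)}{Q(x)}
     = \frac{\sum_{j=0}^{m} a_j x^{j}}{1 + \left|\sum_{k=1}^{n} b_k x^{k}\right|}
```

The absolute value keeps the denominator at or above 1, so F has no poles for any real input.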

Rationals: Beyond known Activation Functions

Rationals can approximate any known activation function arbitrarily well (cf. Padé Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks). (*The dashed lines represent the rational approximation of each function.)
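
To make the approximation claim concrete, here is a minimal sketch in plain Python. It does not use the rational_activations library: `safe_rational`, `TANH_A` and `TANH_B` are hypothetical names, and the coefficients are those of the classical [3/2] Padé approximant of tanh, not coefficients learned or shipped by the package.

```python
import math


def safe_rational(x, a, b):
    """Evaluate P(x) / (1 + |Q(x)|).

    a: numerator coefficients a_0..a_m, lowest degree first.
    b: denominator coefficients b_1..b_n (the constant term is fixed to 1).
    """
    p = sum(a_j * x**j for j, a_j in enumerate(a))
    q = sum(b_k * x**(k + 1) for k, b_k in enumerate(b))
    return p / (1.0 + abs(q))


# Classical [3/2] Pade approximant of tanh: x(15 + x^2) / (15 + 6x^2),
# divided through by 15 so that the denominator starts at 1.
TANH_A = [0.0, 1.0, 0.0, 1.0 / 15.0]
TANH_B = [0.0, 0.4]


def max_abs_error(lo=-2.0, hi=2.0, steps=401):
    """Largest |rational - tanh| on an evenly spaced grid over [lo, hi]."""
    xs = [lo + (hi - lo) * i / (steps - 1) for i in range(steps)]
    return max(abs(safe_rational(x, TANH_A, TANH_B) - math.tanh(x)) for x in xs)


if __name__ == "__main__":
    print(f"max |rational - tanh| on [-2, 2]: {max_abs_error():.4f}")
```

Even with only six coefficients the curve stays close to tanh around the origin; higher degrees and learned coefficients tighten the fit over a wider range.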

Rationals are designed to be optimized by gradient descent and can discover good properties of activation functions during learning (cf. Recurrent Rational Networks).
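
The idea of learning the coefficients by gradient descent can be sketched without any framework. The example below is an illustration only, under stated assumptions: it uses hand-derived gradients for the form P(x) / (1 + |Q(x)|), fits tanh on a small grid, and every name in it (`forward`, `loss_and_grads`, `fit`) is hypothetical. The library itself relies on PyTorch autograd and CUDA kernels instead.

```python
import math


def forward(x, a, b):
    """Return (F, P, Q) with F = P / (1 + |Q|)."""
    p = sum(a_j * x**j for j, a_j in enumerate(a))
    q = sum(b_k * x**(k + 1) for k, b_k in enumerate(b))
    return p / (1.0 + abs(q)), p, q


def loss_and_grads(xs, ys, a, b):
    """Mean squared error and its gradients with respect to a and b."""
    n = len(xs)
    loss, grad_a, grad_b = 0.0, [0.0] * len(a), [0.0] * len(b)
    for x, y in zip(xs, ys):
        f, p, q = forward(x, a, b)
        denom = 1.0 + abs(q)
        err = f - y
        loss += err * err / n
        sign_q = (q > 0) - (q < 0)  # subgradient of |q|, taken as 0 at q == 0
        for j in range(len(a)):
            # dF/da_j = x^j / (1 + |Q|)
            grad_a[j] += 2.0 * err * x**j / denom / n
        for k in range(len(b)):
            # dF/db = -P * sign(Q) * x^(k+1) / (1 + |Q|)^2 for the (k+1)-th power
            grad_b[k] += 2.0 * err * (-p * sign_q * x**(k + 1)) / denom**2 / n
    return loss, grad_a, grad_b


def fit(steps=500, lr=0.005):
    """Run plain full-batch gradient descent; return (initial_loss, final_loss)."""
    xs = [i / 10.0 for i in range(-20, 21)]  # grid on [-2, 2]
    ys = [math.tanh(x) for x in xs]          # target activation
    a = [0.0, 1.0, 0.0, 0.0]                 # start near the identity
    b = [0.1, 0.1]                           # non-zero so sign(Q) is informative
    initial_loss, _, _ = loss_and_grads(xs, ys, a, b)
    for _ in range(steps):
        _, grad_a, grad_b = loss_and_grads(xs, ys, a, b)
        a = [w - lr * g for w, g in zip(a, grad_a)]
        b = [w - lr * g for w, g in zip(b, grad_b)]
    final_loss, _, _ = loss_and_grads(xs, ys, a, b)
    return initial_loss, final_loss


if __name__ == "__main__":
    before, after = fit()
    print(f"MSE before: {before:.5f}, after: {after:.5f}")
```

Inside a network the same coefficients simply receive their gradients through backpropagation, alongside the weights of the surrounding layers.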

Rationals evaluation on different tasks

They were first applied (as Padé Activation Units) to Supervised Learning (image classification) in Padé Activation Units:....

See the rational_sl GitHub repository.

Rationals match or outperform common activations in terms of predictive performance and training time, and therefore relieve the network designer of having to commit to a potentially underperforming choice.

Recurrent Rational Functions were then introduced in Recurrent Rational Networks, and both Rational and Recurrent Rational Networks are evaluated on RL tasks. See the rational_rl GitHub repository.

2. Dependencies

We support MXNet, Keras, and PyTorch. Instructions for MXNet can be found here, and instructions for Keras here. The following README instructions assume that you want to use rational activations in PyTorch.

PyTorch>=1.4.0
CUDA>=10.2

3. Installation

You can install the rational_activations module with pip, with one caveat:

‼️ rational_activations is currently compatible with torch==1.9.0 by default ‼️

If you use TensorFlow or MXNet, or PyTorch with torch==1.9.0, use the pip commands below. For any other torch version, or if the pip commands do not work on your machine, install from source as described further down.

TensorFlow or MXNet (and `torch==1.9.0`)

 pip3 install -U pip wheel
 pip3 install torch rational_activations

Other CUDA/PyTorch

For any other torch version, please install from source. After cloning, edit requirements.txt to match your torch version before installing the requirements:

 pip3 install airspeed  # to compile the CUDA templates
 git clone https://github.com/ml-research/rational_activations.git
 cd rational_activations
 pip3 install -r requirements.txt --user
 python3 setup.py install --user
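
The choice between the two installation routes can be expressed as a small helper. This is a hypothetical convenience script, not part of rational_activations: `install_route` and `PIP_COMPATIBLE_TORCH` are names made up here, and the only rule encoded is the one stated in this README (the pip package targets torch==1.9.0, other torch versions build from source).

```python
import importlib.util

PIP_COMPATIBLE_TORCH = "1.9.0"  # torch version this README names for the pip package


def install_route(torch_version: str) -> str:
    """Return "pip" or "source" for a torch version string such as "1.9.0+cu111"."""
    base = torch_version.split("+")[0]  # drop a local build tag like "+cu111"
    return "pip" if base == PIP_COMPATIBLE_TORCH else "source"


if __name__ == "__main__":
    if importlib.util.find_spec("torch") is None:
        print("torch is not installed yet")
    else:
        import torch

        version = str(torch.__version__)
        print(f"torch {version}: install rational_activations via {install_route(version)}")
```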

If you encounter any trouble installing rational, please contact this person.

4. Using Rational in Neural Networks

Rational can be integrated in the same way as any other common activation function.

import torch
from rational.torch import Rational

# Example layer sizes (placeholders; pick values that fit your data)
D_in, H, D_out = 64, 32, 10

model = torch.nn.Sequential(
    torch.nn.Linear(D_in, H),
    Rational(),  # e.g. instead of torch.nn.ReLU()
    torch.nn.Linear(H, D_out),
)

Please also check the documentation 📔

5. Cite Us in your paper

@inproceedings{molina2019pade,
  title={Pad{\'e} Activation Units: End-to-end Learning of Flexible Activation Functions in Deep Networks},
  author={Molina, Alejandro and Schramowski, Patrick and Kersting, Kristian},
  booktitle={International Conference on Learning Representations},
  year={2019}
}

@article{delfosse2021recurrent,
  title={Recurrent Rational Networks},
  author={Delfosse, Quentin and Schramowski, Patrick and Molina, Alejandro and Kersting, Kristian},
  journal={arXiv preprint arXiv:2102.09407},
  year={2021}
}

@misc{delfosse2020rationals,
  author = {Delfosse, Quentin and Schramowski, Patrick and Molina, Alejandro and Beck, Nils and Hsu, Ting-Yu and Kashef, Yasien and Rüling-Cachay, Salva and Zimmermann, Julius},
  title = {Rational Activation functions},
  year = {2020},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished={\url{https://github.com/ml-research/rational_activations}}
}
