Fastformer

Notes from the authors

PyTorch and Keras implementations of Fastformer. The Keras version includes only the core Fastformer attention module. The PyTorch version is written in the style of Hugging Face Transformers. The Jupyter notebooks contain quickstart code for text classification on AG's News (without pretrained word embeddings, for simplicity) and can be run directly.

In our experiments, we noticed that not all tasks need the FFNN, residual connections, layer normalization, or even position embeddings. For example, in news recommendation we find it is better to use Fastformer directly, without layer normalization or position embeddings. In Ad CVR prediction, however, both position embeddings and layer normalization are needed.
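To make the point about optional components concrete, here is a minimal single-head NumPy sketch of an additive-attention block with position embeddings and layer normalization as switches. It is not the code from this repository: it follows the general additive-attention scheme from the Fastformer paper as we understand it, omits the FFNN and multi-head logic, and every name in it (fastformer_block, init_params, use_position_embedding, use_layer_norm, the parameter keys) is a hypothetical choice made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-6):
    # Normalize each token vector to zero mean and unit variance (no learned scale/shift).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def additive_attention_core(x, params):
    """Single-head additive attention over x of shape (seq_len, dim)."""
    d = x.shape[-1]
    q, k, v = x @ params["Wq"], x @ params["Wk"], x @ params["Wv"]
    # Pool all queries into one global query vector.
    alpha = softmax(q @ params["wq"] / np.sqrt(d), axis=0)  # (seq_len,)
    global_q = alpha @ q                                     # (dim,)
    # Mix the global query into every key element-wise, then pool into a global key.
    p = k * global_q
    beta = softmax(p @ params["wk"] / np.sqrt(d), axis=0)
    global_k = beta @ p
    # Mix the global key into every value, transform, and add the queries back.
    u = v * global_k
    return u @ params["Wo"] + q


def fastformer_block(x, params, use_position_embedding=True, use_layer_norm=True):
    # Hypothetical switches mirroring the task-dependent choices described above.
    if use_position_embedding:
        x = x + params["pos"][: x.shape[0]]
    out = additive_attention_core(x, params)
    if use_layer_norm:
        out = layer_norm(out)
    return out


def init_params(dim, max_len, seed=0):
    # Random weights for demonstration only; a real model would learn these.
    rng = np.random.default_rng(seed)
    params = {name: rng.normal(0, 0.1, (dim, dim)) for name in ("Wq", "Wk", "Wv", "Wo")}
    params["wq"] = rng.normal(0, 0.1, dim)
    params["wk"] = rng.normal(0, 0.1, dim)
    params["pos"] = rng.normal(0, 0.1, (max_len, dim))
    return params


params = init_params(dim=8, max_len=16)
tokens = np.random.default_rng(1).normal(size=(5, 8))

# News-recommendation style: no position embedding, no layer normalization.
news_rec = fastformer_block(tokens, params, use_position_embedding=False, use_layer_norm=False)
# Ad CVR style: both enabled.
ad_cvr = fastformer_block(tokens, params, use_position_embedding=True, use_layer_norm=True)
print(news_rec.shape, ad_cvr.shape)
```

With both switches off, the block treats its input as an unordered set: permuting the input tokens simply permutes the output rows. That may be one reason the plain variant suits settings where token order carries little signal, though the notes above report only the empirical finding.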

Keras version: 2.2.4 (may not be compatible with higher versions)

TensorFlow version: 1.12 to 1.15 (may be compatible with lower versions)

PyTorch version: 1.6.0 (may be compatible with higher/lower versions)
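Before running the notebooks, it can help to check whether an installed version falls inside the ranges listed above. The sketch below is a hypothetical helper, not part of this repository; the function names and the TESTED table are illustrative, and the version strings are passed in explicitly (in a real environment they would typically come from keras.__version__, tf.__version__, or torch.__version__).

```python
def parse_version(text):
    """Turn '1.15.0' or '1.6.0+cu101' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


# Ranges copied from the notes above as (lowest tested, highest tested).
TESTED = {
    "keras": ("2.2.4", "2.2.4"),
    "tensorflow": ("1.12", "1.15"),
    "torch": ("1.6.0", "1.6.0"),
}


def in_tested_range(installed, low, high):
    # Compare only as many components as each bound specifies,
    # so '1.15.5' still counts as within an upper bound of '1.15'.
    v = parse_version(installed)
    lo, hi = parse_version(low), parse_version(high)
    return lo <= v[: len(lo)] and v[: len(hi)] <= hi


def describe(package, installed):
    low, high = TESTED[package]
    status = "within" if in_tested_range(installed, low, high) else "outside"
    return f"{package} {installed} is {status} the tested range {low} to {high}"


print(describe("tensorflow", "1.15.5"))
print(describe("keras", "2.3.0"))
```

A version outside the tested range is not necessarily broken; as the notes say, other versions may or may not be compatible.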

Citation

@article{wu2021fastformer,
  title={Fastformer: Additive Attention Can Be All You Need},
  author={Wu, Chuhan and Wu, Fangzhao and Qi, Tao and Huang, Yongfeng},
  journal={arXiv preprint arXiv:2108.09084},
  year={2021}
}
