Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Last update: Dec 15, 2022

Overview

This is a fork of Fairseq(-py) with implementations of the following models:

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

An NMT models with two-dimensional convolutions to jointly encode the source and the target sequences.

Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models.

See README.

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See README.

Fairseq Requirements and Installation

PyTorch version >= 1.4.0
Python version >= 3.6
For training new models, you'll also need an NVIDIA GPU and NCCL

Installing Fairseq

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

@InProceedings{elbayad18conll,
    author ="Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
    title = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
    booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
    year = "2018",
 }

For our wait-k models, please cite:

@article{elbayad20waitk,
    title={Efficient Wait-k Models for Simultaneous Machine Translation},
    author={Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
    journal={arXiv preprint arXiv:2005.08595},
    year={2020}
}

For Fairseq, please cite:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Related tags

Overview

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Efficient Wait-k Models for Simultaneous Machine Translation

Fairseq Requirements and Installation

License

Citation

Owner

Maha

A collection of resources on GAN Inversion.

Official implementation of the method ContIG, for self-supervised learning from medical imaging with genomics

Code for paper "A Critical Assessment of State-of-the-Art in Entity Alignment" (https://arxiv.org/abs/2010.16314)

Scenarios, tutorials and demos for Autonomous Driving

Ἀνατομή is a PyTorch library to analyze representation of neural networks

某学校选课系统GIF验证码数据集 + Baseline模型 + 上下游相关工具

Homepage of paper: Paint Transformer: Feed Forward Neural Painting with Stroke Prediction, ICCV 2021.

[CIKM 2021] Enhancing Aspect-Based Sentiment Analysis with Supervised Contrastive Learning

PyTorch implementation of SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching

PRIME: A Few Primitives Can Boost Robustness to Common Corruptions

Pytorch and Keras Implementations of Hyperspectral Image Classification -- Traditional to Deep Models: A Survey for Future Prospects.

Transferable Unrestricted Attacks, which won 1st place in CVPR’21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet.

Re-implementation of the Noise Contrastive Estimation algorithm for pyTorch, following "Noise-contrastive estimation: A new estimation principle for unnormalized statistical models." (Gutmann and Hyvarinen, AISTATS 2010)

Official PyTorch implementation for paper "Efficient Two-Stage Detection of Human–Object Interactions with a Novel Unary–Pairwise Transformer"

Official repo for BMVC2021 paper ASFormer: Transformer for Action Segmentation

Disentangled Lifespan Face Synthesis

EssentialMC2 Video Understanding

Point Cloud Denoising input segmentation output raw point-cloud valid/clear fog rain de-noised Abstract Lidar sensors are frequently used in environme

Remote sensing change detection tool based on PaddlePaddle

The code for Expectation-Maximization Attention Networks for Semantic Segmentation (ICCV'2019 Oral)