Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Last update: Dec 15, 2022

Overview

This is a fork of Fairseq(-py) with implementations of the following models:

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

An NMT models with two-dimensional convolutions to jointly encode the source and the target sequences.

Pervasive Attention also provides an extensive decoding grid that we leverage to efficiently train wait-k models.

See README.

Efficient Wait-k Models for Simultaneous Machine Translation

Transformer Wait-k models (Ma et al., 2019) with unidirectional encoders and with joint training of multiple wait-k paths.

See README.

Fairseq Requirements and Installation

PyTorch version >= 1.4.0
Python version >= 3.6
For training new models, you'll also need an NVIDIA GPU and NCCL

Installing Fairseq

git clone https://github.com/elbayadm/attn2d
cd attn2d
pip install --editable .

License

fairseq(-py) is MIT-licensed. The license applies to the pre-trained models as well.

Citation

For Pervasive Attention, please cite:

@InProceedings{elbayad18conll,
    author ="Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob",
    title = "Pervasive Attention: 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction",
    booktitle = "Proceedings of the 22nd Conference on Computational Natural Language Learning",
    year = "2018",
 }

For our wait-k models, please cite:

@article{elbayad20waitk,
    title={Efficient Wait-k Models for Simultaneous Machine Translation},
    author={Elbayad, Maha and Besacier, Laurent and Verbeek, Jakob},
    journal={arXiv preprint arXiv:2005.08595},
    year={2020}
}

For Fairseq, please cite:

@inproceedings{ott2019fairseq,
  title = {fairseq: A Fast, Extensible Toolkit for Sequence Modeling},
  author = {Myle Ott and Sergey Edunov and Alexei Baevski and Angela Fan and Sam Gross and Nathan Ng and David Grangier and Michael Auli},
  booktitle = {Proceedings of NAACL-HLT 2019: Demonstrations},
  year = {2019},
}

Pervasive Attention: 2D Convolutional Networks for Sequence-to-Sequence Prediction

Related tags

Overview

Pervasive Attention - 2D Convolutional Neural Networks for Sequence-to-Sequence Prediction

Efficient Wait-k Models for Simultaneous Machine Translation

Fairseq Requirements and Installation

License

Citation

Owner

Maha

Search Git commits in natural language

IMS-Toucan is a toolkit to train state-of-the-art Speech Synthesis models

Code for CVPR 2021 paper: Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

Bidirectional Variational Inference for Non-Autoregressive Text-to-Speech (BVAE-TTS)

Optimal Transport Tools (OTT), A toolbox for all things Wasserstein.

Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents

DomainWordsDict, Chinese words dict that contains more than 68 domains, which can be used as text classification、knowledge enhance task

A Multi-modal Model Chinese Spell Checker Released on ACL2021.

Unsupervised text tokenizer focused on computational efficiency

Officile code repository for "A Game-Theoretic Perspective on Risk-Sensitive Reinforcement Learning"

Guide: Finetune GPT2-XL (1.5 Billion Parameters) and GPT-NEO (2.7 B) on a single 16 GB VRAM V100 Google Cloud instance with Huggingface Transformers using DeepSpeed

Clone a voice in 5 seconds to generate arbitrary speech in real-time

Translates basic English sentences into the Huna language (hoo-NAH)

मराठी भाषा वाचविण्याचा एक प्रयास. इंग्रजी ते मराठीचा शब्दकोश. An attempt to preserve the Marathi language. A lightweight and ad free English to Marathi thesaurus.

BERTAC (BERT-style transformer-based language model with Adversarially pretrained Convolutional neural network)

Generating new names based on trends in data using GPT2 (Transformer network)

Machine Learning Course Project, IMDB movie review sentiment analysis by lstm, cnn, and transformer

Stack based programming language that compiles to x86_64 assembly or can alternatively be interpreted in Python

Fixes mojibake and other glitches in Unicode text, after the fact.

NeoDays-based tileset for the roguelike CDDA (Cataclysm Dark Days Ahead)