Implementation of "Glancing Transformer for Non-Autoregressive Neural Machine Translation"

Last update: Jan 09, 2023

Related tags

Deep Learning GLAT

Overview

GLAT

Implementation for the ACL2021 paper "Glancing Transformer for Non-Autoregressive Neural Machine Translation"

Requirements

Python >= 3.7
Pytorch >= 1.5.0
Fairseq 1.0.0a0

Preparation

Train an autoregressive Transformer according to the instructions in Fairseq.

Use the trained autoregressive Transformer to generate target sentences for the training set.

Binarize the distilled training data.

input_dir=path_to_raw_text_data
data_dir=path_to_binarized_output
src=source_language
tgt=target_language
python3 fairseq_cli/preprocess.py --source-lang ${src} --target-lang ${tgt} --trainpref ${input_dir}/train \
    --validpref ${input_dir}/valid --testpref ${input_dir}/test --destdir ${data_dir}/ \
    --workers 32 --src-dict ${input_dir}/dict.${src}.txt --tgt-dict {input_dir}/dict.${tgt}.txt

Train

save_path=path_for_saving_models
python3 train.py ${data_dir} --arch glat --noise full_mask --share-all-embeddings \
    --criterion glat_loss --label-smoothing 0.1 --lr 5e-4 --warmup-init-lr 1e-7 --stop-min-lr 1e-9 \
    --lr-scheduler inverse_sqrt --warmup-updates 4000 --optimizer adam --adam-betas '(0.9, 0.999)' \
    --adam-eps 1e-6 --task translation_lev_modified --max-tokens 8192 --weight-decay 0.01 --dropout 0.1 \
    --encoder-layers 6 --encoder-embed-dim 512 --decoder-layers 6 --decoder-embed-dim 512 --fp16 \
    --max-source-positions 1000 --max-target-positions 1000 --max-update 300000 --seed 0 --clip-norm 5\
    --save-dir ${save_path} --src-embedding-copy --pred-length-offset --log-interval 1000 \
    --eval-bleu --eval-bleu-args '{"iter_decode_max_iter": 0, "iter_decode_with_beam": 1}' \
    --eval-tokenized-bleu --eval-bleu-remove-bpe --best-checkpoint-metric bleu \
    --maximize-best-checkpoint-metric --decoder-learned-pos --encoder-learned-pos \
    --apply-bert-init --activation-fn gelu --user-dir glat_plugins \

Inference

checkpoint_path=path_to_your_checkpoint
python3 fairseq_cli/generate.py ${data_dir} --path ${checkpoint_path} --user-dir glat_plugins \
    --task translation_lev_modified --remove-bpe --max-sentences 20 --source-lang ${src} --target-lang ${tgt} \
    --quiet --iter-decode-max-iter 0 --iter-decode-eos-penalty 0 --iter-decode-with-beam 1 --gen-subset test

The script for averaging checkpoints is scripts/average_checkpoints.py

Implementation of "Glancing Transformer for Non-Autoregressive Neural Machine Translation"

Related tags

Overview

GLAT

Requirements

Preparation

Train

Inference

Owner

git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

PyTorch implementation of the YOLO (You Only Look Once) v2

TEA: A Sequential Recommendation Framework via Temporally Evolving Aggregations

The Most Efficient Temporal Difference Learning Framework for 2048

Amazing-Python-Scripts - 🚀 Curated collection of Amazing Python scripts from Basics to Advance with automation task scripts.

An implementation of Deep Graph Infomax (DGI) in PyTorch

RodoSol-ALPR Dataset

LightLog is an open source deep learning based lightweight log analysis tool for log anomaly detection.

Image augmentation library in Python for machine learning.

PyJokes - Joking around with Python library pyjokes

Learning Correspondence from the Cycle-consistency of Time (CVPR 2019)

FlowTorch is a PyTorch library for learning and sampling from complex probability distributions using a class of methods called Normalizing Flows

Python and C++ implementation of "MarkerPose: Robust real-time planar target tracking for accurate stereo pose estimation". Accepted at LXCV @ CVPR 2021.

TUPÃ was developed to analyze electric field properties in molecular simulations

A Python library for working with arbitrary-dimension hypercomplex numbers following the Cayley-Dickson construction of algebras.

A Python module for the generation and training of an entry-level feedforward neural network.

Stock-Prediction - prediction of stock market movements using sentiment analysis and deep learning.

Books, Presentations, Workshops, Notebook Labs, and Model Zoo for Software Engineers and Data Scientists wanting to learn the TF.Keras Machine Learning framework

Official PyTorch implementation of StyleGAN3

When BERT Plays the Lottery, All Tickets Are Winning