The repository for the paper: Multilingual Translation via Grafting Pre-trained Language Models

Last update: Dec 14, 2022

Graformer

Graformer (named BridgeTransformer in the code) is a sequence-to-sequence model, mainly for neural machine translation. It improves multilingual translation by taking advantage of pre-trained (masked) language models: a pre-trained encoder (BERT) and a pre-trained decoder (GPT). The code is based on Fairseq.

Examples

You can start with run/run.sh after some minor modifications. The individual scripts correspond to the following tasks:

Pre-train the multilingual BERT (masked language model):
    run_arnold_multilingual_masked_lm_6e6d.sh

Pre-train the multilingual GPT (language model):
    run_arnold_multilingual_lm_6e6d.sh

Train Graformer:
    run_arnold_multilingual_graft_transformer_12e12d_ted.sh

Run inference with Graformer:
    run_arnold_multilingual_graft_inference_ted.sh
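
The four scripts above can be read as stages of one pipeline. The following is a minimal sketch, not part of the repository, that only builds the shell commands for a chosen set of stages and prints them without running anything. The script names are copied from the list above. Everything else is an assumption made for illustration: that the scripts sit in the run/ directory next to run/run.sh, that they are invoked with bash, and that the stages run in the order BERT, GPT, Graformer, inference. The names STAGES, ORDER, and build_commands are hypothetical.

```python
from pathlib import Path

# Stage name -> script name. The script names come from the list above;
# the short stage names are hypothetical labels chosen for this sketch.
STAGES = {
    "bert": "run_arnold_multilingual_masked_lm_6e6d.sh",
    "gpt": "run_arnold_multilingual_lm_6e6d.sh",
    "graformer": "run_arnold_multilingual_graft_transformer_12e12d_ted.sh",
    "inference": "run_arnold_multilingual_graft_inference_ted.sh",
}

# Assumed stage order: pre-train both language models, then train the
# grafted model, then decode with it.
ORDER = ["bert", "gpt", "graformer", "inference"]


def build_commands(stages=ORDER, script_dir="run"):
    """Return one ['bash', path] command per requested stage, in ORDER.

    script_dir defaults to 'run', an assumption based on run/run.sh.
    """
    unknown = [s for s in stages if s not in STAGES]
    if unknown:
        raise ValueError(f"unknown stage(s): {unknown}")
    wanted = set(stages)
    return [
        ["bash", (Path(script_dir) / STAGES[s]).as_posix()]
        for s in ORDER
        if s in wanted
    ]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for cmd in build_commands():
        print(" ".join(cmd))
```

For example, build_commands(["graformer", "inference"]) returns only the last two commands, which fits the case where the released pre-trained models are used instead of pre-training from scratch. Each script will still need the edits mentioned above (paths, data, and so on) before it can actually be run.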

Released Models

We release our pre-trained mBERT and mGPT, along with the trained Graformer model, here.

TensorFlow Version

We will provide a TensorFlow version in NeurST, a popular toolkit for sequence processing.

Citation

Please cite as:

@inproceedings{sun2021mulilingual,
    title = "Multilingual Translation via Grafting Pre-trained Language Models",
    author = "Sun, Zewei and Wang, Mingxuan and Li, Lei",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    year = "2021"
}

Contact

If you have any questions, please feel free to contact me: [email protected]
