PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Overview

Non-Autoregressive Transformer

Code release for Non-Autoregressive Neural Machine Translation by Jiatao Gu, James Bradbury, Caiming Xiong, Victor O.K. Li, and Richard Socher.

Requires PyTorch 0.3, torchtext 0.2.1, and SpaCy.

The pipeline for training a NAT model for a given language pair includes:

  1. run_alignment_wmt_LANG.sh (runs fast_align for alignment supervision)
  2. run_LANG.sh (trains an autoregressive model)
  3. run_LANG_decode.sh (produces the distillation corpus for training the NAT)
  4. run_LANG_fast.sh (trains the NAT model)
  5. run_LANG_fine.sh (fine-tunes the NAT model)
Owner
Salesforce
A variety of vendor agnostic projects which power Salesforce
Salesforce
FewCLUE: 为中文NLP定制的小样本学习测评基准

FewCLUE: 为中文NLP定制的小样本学习测评基准

CLUE benchmark 387 Jan 04, 2023
Easy to start. Use deep nerual network to predict the sentiment of movie review.

Easy to start. Use deep nerual network to predict the sentiment of movie review. Various methods, word2vec, tf-idf and df to generate text vectors. Various models including lstm and cov1d. Achieve f1

1 Nov 19, 2021
Code for ACL 2021 main conference paper "Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances".

Conversations are not Flat: Modeling the Intrinsic Information Flow between Dialogue Utterances This repository contains the code and pre-trained mode

ICTNLP 90 Dec 27, 2022
MPNet: Masked and Permuted Pre-training for Language Understanding

MPNet MPNet: Masked and Permuted Pre-training for Language Understanding, by Kaitao Song, Xu Tan, Tao Qin, Jianfeng Lu, Tie-Yan Liu, is a novel pre-tr

Microsoft 228 Nov 21, 2022
pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Python binding for Morfologik Morfologik is Polish morphological analyzer. For more information see http://github.com/morfologik/morfologik-stemming/

Damian Mirecki 18 Dec 29, 2021
A NLP program: tokenize method, PoS Tagging with deep learning

IRIS NLP SYSTEM A NLP program: tokenize method, PoS Tagging with deep learning Report Bug · Request Feature Table of Contents About The Project Built

Zakaria 7 Dec 13, 2022
BeautyNet is an AI powered model which can tell you whether you're beautiful or not.

BeautyNet BeautyNet is an AI powered model which can tell you whether you're beautiful or not. Download Dataset from here:https://www.kaggle.com/gpios

Ansh Gupta 0 May 06, 2022
Simple text to phones converter for multiple languages

Phonemizer -- foʊnmaɪzɚ The phonemizer allows simple phonemization of words and texts in many languages. Provides both the phonemize command-line tool

CoML 762 Dec 29, 2022
A Python package implementing a new model for text classification with visualization tools for Explainable AI :octocat:

A Python package implementing a new model for text classification with visualization tools for Explainable AI 🍣 Online live demos: http://tworld.io/s

Sergio Burdisso 285 Jan 02, 2023
Longformer: The Long-Document Transformer

Longformer Longformer and LongformerEncoderDecoder (LED) are pretrained transformer models for long documents. ***** New December 1st, 2020: Longforme

AI2 1.6k Dec 29, 2022
InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective

InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective This is the official code base for our ICLR 2021 paper

AI Secure 71 Nov 25, 2022
LV-BERT: Exploiting Layer Variety for BERT (Findings of ACL 2021)

LV-BERT Introduction In this repo, we introduce LV-BERT by exploiting layer variety for BERT. For detailed description and experimental results, pleas

Weihao Yu 14 Aug 24, 2022
PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit.

PyKaldi is a Python scripting layer for the Kaldi speech recognition toolkit. It provides easy-to-use, low-overhead, first-class Python wrappers for t

922 Dec 31, 2022
Basic Utilities for PyTorch Natural Language Processing (NLP)

Basic Utilities for PyTorch Natural Language Processing (NLP) PyTorch-NLP, or torchnlp for short, is a library of basic utilities for PyTorch NLP. tor

Michael Petrochuk 2.1k Jan 01, 2023
An IVR Chatbot which can exponentially reduce the burden of companies as well as can improve the consumer/end user experience.

IVR-Chatbot Achievements 🏆 Team Uhtred won the Maverick 2.0 Bot-a-thon 2021 organized by AbInbev India. ❓ Problem Statement As we all know that, lot

ARYAMAAN PANDEY 9 Dec 08, 2022
translate using your voice

speech-to-text-translator Usage translate using your voice description this project makes translating a word easy, all you have to do is speak and...

1 Oct 18, 2021
A collection of Korean Text Datasets ready to use using Tensorflow-Datasets.

tfds-korean A collection of Korean Text Datasets ready to use using Tensorflow-Datasets. TensorFlow-Datasets를 이용한 한국어/한글 데이터셋 모음입니다. Dataset Catalog |

Jeong Ukjae 20 Jul 11, 2022
SimBERT升级版(SimBERTv2)!

RoFormer-Sim RoFormer-Sim,又称SimBERTv2,是我们之前发布的SimBERT模型的升级版。 介绍 https://kexue.fm/archives/8454 训练 tensorflow 1.14 + keras 2.3.1 + bert4keras 0.10.6 下载

317 Dec 23, 2022
Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

textgenrnn Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly tr

Max Woolf 4.8k Dec 30, 2022
ChessCoach is a neural network-based chess engine capable of natural-language commentary.

ChessCoach is a neural network-based chess engine capable of natural-language commentary.

Chris Butner 380 Dec 03, 2022