Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples"

KSTER

Code for our EMNLP 2021 paper "Learning Kernel-Smoothed Machine Translation with Retrieved Examples" [paper].

Usage

Download the processed datasets from this site. You can also download the built databases from this site and download the model checkpoints from this site.

Train a general-domain base model

Take English -> German translation as an example.

export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m joeynmt train configs/transformer_base_wmt14_en2de.yaml

Finetune the trained base model on domain-specific datasets

Take English -> German translation in the Koran domain as an example.

export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m joeynmt train configs/transformer_base_koran_en2de.yaml

Build database

Take English -> German translation in the Koran domain as an example. Here wmt14_en_de.transformer.ckpt is the path to the trained general-domain base model checkpoint.

mkdir -p database/koran_en_de_base
export CUDA_VISIBLE_DEVICES=0
python3 -m joeynmt build_database configs/transformer_base_koran_en2de.yaml \
        --ckpt wmt14_en_de.transformer.ckpt \
        --division train \
        --index_path database/koran_en_de_base/trained.index \
        --token_map_path database/koran_en_de_base/token_map \
        --embedding_path database/koran_en_de_base/embeddings.npy
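
To make the three output files more concrete, here is a minimal sketch of what retrieval over such a database amounts to. It is an illustration under assumptions, not the repository's code: the helper names (squared_l2, retrieve) and the toy data are hypothetical, and a brute-force search stands in for whatever index structure trained.index actually holds.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def retrieve(query, embeddings, token_map, top_k):
    """Return the top_k nearest examples as (squared_distance, target_token) pairs.

    Brute-force stand-in for an index lookup (hypothetical helper).
    """
    scored = ((squared_l2(query, emb), i) for i, emb in enumerate(embeddings))
    nearest = heapq.nsmallest(top_k, scored)
    return [(dist, token_map[i]) for dist, i in nearest]


# Toy database: one embedding per stored example, and the target token
# each embedding maps to (the role we assume token_map plays).
embeddings = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
token_map = ["Gott", "Herr", "Gott", "Buch"]

neighbors = retrieve([0.1, 0.0], embeddings, token_map, top_k=2)
print([token for _, token in neighbors])
```

In this sketch, top_k plays the same role as the --top_k option used in the commands below.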

Train the bandwidth estimator and weight estimator in KSTER

Take English -> German translation in the Koran domain as an example.

export CUDA_VISIBLE_DEVICES=0,1,2,3
python3 -m joeynmt combiner_train configs/transformer_base_koran_en2de.yaml \
        --ckpt wmt14_en_de.transformer.ckpt \
        --combiner dynamic_combiner \
        --top_k 16 \
        --kernel laplacian \
        --index_path database/koran_en_de_base/trained.index \
        --token_map_path database/koran_en_de_base/token_map \
        --embedding_path database/koran_en_de_base/embeddings.npy \
        --in_memory True
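
As a rough intuition for what is being trained here, the sketch below maps a query and its retrieved keys to a positive bandwidth and a mixing weight in (0, 1). Everything in it is an assumption made for illustration: the feature choice (query concatenated with the mean retrieved key), the single linear layers, and the parameter names are hypothetical, and the estimator networks in the repository may look different.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def estimate(query, neighbor_keys, params):
    """Estimate (bandwidth, mixing_weight) for one decoding step.

    Hypothetical parameterization: linear layers over [query; mean key].
    """
    dim = len(query)
    mean_key = [sum(k[j] for k in neighbor_keys) / len(neighbor_keys)
                for j in range(dim)]
    features = list(query) + mean_key
    # exp keeps the bandwidth strictly positive.
    bandwidth = math.exp(dot(params["w_bandwidth"], features)
                         + params["b_bandwidth"])
    # sigmoid keeps the mixing weight strictly between 0 and 1.
    mixing_weight = sigmoid(dot(params["w_weight"], features)
                            + params["b_weight"])
    return bandwidth, mixing_weight


# Toy parameters; in training these would be learned.
params = {
    "w_bandwidth": [0.0, 0.0, 0.0, 0.0],
    "b_bandwidth": math.log(10.0),
    "w_weight": [0.0, 0.0, 0.0, 0.0],
    "b_weight": 0.0,
}
bandwidth, mixing_weight = estimate([0.1, 0.0], [[0.0, 0.0], [1.0, 0.0]], params)
print(bandwidth, mixing_weight)
```

With all weights at zero, the estimator falls back to constant values, which corresponds to the pre-specified setting used by the static combiner.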

Inference

We unify inference for the base model, finetuned or joint-trained models, kNN-MT, and KSTER under the concept of a combiner (see joeynmt/combiners.py).

| Combiner type | Methods | Description |
| --- | --- | --- |
| NoCombiner | Base, Finetuning, Joint-training | Runs inference directly, without retrieval. |
| StaticCombiner | kNN-MT | Retrieves similar examples during inference. mixing_weight and bandwidth are pre-specified. |
| DynamicCombiner | KSTER | Retrieves similar examples during inference. mixing_weight and bandwidth are dynamically estimated. |
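
The role of mixing_weight, bandwidth, and the kernel can be sketched as follows. This is a minimal illustration, not the code in joeynmt/combiners.py: the function names are hypothetical, the kernel forms (exp(-d^2 / bandwidth) for gaussian, exp(-d / bandwidth) for laplacian) are assumed, and it is also an assumption that mixing_weight scales the retrieved distribution rather than the model distribution.

```python
import math
from collections import defaultdict


def kernel(distance_sq, bandwidth, kind):
    """Unnormalized kernel weight for a squared L2 distance (assumed forms)."""
    if kind == "gaussian":
        return math.exp(-distance_sq / bandwidth)
    if kind == "laplacian":
        return math.exp(-math.sqrt(distance_sq) / bandwidth)
    raise ValueError(f"unknown kernel: {kind}")


def combine(model_probs, neighbors, mixing_weight, bandwidth, kind="gaussian"):
    """Mix model probabilities with a kernel-weighted distribution over
    the target tokens of retrieved examples.

    model_probs: dict mapping token -> probability from the base model.
    neighbors:   list of (squared_distance, token) pairs.
    """
    weights = defaultdict(float)
    for distance_sq, token in neighbors:
        weights[token] += kernel(distance_sq, bandwidth, kind)
    total = sum(weights.values())
    if total == 0.0:
        # Nothing retrieved (or all weights underflowed): keep the model output.
        return dict(model_probs)
    vocab = set(model_probs) | set(weights)
    return {
        token: mixing_weight * weights[token] / total
        + (1.0 - mixing_weight) * model_probs.get(token, 0.0)
        for token in vocab
    }


model_probs = {"Gott": 0.5, "Herr": 0.3, "Buch": 0.2}
neighbors = [(0.0, "Gott"), (4.0, "Herr")]
mixed = combine(model_probs, neighbors, mixing_weight=0.7, bandwidth=10.0)
print(mixed)
```

Under these assumptions, a static combiner passes fixed values for mixing_weight and bandwidth, while a dynamic combiner would supply values estimated at each decoding step.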

Inference with NoCombiner for Base model

Take English -> German translation in the Koran domain as an example.

export CUDA_VISIBLE_DEVICES=0
python3 -m joeynmt test configs/transformer_base_koran_en2de.yaml \
        --ckpt wmt14_en_de.transformer.ckpt \
        --combiner no_combiner

Inference with StaticCombiner for kNN-MT

Take English -> German translation in the Koran domain as an example.

export CUDA_VISIBLE_DEVICES=0
python3 -m joeynmt test configs/transformer_base_koran_en2de.yaml \
        --ckpt wmt14_en_de.transformer.ckpt \
        --combiner static_combiner \
        --top_k 16 \
        --mixing_weight 0.7 \
        --bandwidth 10 \
        --kernel gaussian \
        --index_path database/koran_en_de_base/trained.index \
        --token_map_path database/koran_en_de_base/token_map

Inference with DynamicCombiner for KSTER

Take English -> German translation in the Koran domain as an example. Here koran_en_de.laplacian.combiner.ckpt is the path to the trained bandwidth estimator and weight estimator for the Koran domain.
The --in_memory option specifies whether to load the example embeddings into memory. Use --in_memory True for faster inference, or --in_memory False for lower memory usage.

export CUDA_VISIBLE_DEVICES=0
python3 -m joeynmt test configs/transformer_base_koran_en2de.yaml \
        --ckpt wmt14_en_de.transformer.ckpt \
        --combiner dynamic_combiner \
        --combiner_path koran_en_de.laplacian.combiner.ckpt \
        --top_k 16 \
        --kernel laplacian \
        --index_path database/koran_en_de_base/trained.index \
        --token_map_path database/koran_en_de_base/token_map \
        --embedding_path database/koran_en_de_base/embeddings.npy \
        --in_memory True
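
The speed versus memory trade-off behind --in_memory can be illustrated with NumPy alone. The sketch below is an assumption about how such a switch could work for a .npy file (full load versus memory mapping); whether the repository implements it this way is not confirmed here, and load_embeddings is a hypothetical helper.

```python
import os
import tempfile

import numpy as np


def load_embeddings(path, in_memory):
    """Load a .npy array fully into RAM, or memory-map it so that rows
    are read from disk only when they are accessed."""
    if in_memory:
        return np.load(path)
    return np.load(path, mmap_mode="r")


# Toy stand-in for embeddings.npy, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "embeddings.npy")
np.save(path, np.arange(12, dtype=np.float32).reshape(4, 3))

in_ram = load_embeddings(path, in_memory=True)
on_disk = load_embeddings(path, in_memory=False)

# Both variants return the same rows; only where the data lives differs.
rows = [2, 0]
same = np.array_equal(in_ram[rows], np.asarray(on_disk[rows]))
print(same)
```

The in-RAM array pays the full memory cost up front and serves lookups quickly; the memory-mapped array keeps memory use low at the cost of disk reads per lookup.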

See bash_scripts/test_*.sh for reproducing our results.
See logs/*.log for the logs of our results.

Acknowledgements

We build the models based on the joeynmt codebase.

Owner
jiangqn
Interested in natural language processing and machine learning.