Code for WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models.

Last update: Dec 29, 2022

Overview

WECHSEL

arXiv: https://arxiv.org/abs/2112.06598

Models from the paper are available on the HuggingFace Hub.

Installation

We distribute a Python package via PyPI:

pip install wechsel

Alternatively, clone the repository, install the dependencies listed in requirements.txt, and run the code in wechsel/.

Example usage

Transferring English roberta-base to Swahili:

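import torch
from transformers import AutoModel, AutoTokenizer
from datasets import load_dataset
from wechsel import WECHSEL, load_embeddings

# Load the English source tokenizer and model.
source_tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")

# Train a Swahili tokenizer with the same vocabulary size as the source tokenizer.
target_tokenizer = source_tokenizer.train_new_from_iterator(
    load_dataset("oscar", "unshuffled_deduplicated_sw", split="train")["text"],
    vocab_size=len(source_tokenizer)
)

# Set up WECHSEL with word embeddings for both languages and a bilingual dictionary.
wechsel = WECHSEL(
    load_embeddings("en"),
    load_embeddings("sw"),
    bilingual_dictionary="swahili"
)

# Compute embeddings for the Swahili vocabulary from the source model's input embeddings.
target_embeddings, info = wechsel.apply(
    source_tokenizer,
    target_tokenizer,
    model.get_input_embeddings().weight.detach().numpy(),
)

# Replace the model's input embeddings with the transferred ones.
model.get_input_embeddings().weight.data = torch.from_numpy(target_embeddings)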

# use `model` and `target_tokenizer` to continue training in Swahili!
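
The example relies on wechsel.apply to produce the new embedding matrix. As a rough intuition for what an initialization of this kind can look like, the toy sketch below builds each target subword embedding as a similarity-weighted average of the source model's subword embeddings, using static vectors that are assumed to already live in a shared, aligned space. It is a simplified illustration and not the library's implementation: the function name init_target_embeddings and the parameters k and temperature are hypothetical, and the tiny arrays are made up for demonstration.

```python
import numpy as np


def init_target_embeddings(source_static, target_static, source_model_emb,
                           k=2, temperature=0.1):
    """Toy similarity-weighted initialization (illustrative, hypothetical helper).

    source_static:    (n_source, d_static) static vectors of source subwords
    target_static:    (n_target, d_static) static vectors of target subwords,
                      assumed to be aligned to the same space
    source_model_emb: (n_source, d_model) input embeddings of the source model
    """
    # Normalize rows so that dot products are cosine similarities.
    s = source_static / np.linalg.norm(source_static, axis=1, keepdims=True)
    t = target_static / np.linalg.norm(target_static, axis=1, keepdims=True)
    sims = t @ s.T  # shape (n_target, n_source)

    # Pick the k most similar source subwords for each target subword.
    top = np.argsort(-sims, axis=1)[:, :k]
    top_sims = np.take_along_axis(sims, top, axis=1)

    # Softmax over the k neighbours, sharpened by the temperature.
    logits = top_sims / temperature
    logits -= logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)

    # Weighted average of the neighbours' source model embeddings.
    return np.einsum("tk,tkd->td", weights, source_model_emb[top])


# Made-up toy data: two source subwords, one target subword.
source_static = np.array([[1.0, 0.0], [0.0, 1.0]])
target_static = np.array([[1.0, 1.0]])
source_model_emb = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

target_emb = init_target_embeddings(source_static, target_static, source_model_emb)
print(target_emb.shape)
```

In this toy setup the target subword is equally similar to both source subwords, so its embedding is the plain mean of the two source rows. In practice, use wechsel.apply as shown in the example, which takes the tokenizers and the source embedding matrix directly.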

Bilingual dictionaries

We distribute 3276 bilingual dictionaries from English to other languages in dicts/ for use with WECHSEL.
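
To inspect a dictionary before passing it to WECHSEL, a small reader like the one below may help. It is a minimal sketch that assumes each line holds one tab-separated pair (English word, then its translation); this file format is an assumption and should be checked against the files in dicts/. The helper name read_bilingual_dictionary and the two example pairs are illustrative and not part of the wechsel package.

```python
import io


def read_bilingual_dictionary(lines):
    """Parse word pairs from an iterable of lines (hypothetical helper).

    Assumes one tab-separated pair per line; blank or malformed lines are skipped.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        parts = line.split("\t")
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed format
        pairs.append((parts[0], parts[1]))
    return pairs


# Illustrative in-memory data standing in for a dictionary file.
example = io.StringIO("water\tmaji\nbook\tkitabu\n")
pairs = read_bilingual_dictionary(example)
print(pairs)
```

The same function works on an open file handle, for example one obtained by opening a file from dicts/ with encoding="utf-8".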

Citation

Please cite WECHSEL as

@misc{minixhofer2021wechsel,
      title={WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models}, 
      author={Benjamin Minixhofer and Fabian Paischer and Navid Rekabsaz},
      year={2021},
      eprint={2112.06598},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Owner

Institute of Computational Perception
