xTune

Code for the ACL 2021 paper Consistency Regularization for Cross-Lingual Fine-Tuning.

Environment

Docker image: dancingsoul/pytorch:xTune

Install the fine-tuning code: pip install --user .
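
For example, the environment can be set up as follows (a minimal sketch: it assumes the image above is hosted on Docker Hub and that the NVIDIA container runtime is available; adjust the mount path to your checkout):

# pull the image and start a container with the project mounted
docker pull dancingsoul/pytorch:xTune
docker run --gpus all -it -v "$PWD":/workspace -w /workspace dancingsoul/pytorch:xTune bash
# inside the container, from the project root:
pip install --user .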

Data & Model Preparation

XTREME Datasets

  1. Create a download folder with mkdir -p download in the root of this project.
  2. Manually download panx_dataset (for NER) from [here][2] (note that it downloads as AmazonPhotos.zip) to the download directory.
  3. Run bash scripts/download_data.sh to download the remaining datasets. The dataset download code comes from the [XTREME official repo][1].

Note that we keep the labels in the test sets for easier evaluation. To prevent accidental evaluation on the test sets while running experiments, the [XTREME official repo][1] removes the labels of the test data during pre-processing and changes the order of the test sentences for cross-lingual sentence retrieval. If you use the XTREME official repo, replace csv.writer(fout, delimiter='\t') with csv.writer(fout, delimiter='\t', quoting=csv.QUOTE_NONE, quotechar='') in utils_preprocess.py.

Translations

XTREME provides translations for SQuAD v1.1 (train and dev sets only), MLQA, PAWS-X, TyDiQA-GoldP, XNLI, and XQuAD, which can be downloaded from [here][3]. Move the xtreme_translations folder to the download directory.

The target-language translations for panx and udpos are not provided by XTREME, so we obtained them with Google Translate. Our processed version can be downloaded from [here][4]; merge it into the xtreme_translations folder above.

Bi-lingual dictionaries

We obtain the bi-lingual dictionaries from the [MUSE][6] repo. For convenience, you can download them from [here][7] and move them to the download directory, i.e., ./download/dicts.
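
After the steps above, a quick sanity check of the download folder (a sketch; folder names other than dicts are approximate and may differ after extraction):

ls ./download
# expected to include (approximate):
# panx_dataset/  xtreme_translations/  dicts/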

Models

XLM-RoBERTa is supported. We use the [huggingface][5] format; the models can be downloaded with bash scripts/download_model.sh.

Fine-tuning Usage

Our default settings use Nvidia V100-32GB GPU cards. If you run into out-of-memory errors, reduce per_gpu_train_batch_size while increasing gradient_accumulation_steps, or use multi-GPU training.
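
The two flags trade off against each other: the effective batch size is their product (times the number of GPUs), so halving one while doubling the other leaves optimization unchanged apart from memory use and speed. A small sketch of the arithmetic (the concrete values are illustrative):

# effective batch size = per_gpu_train_batch_size * gradient_accumulation_steps * num_gpus
PER_GPU_TRAIN_BATCH_SIZE=8     # reduced from, e.g., 32 to fit a smaller GPU
GRADIENT_ACCUMULATION_STEPS=4  # increased from 1 to compensate
echo "effective batch size per GPU: $((PER_GPU_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS))"  # prints 32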

xTune consists of a two-stage training process.

  • Stage 1: fine-tuning with example consistency on the English training set.
  • Stage 2: fine-tuning with example consistency on the augmented training set, while regularizing model consistency with the model from Stage 1.

We recommend using both Stage 1 and Stage 2 for token-level tasks, such as sequence labeling and question answering. For text classification, Stage 1 alone is sufficient if the computation budget is limited.

bash ./scripts/train.sh [setting] [dataset] [model] [stage] [gpu] [data_dir] [output_dir]

where the options are described as follows:

  • [setting]: translate-train-all (uses input translations for languages other than English) or cross-lingual-transfer (uses only English training data for zero-shot cross-lingual transfer)
  • [dataset]: a dataset name from XTREME, i.e., xnli, panx, pawsx, udpos, mlqa, tydiqa, xquad
  • [model]: xlm-roberta-base or xlm-roberta-large
  • [stage]: 1 (first stage) or 2 (second stage)
  • [gpu]: sets the CUDA_VISIBLE_DEVICES environment variable
  • [data_dir]: folder of the training data
  • [output_dir]: folder for the fine-tuning output
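
For instance, a fully specified invocation could look like the following (the values of the last three arguments are hypothetical; the examples below omit them, in which case scripts/train.sh presumably falls back to its built-in defaults):

# stage 1 for NER on GPU 0, reading data from ./download/ and writing output to ./outputs/
bash ./scripts/train.sh translate-train-all panx xlm-roberta-large 1 0 ./download/ ./outputs/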

Examples: XTREME Tasks

XNLI fine-tuning on the English training set and the translated training sets (translate-train-all)

# run stage 1 of xTune
bash ./scripts/train.sh translate-train-all xnli xlm-roberta-base 1
# run stage 2 of xTune (optional)
bash ./scripts/train.sh translate-train-all xnli xlm-roberta-base 2

XNLI fine-tuning on the English training set only (cross-lingual-transfer)

# run stage 1 of xTune
bash ./scripts/train.sh cross-lingual-transfer xnli xlm-roberta-base 1
# run stage 2 of xTune (optional)
bash ./scripts/train.sh cross-lingual-transfer xnli xlm-roberta-base 2

Paper

Please cite our paper \cite{bo2021xtune} if you find the resources in this repository useful.

@inproceedings{bo2021xtune,
  author    = {Bo Zheng and Li Dong and Shaohan Huang and Wenhui Wang and Zewen Chi and Saksham Singhal and Wanxiang Che and Ting Liu and Xia Song and Furu Wei},
  booktitle = {Proceedings of ACL 2021},
  title     = {{Consistency Regularization for Cross-Lingual Fine-Tuning}},
  year      = {2021}
}

Reference

  1. https://github.com/google-research/xtreme
  2. https://www.amazon.com/clouddrive/share/d3KGCRCIYwhKJF0H3eWA26hjg2ZCRhjpEQtDL70FSBN?_encoding=UTF8&%2AVersion%2A=1&%2Aentries%2A=0&mgh=1
  3. https://console.cloud.google.com/storage/browser/xtreme_translations
  4. https://drive.google.com/drive/folders/1Rdbc0Us_4I5MpRCwLASxBwqSW8_dlF87?usp=sharing
  5. https://github.com/huggingface/transformers/
  6. https://github.com/facebookresearch/MUSE
  7. https://drive.google.com/drive/folders/1k9rQinwUXicglA5oyzo9xtgqiuUVDkjT?usp=sharing