DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

Last update: Nov 14, 2022

DeeBERT

This is the code base for the paper DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference.

The code in this repository is also available in the Hugging Face Transformers repo (with minor modifications for version compatibility). Check this page for models that we have trained in advance (the latest version of the Hugging Face Transformers library is needed).

Installation

This repo is tested on Python 3.7.5, PyTorch 1.3.1, and CUDA 10.1. Using a virtualenv or conda environment is recommended, for example:

conda install pytorch==1.3.1 torchvision cudatoolkit=10.1 -c pytorch

After installing the required environment, clone this repo, and install the following requirements:

git clone https://github.com/castorini/deebert
cd deebert
pip install -r ./requirements.txt
pip install -r ./examples/requirements.txt

Usage

There are four scripts in the scripts folder, which can be run from the repo root, e.g., scripts/train.sh.

In each script, there are several things to modify before running:

path to the GLUE dataset. Check this for more details.
path for saving fine-tuned models. Default: ./saved_models.
path for saving evaluation results. Default: ./plotting. Results are printed to stdout and also saved to npy files in this directory to facilitate plotting figures and further analyses.
model_type (bert or roberta)
model_size (base or large)
dataset (SST-2, MRPC, RTE, QNLI, QQP, or MNLI)
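As a rough illustration of the settings listed above, the following sketch collects them in one place and rejects values outside the documented choices. The `RunConfig` class and its field names are hypothetical and are not part of the repository; the scripts themselves are shell scripts that are edited directly. Only the allowed values and the two default paths come from the list above.

```python
from dataclasses import dataclass

# Allowed values copied from the list of settings above.
MODEL_TYPES = ("bert", "roberta")
MODEL_SIZES = ("base", "large")
DATASETS = ("SST-2", "MRPC", "RTE", "QNLI", "QQP", "MNLI")


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the values edited in each script."""

    glue_dir: str                       # path to the GLUE dataset
    model_type: str                     # "bert" or "roberta"
    model_size: str                     # "base" or "large"
    dataset: str                        # one of DATASETS
    model_dir: str = "./saved_models"   # default from the README
    results_dir: str = "./plotting"     # default from the README

    def __post_init__(self) -> None:
        # Fail early on typos instead of partway through a long run.
        if self.model_type not in MODEL_TYPES:
            raise ValueError(f"model_type must be one of {MODEL_TYPES}")
        if self.model_size not in MODEL_SIZES:
            raise ValueError(f"model_size must be one of {MODEL_SIZES}")
        if self.dataset not in DATASETS:
            raise ValueError(f"dataset must be one of {DATASETS}")


if __name__ == "__main__":
    # The GLUE path below is a placeholder, not a path used by the repo.
    print(RunConfig("/path/to/glue", "bert", "base", "MRPC"))
```

A check like this is only a convenience for catching misspelled dataset or model names before launching a script.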

train.sh

This is for fine-tuning and evaluating models as in the original BERT paper.

train_highway.sh

This is for fine-tuning DeeBERT models.

eval_highway.sh

This is for evaluating each exit layer for fine-tuned DeeBERT models.
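To make the idea of evaluating each exit layer concrete, here is a minimal sketch that scores every exit layer separately, as if all examples were forced to exit there. It assumes accuracy as the metric and assumes the per-layer predictions are already available as plain lists; the function name `per_layer_accuracy` and the toy data are illustrative and do not reflect the script's actual output format.

```python
from typing import List, Sequence


def per_layer_accuracy(
    layer_predictions: Sequence[Sequence[int]], labels: Sequence[int]
) -> List[float]:
    """Accuracy of each exit layer when every example exits at that layer.

    layer_predictions[i][j] is the label predicted at exit layer i
    for example j; labels[j] is the gold label of example j.
    """
    if len(labels) == 0:
        raise ValueError("labels must not be empty")
    accuracies = []
    for preds in layer_predictions:
        if len(preds) != len(labels):
            raise ValueError("each layer needs one prediction per example")
        correct = sum(int(p == y) for p, y in zip(preds, labels))
        accuracies.append(correct / len(labels))
    return accuracies


if __name__ == "__main__":
    # Toy data: three exit layers, four examples.
    gold = [1, 0, 1, 1]
    preds = [[0, 0, 1, 0], [1, 0, 1, 0], [1, 0, 1, 1]]
    for layer, acc in enumerate(per_layer_accuracy(preds, gold), start=1):
        print(f"exit layer {layer}: accuracy {acc:.2f}")
```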

eval_entropy.sh

This is for evaluating fine-tuned DeeBERT models, given a number of different early exit entropy thresholds.
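The role of an entropy threshold can be sketched as follows: at each exit layer the classifier's output distribution is computed, and inference stops at the first layer whose entropy falls below the threshold. This is a simplified, dependency-free illustration under that assumption, not the repository's implementation; the function names and the toy logits are made up for the example.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (natural log) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit(
    per_layer_logits: Sequence[Sequence[float]], threshold: float
) -> Tuple[int, int]:
    """Return (exit_layer, predicted_class), with layers counted from 1.

    Exits at the first layer whose output entropy is below `threshold`;
    the final layer always produces an answer.
    """
    num_layers = len(per_layer_logits)
    for layer, logits in enumerate(per_layer_logits, start=1):
        probs = softmax(logits)
        if entropy(probs) < threshold or layer == num_layers:
            predicted = max(range(len(probs)), key=probs.__getitem__)
            return layer, predicted
    raise ValueError("per_layer_logits must not be empty")


if __name__ == "__main__":
    # Toy logits for one example: later layers are more confident.
    toy_logits = [[0.1, 0.0], [2.0, 0.0], [5.0, 0.0]]
    for threshold in (0.0, 0.1, 0.5, 0.7):
        layer, label = early_exit(toy_logits, threshold)
        print(f"threshold {threshold}: exit at layer {layer}, class {label}")
```

A larger threshold lets examples exit earlier, which saves computation at the possible cost of accuracy; a threshold of 0 sends every example through all layers.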

Citation

Please cite our paper if you find the repository useful:

@inproceedings{xin-etal-2020-deebert,
    title = "{D}ee{BERT}: Dynamic Early Exiting for Accelerating {BERT} Inference",
    author = "Xin, Ji  and
      Tang, Raphael  and
      Lee, Jaejun  and
      Yu, Yaoliang  and
      Lin, Jimmy",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.204",
    pages = "2246--2251",
}

Owner

Castorini
