DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference

Related tags

Deep LearningDeeBERT
Overview

DeeBERT

This is the code base for the paper DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference.

Code in this repository is also available in the Huggingface Transformer repo (with minor modification for version compatibility). Check this page for models that we have trained in advance (the latest version of Huggingface Transformers Library is needed).

Installation

This repo is tested on Python 3.7.5, PyTorch 1.3.1, and Cuda 10.1. Using a virtulaenv or conda environemnt is recommended, for example:

conda install pytorch==1.3.1 torchvision cudatoolkit=10.1 -c pytorch

After installing the required environment, clone this repo, and install the following requirements:

git clone https://github.com/castorini/deebert
cd deebert
pip install -r ./requirements.txt
pip install -r ./examples/requirements.txt

Usage

There are four scripts in the scripts folder, which can be run from the repo root, e.g., scripts/train.sh.

In each script, there are several things to modify before running:

  • path to the GLUE dataset. Check this for more details.
  • path for saving fine-tuned models. Default: ./saved_models.
  • path for saving evaluation results. Default: ./plotting. Results are printed to stdout and also saved to npy files in this directory to facilitate plotting figures and further analyses.
  • model_type (bert or roberta)
  • model_size (base or large)
  • dataset (SST-2, MRPC, RTE, QNLI, QQP, or MNLI)

train.sh

This is for fine-tuning and evaluating models as in the original BERT paper.

train_highway.sh

This is for fine-tuning DeeBERT models.

eval_highway.sh

This is for evaluating each exit layer for fine-tuned DeeBERT models.

eval_entropy.sh

This is for evaluating fine-tuned DeeBERT models, given a number of different early exit entropy thresholds.

Citation

Please cite our paper if you find the repository useful:

@inproceedings{xin-etal-2020-deebert,
    title = "{D}ee{BERT}: Dynamic Early Exiting for Accelerating {BERT} Inference",
    author = "Xin, Ji  and
      Tang, Raphael  and
      Lee, Jaejun  and
      Yu, Yaoliang  and
      Lin, Jimmy",
    booktitle = "Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics",
    month = jul,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/2020.acl-main.204",
    pages = "2246--2251",
}
Owner
Castorini
Deep learning for natural language processing and information retrieval at the University of Waterloo
Castorini
Predicting the duration of arrival delays for commercial flights.

Flight Delay Prediction Our objective is to predict arrival delays of commercial flights. According to the US Department of Transportation, about 21%

Jordan Silke 1 Jan 11, 2022
Posterior predictive distributions quantify uncertainties ignored by point estimates.

Posterior predictive distributions quantify uncertainties ignored by point estimates.

DeepMind 177 Dec 06, 2022
The King is Naked: on the Notion of Robustness for Natural Language Processing

the-king-is-naked: on the notion of robustness for natural language processing AAAI2022 DISCLAIMER:This repo will be updated soon with instructions on

Iperboreo_ 1 Nov 24, 2022
Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Official pytorch implementation of "Scaling-up Disentanglement for Image Translation", ICCV 2021.

Aviv Gabbay 41 Nov 29, 2022
Official implementation of NeurIPS 2021 paper "One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective"

Official implementation of NeurIPS 2021 paper "One Loss for All: Deep Hashing with a Single Cosine Similarity based Learning Objective"

Ng Kam Woh 71 Dec 22, 2022
Instance Segmentation by Jointly Optimizing Spatial Embeddings and Clustering Bandwidth

Instance segmentation by jointly optimizing spatial embeddings and clustering bandwidth This codebase implements the loss function described in: Insta

209 Dec 07, 2022
Where2Act: From Pixels to Actions for Articulated 3D Objects

Where2Act: From Pixels to Actions for Articulated 3D Objects The Proposed Where2Act Task. Given as input an articulated 3D object, we learn to propose

Kaichun Mo 69 Nov 28, 2022
Based on Stockfish neural network(similar to LcZero)

MarcoEngine Marco Engine - interesnaya neyronnaya shakhmatnaya set', kotoraya ispol'zuyet metod samoobucheniya(dostizheniye khoroshoy igy putem proboy

Marcus Kemaul 4 Mar 12, 2022
Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation

Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation Requirements This repository needs mmsegmentation Training To train

Adelaide Intelligent Machines (AIM) Group 7 Sep 12, 2022
Multi-label classification of retinal disorders

Multi-label classification of retinal disorders This is a deep learning course project. The goal is to develop a solution, using computer vision techn

Sundeep Bhimireddy 1 Jan 29, 2022
A collection of resources, problems, explanations and concepts that are/were important during my Data Science journey

Data Science Gurukul List of resources, interview questions, concepts I use for my Data Science work. Topics: Basics of Programming with Python + Unde

Smaranjit Ghose 10 Oct 25, 2022
A complete end-to-end demonstration in which we collect training data in Unity and use that data to train a deep neural network to predict the pose of a cube. This model is then deployed in a simulated robotic pick-and-place task.

Object Pose Estimation Demo This tutorial will go through the steps necessary to perform pose estimation with a UR3 robotic arm in Unity. You’ll gain

Unity Technologies 187 Dec 24, 2022
The code is an implementation of Feedback Convolutional Neural Network for Visual Localization and Segmentation.

Feedback Convolutional Neural Network for Visual Localization and Segmentation The code is an implementation of Feedback Convolutional Neural Network

19 Dec 04, 2022
A machine learning package for streaming data in Python. The other ancestor of River.

scikit-multiflow is a machine learning package for streaming data in Python. creme and scikit-multiflow are merging into a new project called River. W

670 Dec 30, 2022
A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation".

Dual-Contrastive-Learning A PyTorch implementation for our paper "Dual Contrastive Learning: Text Classification via Label-Aware Data Augmentation". Y

hoshi-hiyouga 85 Dec 26, 2022
[2021 MultiMedia] CONQUER: Contextual Query-aware Ranking for Video Corpus Moment Retrieval

CONQUER: Contexutal Query-aware Ranking for Video Corpus Moment Retreival PyTorch implementation of CONQUER: Contexutal Query-aware Ranking for Video

Hou zhijian 23 Dec 26, 2022
TensorFlow implementation of "Variational Inference with Normalizing Flows"

[TensorFlow 2] Variational Inference with Normalizing Flows TensorFlow implementation of "Variational Inference with Normalizing Flows" [1] Concept Co

YeongHyeon Park 7 Jun 08, 2022
[CVPR'22] COAP: Learning Compositional Occupancy of People

COAP: Compositional Articulated Occupancy of People Paper | Video | Project Page This is the official implementation of the CVPR 2022 paper COAP: Lear

Marko Mihajlovic 111 Dec 11, 2022
An implementation of MobileFormer

MobileFormer An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al. Including [1] Mobile-Former proposed in:

slwang9353 62 Dec 28, 2022
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge This is an implementation of the paper,

Mutian He 19 Oct 14, 2022