Accelerating BERT Inference for Sequence Labeling via Early-Exit

Overview

Sequence-Labeling-Early-Exit

Code for ACL 2021 paper: Accelerating BERT Inference for Sequence Labeling via Early-Exit

Requirement:

Please refer to requirements.txt

How to run?

For ontonotes (CN):

you should claim your dataset path in paths.py, and then

For the first stage training:

python -u main.py --device 0  --seed 100 --fast_ptm_name bert --lr 5e-5  --use_crf 0 --dataset ontonotes_cn --fix_ptm_epoch 2 --warmup_step 3000 --use_fastnlp_bert 0 --sampler bucket  --after_bert linear --use_char 0 --use_bigram 0 --gradient_clip_norm_other 5 --gradient_clip_norm_bert 1 --train_mode joint --test_mode joint --if_save 1 --warmup_schedule inverse_square --epoch 20 --joint_weighted 1 --ptm_lr_rate 0.1 --cls_common_lr_scale 0

Then find the exp_path in the corresponding fitlog entry, and self-sampling further train the model.

For the self-sampling training:

python -u further_train.py --seed 100 --msg fuxian --if_save 1 --warmup_schedule inverse_square --epoch 30 --keep_norm_same 1 --sandwich_small 2 --sandwich_full 4 --max_t_level_t -0.5 --train_mode joint_sample_copy --further 0 --flooding 1 --flooding_bias 0 --lr 1e-4 --ptm_lr_rate 0.1 --fix_ptm_epoch 2 --min_win_size 5 --copy_wordpiece all --ckpt_epoch 7 --exp_path 05_11_22_20_52.210103 --device 2 --max_threshold 0.25 --max_threshold_2 0.5

Then find the exp_path and best epoch in the corresponding fitlog entry, and use it for early-exit inference as:

speed 2X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 15 --threshold 0.1 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
speed 3X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.15 --ckpt_epoch [ckpt_path] --exp_path [exp_path]
speed 4X:
python test.py --device 2 --further 1 --record_flops 1 --win_size 5 --threshold 0.25 --ckpt_epoch [ckpt_path] --exp_path [exp_path]


Other datasets' scripts coming soon

If you have any question, do not hesitate to ask it in issue. (English or Chinese both ok)

Owner
李孝男
a little bird
李孝男
Code for the ICML 2021 paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"

ViLT Code for the paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision" Install pip install -r requirements.txt pip

Wonjae Kim 922 Jan 01, 2023
Active window border replacement for window managers.

xborder Active window border replacement for window managers. Usage git clone https://github.com/deter0/xborder cd xborder chmod +x xborders ./xborder

deter 250 Dec 30, 2022
Using python and scikit-learn to make stock predictions

MachineLearningStocks in python: a starter project and guide EDIT as of Feb 2021: MachineLearningStocks is no longer actively maintained MachineLearni

Robert Martin 1.3k Dec 29, 2022
The pytorch implementation of DG-Font: Deformable Generative Networks for Unsupervised Font Generation

DG-Font: Deformable Generative Networks for Unsupervised Font Generation The source code for 'DG-Font: Deformable Generative Networks for Unsupervised

130 Dec 05, 2022
PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and Multi-Step Knowledge Distillation

PocketNet This is the official repository of the paper: PocketNet: Extreme Lightweight Face Recognition Network using Neural Architecture Search and M

Fadi Boutros 40 Dec 22, 2022
TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline.

TorchX is a library containing standard DSLs for authoring and running PyTorch related components for an E2E production ML pipeline

193 Dec 22, 2022
This repository contains the code and models necessary to replicate the results of paper: How to Robustify Black-Box ML Models? A Zeroth-Order Optimization Perspective

Black-Box-Defense This repository contains the code and models necessary to replicate the results of our recent paper: How to Robustify Black-Box ML M

OPTML Group 2 Oct 05, 2022
Hashformers is a framework for hashtag segmentation with transformers.

Hashtag segmentation is the task of automatically inserting the missing spaces between the words in a hashtag. Hashformers applies Transformer models

Ruan Chaves 41 Nov 09, 2022
The Adapter-Bot: All-In-One Controllable Conversational Model

The Adapter-Bot: All-In-One Controllable Conversational Model This is the implementation of the paper: The Adapter-Bot: All-In-One Controllable Conver

CAiRE 37 Nov 04, 2022
PyTorch Implement for Path Attention Graph Network

SPAGAN in PyTorch This is a PyTorch implementation of the paper "SPAGAN: Shortest Path Graph Attention Network" Prerequisites We prefer to create a ne

Yang Yiding 38 Dec 28, 2022
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
An implementation of "Learning human behaviors from motion capture by adversarial imitation"

Merel-MoCap-GAIL An implementation of Merel et al.'s paper on generative adversarial imitation learning (GAIL) using motion capture (MoCap) data: Lear

Yu-Wei Chao 34 Nov 12, 2022
Lane assist for ETS2, built with the ultra-fast-lane-detection model.

Euro-Truck-Simulator-2-Lane-Assist Lane assist for ETS2, built with the ultra-fast-lane-detection model. This project was made possible by the amazing

36 Jan 05, 2023
Alphabetical Letter Recognition

BayeesNetworks-Image-Classification Alphabetical Letter Recognition In these demo we are using "Bayees Networks" Our database is composed by Learning

Mohammed Firass 4 Nov 30, 2021
A set of tools for creating and testing machine learning features, with a scikit-learn compatible API

Feature Forge This library provides a set of tools that can be useful in many machine learning applications (classification, clustering, regression, e

Machinalis 380 Nov 05, 2022
The toolkit to generate auto labeled datasets

Ozeu Ozeu is the toolkit to autolabal dataset for instance segmentation. You can generate datasets labaled with segmentation mask and bounding box fro

Xiong Jie 28 Mar 28, 2022
Scalable Graph Neural Networks for Heterogeneous Graphs

Neighbor Averaging over Relation Subgraphs (NARS) NARS is an algorithm for node classification on heterogeneous graphs, based on scalable neighbor ave

Facebook Research 67 Dec 03, 2022
A baseline code for VSPW

A baseline code for VSPW Preparation Download VSPW dataset The VSPW dataset with extracted frames and masks is available here.

28 Aug 22, 2022
GAT - Graph Attention Network (PyTorch) 💻 + graphs + 📣 = ❤️

GAT - Graph Attention Network (PyTorch) 💻 + graphs + 📣 = ❤️ This repo contains a PyTorch implementation of the original GAT paper ( 🔗 Veličković et

Aleksa Gordić 1.9k Jan 09, 2023
Reinforcement learning library in JAX.

Reinforcement learning library in JAX.

Yicheng Luo 96 Oct 30, 2022