FastFormers - highly efficient transformer models for NLU

Last update: Jan 05, 2023

Overview

FastFormers

FastFormers provides a set of recipes and methods to achieve highly efficient inference of Transformer models for Natural Language Understanding (NLU), including demo models showing a 233.87x speed-up (yes, 233x on CPU with the multi-head self-attentive Transformer architecture; this is not an LSTM or an RNN). The methods and analyses are described in detail in the paper FastFormers: Highly Efficient Transformer Models for Natural Language Understanding.

Notes

(June 3, 2021) The public onnxruntime (v1.8.0) now supports all FastFormers models. Special thanks to @yufenglee and onnxruntime team.
(Nov. 4, 2020) We are actively working with the Hugging Face and onnxruntime teams so that you can use these features out of the box with Hugging Face's transformers and onnxruntime. Please stay tuned.
With this repository, you can replicate the results presented in the FastFormers paper.
The demo models of FastFormers are implemented with the SuperGLUE benchmark. The data processing pipeline is based on Alex Wang's reference implementation for SustaiNLP, which is a fork of HuggingFace's transformers repository.
This repository is built on top of several open source projects, including transformers from HuggingFace, onnxruntime, transformers from Alex Wang, FBGEMM, and TinyBERT.

Requirements

FastFormers currently only supports Linux operating systems.
CPU requirements:
- CPUs that support AVX2, AVX512, or both instruction sets are required. To get the full speed improvements and accuracy, the AVX512 instruction set is required. We have tested our runtime on Intel CPUs.
GPU requirements:
- To utilize 16-bit floating point speed-up, GPUs with Volta or later architectures are required.
onnxruntime v1.8.0+ is required to run FastFormers models.
This repository is a branch of transformers, so you need to uninstall any pre-existing transformers package from your Python environment.
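
To check the CPU requirement above before installing, a minimal sketch along these lines may help. It assumes an x86 Linux machine where /proc/cpuinfo lists CPU features on lines beginning with "flags", and it treats the avx512f flag (the AVX512 foundation subset) as a stand-in for "AVX512 support". The function name simd_support is hypothetical and is not part of FastFormers.

```python
from pathlib import Path


def simd_support(cpuinfo_text):
    """Report AVX2 / AVX512 support from the text of /proc/cpuinfo."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux the feature list sits on lines such as "flags : fpu sse ...".
        if line.startswith("flags") and ":" in line:
            flags.update(line.split(":", 1)[1].split())
    return {
        "avx2": "avx2" in flags,
        # Assumption: avx512f (foundation) is used as the marker for AVX512.
        "avx512": "avx512f" in flags,
    }


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(simd_support(cpuinfo.read_text()))
```

If avx2 is reported but avx512 is not, the models should still run, but according to the requirement above you would not get the full speed improvements and accuracy.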

Installation

This repo is tested on Python 3.6 and 3.7, with PyTorch 1.5.0+.

You need to uninstall any pre-existing transformers package, as this repository uses a customized version of it.

You need to install PyTorch 1.5.0+ and onnxruntime 1.8.0+. To install onnxruntime and this repository, execute the following bash commands.

pip install onnxruntime==1.8.0 --user --upgrade --no-deps --force-reinstall
pip uninstall transformers -y
git clone https://github.com/microsoft/fastformers
cd fastformers
pip install .
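
After installation, it can be worth confirming that the onnxruntime version that is actually imported satisfies the 1.8.0+ requirement, since an older copy elsewhere on the Python path can shadow the new one. The following is a small sketch, not part of the repository; the helper names version_tuple and meets_minimum are hypothetical, and only the major.minor.patch numbers are compared.

```python
import re


def version_tuple(version):
    """Turn '1.8.0' or '1.7.0+cpu' into a 3-tuple of ints such as (1, 8, 0)."""
    parts = []
    for piece in version.split("+")[0].split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    # Pad so that '1.8' compares equal to '1.8.0'.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts[:3])


def meets_minimum(installed, minimum):
    """True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    try:
        import onnxruntime
        print("onnxruntime", onnxruntime.__version__,
              "OK" if meets_minimum(onnxruntime.__version__, "1.8.0") else "too old")
    except ImportError as err:
        print("onnxruntime is missing or broken:", err)
```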

Run the demo systems

All the models used to benchmark Table 3 in the paper are publicly shared. You can use the commands below to reproduce the results. The Table 3 measurements were taken on an Azure F16s_v2 instance.

The installation step needs to be done before proceeding.

Download the SuperGLUE dataset and decompress it.
Download the demo model files and decompress them.

wget https://github.com/microsoft/fastformers/releases/download/v0.1-model/teacher-bert-base.tar.gz
wget https://github.com/microsoft/fastformers/releases/download/v0.2-model/student-4L-312.tar.gz
wget https://github.com/microsoft/fastformers/releases/download/v0.2-model/student-pruned-8h-600.tar.gz
wget https://github.com/microsoft/fastformers/releases/download/v0.2-model/student-pruned-9h-900.tar.gz
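
The downloaded files still need to be decompressed. As one possible way to do this from Python instead of the tar command, the sketch below extracts every .tar.gz archive in a download directory. The function name extract_archives and the directory names are hypothetical; the internal layout of the archives is not assumed.

```python
import tarfile
from pathlib import Path


def extract_archives(download_dir, target_dir):
    """Extract every *.tar.gz in download_dir into target_dir.

    Returns the names of the archives that were extracted.
    Only use this on archives you trust, since extractall writes
    whatever paths the archive contains.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    for archive in sorted(Path(download_dir).glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(target)
        extracted.append(archive.name)
    return extracted
```

For example, extract_archives(".", "models") would unpack the four downloaded archives into a models directory, whose sub-directories can then be used as ${teacher_model}, ${student_model} and ${pruned_student_model} in the commands that follow.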

Run the teacher model (BERT-base) baseline

python3 examples/fastformers/run_superglue.py \
        --model_type bert --model_name_or_path ${teacher_model} \
        --task_name BoolQ --output_dir ${out_dir} --do_eval  \
        --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
        --use_fixed_seq_length --do_lower_case --max_seq_length 512 \
        --no_cuda

Run the teacher model (BERT-base) with dynamic sequence length

python3 examples/fastformers/run_superglue.py \
        --model_type bert --model_name_or_path ${teacher_model} \
        --task_name BoolQ --output_dir ${out_dir} --do_eval  \
        --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
        --do_lower_case --max_seq_length 512 --no_cuda
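
The only difference between the two teacher commands above is the --use_fixed_seq_length flag. A rough sketch of the idea, under the assumption that the fixed setting pads every input to --max_seq_length while the dynamic setting keeps each input at its own length, is shown here. The function pad_input is hypothetical and is not the script's actual preprocessing code.

```python
def pad_input(token_ids, max_seq_length, use_fixed_seq_length, pad_id=0):
    """Return (input_ids, attention_mask) for a single example (batch size 1)."""
    ids = list(token_ids)[:max_seq_length]  # truncate over-long inputs
    # Fixed: always pad to max_seq_length. Dynamic: keep the real length.
    target = max_seq_length if use_fixed_seq_length else len(ids)
    padding = target - len(ids)
    return ids + [pad_id] * padding, [1] * len(ids) + [0] * padding


if __name__ == "__main__":
    example = [101, 2054, 2003, 102]
    fixed_ids, _ = pad_input(example, 512, use_fixed_seq_length=True)
    dynamic_ids, _ = pad_input(example, 512, use_fixed_seq_length=False)
    print(len(fixed_ids), len(dynamic_ids))
```

Because self-attention cost grows with the square of the sequence length, processing only the real tokens of a short input is much cheaper than processing 512 positions that are mostly padding.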

Run the distilled student model (PyTorch)

python3 examples/fastformers/run_superglue.py \
        --model_type bert --model_name_or_path ${student_model} \
        --task_name BoolQ --output_dir ${out_dir} --do_eval  \
        --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
        --do_lower_case --max_seq_length 512 --no_cuda

Run the distilled student with 8-bit quantization (onnxruntime)

python3 examples/fastformers/run_superglue.py \
        --model_type bert --model_name_or_path ${student_model} \
        --task_name BoolQ --output_dir ${out_dir} --do_eval \
        --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
        --do_lower_case --max_seq_length 512 --use_onnxrt --no_cuda

Run the distilled student with 8-bit quantization + multi-instance inference (onnxruntime)

OMP_NUM_THREADS=1 python3 examples/fastformers/run_superglue.py \
                          --model_type bert \
                          --model_name_or_path ${student_model} \
                          --task_name BoolQ --output_dir ${out_dir} --do_eval \
                          --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
                          --do_lower_case --max_seq_length 512 --use_onnxrt \
                          --threads_per_instance 1 --no_cuda

Run the distilled + pruned student with 8-bit quantization + multi-instance inference (onnxruntime)

OMP_NUM_THREADS=1 python3 examples/fastformers/run_superglue.py \
                          --model_type bert \
                          --model_name_or_path ${pruned_student_model} \
                          --task_name BoolQ --output_dir ${out_dir} --do_eval \
                          --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
                          --do_lower_case --max_seq_length 512 --use_onnxrt \
                          --threads_per_instance 1 --no_cuda
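
The two multi-instance commands above set OMP_NUM_THREADS=1 and --threads_per_instance 1. The general idea of multi-instance inference is to run several copies of the model side by side, each limited to a few threads, instead of one copy that uses all cores. How the script actually schedules its instances is not described here, so the sketch below is only an illustration under the assumption that the number of instances is the core count divided by the threads per instance and that examples are split evenly; both function names are hypothetical.

```python
def instances_for(num_cores, threads_per_instance):
    """How many model instances fit if each one uses threads_per_instance cores."""
    return max(1, num_cores // threads_per_instance)


def shard_examples(examples, num_instances):
    """Round-robin split of the evaluation examples across the instances."""
    return [examples[i::num_instances] for i in range(num_instances)]


if __name__ == "__main__":
    n = instances_for(num_cores=16, threads_per_instance=1)
    shards = shard_examples(list(range(100)), n)
    print(n, [len(s) for s in shards])
```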

How to create FastFormers

Training models

This is used for fine-tuning pretrained or general distilled (task-agnostic distillation) models on downstream tasks. Currently, BERT and RoBERTa models are supported.

Tip 1. This repository is based on transformers, so you can use huggingface's pre-trained models. (e.g. set distilroberta-base for --model_name_or_path to use distilroberta-base)

Tip 2. Before fine-tuning models, you can change the activation functions to ReLU to get better inference speed. To do this, download the config file of your model and manually change the activation to relu (the hidden_act field in the case of BERT and RoBERTa models). Then specify the config file by adding the --config_name parameter.

Tip 3. Depending on the task and the models used, you can add --do_lower_case if it gives better accuracy.

python3 examples/fastformers/run_superglue.py \
        --data_dir ${data_dir} --task_name ${task} \
        --output_dir ${out_dir} --model_type ${model_type} \
        --model_name_or_path ${model} \
        --use_gpuid ${gpuid} --seed ${seed} \
        --do_train --max_seq_length ${seq_len_train} \
        --do_eval --eval_and_save_steps ${eval_freq} --save_only_best \
        --learning_rate 0.00001 \
        --warmup_ratio 0.06 --weight_decay 0.01 \
        --per_gpu_train_batch_size 4 \
        --gradient_accumulation_steps 1 \
        --logging_steps 100 --num_train_epochs 10 \
        --overwrite_output_dir --per_instance_eval_batch_size 8
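
The manual config change described in Tip 2 can also be scripted. The sketch below assumes the model's config is a JSON file with a hidden_act field, as is the case for BERT and RoBERTa configs; the function name and the file names are hypothetical.

```python
import json
from pathlib import Path


def set_relu_activation(config_path, output_path):
    """Write a copy of the config with hidden_act set to 'relu'.

    Returns the activation that was configured before the change.
    """
    config = json.loads(Path(config_path).read_text())
    previous = config.get("hidden_act")
    config["hidden_act"] = "relu"
    Path(output_path).write_text(json.dumps(config, indent=2))
    return previous
```

The path of the written file can then be passed to the training command with --config_name.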

Distilling models

This is used for distilling fine-tuned teacher models into smaller student models (task-specific distillation) on the downstream tasks. As described in the paper, it is critical to initialize student models with general distilled models such as distilbert-, distilroberta-base and TinyBERT.

This command is also used to distill non-pruned models into pruned models.

This command always uses a task-specific logit loss between the teacher and student models for student training. You can add additional losses for hidden states (including token embeddings) and attentions between teacher and student. To use hidden-state and attention distillation, the number of teacher layers must be a multiple of the number of student layers.

python3 examples/fastformers/run_superglue.py \
        --data_dir ${data_dir} --task_name ${task} \
        --output_dir ${out_dir} --teacher_model_type ${teacher_model_type} \
        --teacher_model_name_or_path ${teacher_model} \
        --model_type ${student_model_type} --model_name_or_path ${student_model} \
        --use_gpuid ${gpuid} --seed ${seed} \
        --do_train --max_seq_length ${seq_len_train} \
        --do_eval --eval_and_save_steps ${eval_freq} --save_only_best \
        --learning_rate 0.00001 \
        --warmup_ratio 0.06 --weight_decay 0.01 \
        --per_gpu_train_batch_size 4 \
        --gradient_accumulation_steps 1 \
        --logging_steps 100 --num_train_epochs 10 \
        --overwrite_output_dir --per_instance_eval_batch_size 8 \
        --state_loss_ratio 0.1
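
The requirement that the number of teacher layers be a multiple of the number of student layers makes sense once each student layer has to be paired with one teacher layer. The sketch below shows one common pairing, a uniform mapping in the style of TinyBERT where each student layer is matched with the last teacher layer of its block. This is an assumption for illustration; the exact pairing used by this repository is not stated here, and the function name is hypothetical.

```python
def map_student_to_teacher_layers(num_teacher_layers, num_student_layers):
    """Pair each student layer (0-based) with a teacher layer index (0-based)."""
    if num_teacher_layers % num_student_layers != 0:
        raise ValueError(
            "teacher layers (%d) must be a multiple of student layers (%d)"
            % (num_teacher_layers, num_student_layers)
        )
    stride = num_teacher_layers // num_student_layers
    # Student layer i is matched with the last teacher layer of block i.
    return [(i + 1) * stride - 1 for i in range(num_student_layers)]


if __name__ == "__main__":
    # A 12-layer teacher (BERT-base) and a 4-layer student.
    print(map_student_to_teacher_layers(12, 4))
```

With a 12-layer teacher, a 4-layer or 6-layer student satisfies the constraint, while a 5-layer student does not.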

Pruning models

This command performs structured pruning on the models described in the paper. It reduces the number of heads and the intermediate hidden states of FFN as set in the options. When the pruning is done on GPU, only 1 GPU is utilized (no multi-GPU).

To get better accuracy, you can do another round of knowledge distillation after the pruning.

python3 examples/fastformers/run_superglue.py \
        --data_dir ${data_dir} --task_name ${task} \
        --output_dir ${out_dir} --model_type ${model_type} \
        --model_name_or_path ${model} --do_eval \
        --do_prune --max_seq_length ${seq_len_train} \
        --per_instance_eval_batch_size 1 \
        --target_num_heads 8 --target_ffn_dim 600
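
To see what --target_num_heads and --target_ffn_dim do to the model, the following sketch computes the weight shapes of one Transformer layer after structured pruning. It assumes the PyTorch convention of storing linear weights as [out_features, in_features] and that pruning removes whole heads without changing the per-head size; the function name and the dictionary keys are hypothetical labels, not parameter names from the repository.

```python
def pruned_linear_shapes(hidden_size, num_heads, target_num_heads, target_ffn_dim):
    """Weight shapes of one Transformer layer after head and FFN pruning."""
    head_dim = hidden_size // num_heads      # size of one attention head
    kept = target_num_heads * head_dim       # total width of the remaining heads
    return {
        "attention query/key/value weight": (kept, hidden_size),
        "attention output weight": (hidden_size, kept),
        "ffn intermediate weight": (target_ffn_dim, hidden_size),
        "ffn output weight": (hidden_size, target_ffn_dim),
    }


if __name__ == "__main__":
    # BERT-base sizes: hidden size 768 and 12 heads, pruned as in the command above.
    for name, shape in pruned_linear_shapes(768, 12, 8, 600).items():
        print(name, shape)
```

For BERT-base pruned to 8 heads, the query, key and value weights shrink from [768, 768] to [512, 768]. A pruned checkpoint therefore no longer matches the shapes that an unmodified BERT implementation derives from hidden_size and num_attention_heads alone.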

Optimizing models on CPU (8-bit integer quantization + onnxruntime)

This command converts your PyTorch transformers models into an optimized ONNX format with 8-bit quantization. The converted ONNX model is saved in the directory where the original PyTorch model is located.

python3 examples/fastformers/run_superglue.py \
        --task_name ${task} \
        --model_type ${model_type} \
        --model_name_or_path ${model} \
        --convert_onnx
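
As background on what 8-bit integer quantization means, here is a generic illustration of symmetric linear quantization in plain Python. It is not the scheme onnxruntime applies during --convert_onnx, which may differ in details such as zero points and granularity; it only shows how floating-point weights can be stored as small integers plus one scale factor, and how large the rounding error can get.

```python
def quantize_int8(values):
    """Symmetric linear quantization of a list of floats to the range [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map the stored integers back to approximate float values."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    q, scale = quantize_int8(weights)
    restored = dequantize(q, scale)
    # The reconstruction error of each value is at most half a quantization step.
    print(q, scale, max(abs(a - b) for a, b in zip(weights, restored)))
```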

Optimizing models on GPU (16-bit floating point conversion)

This command converts your PyTorch transformers models into a 16-bit floating point model (PyTorch). It creates a new directory named fp16 in the directory where the original model is located. The converted fp16 model and all necessary files are then saved to that directory.

python3 examples/fastformers/run_superglue.py \
        --task_name ${task} \
        --model_type ${model_type} \
        --model_name_or_path ${model} \
        --convert_fp16
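
To get a feel for the precision trade-off of a 16-bit floating point model, the sketch below rounds Python floats to IEEE 754 half precision using only the standard library. It illustrates the number format, not the conversion code behind --convert_fp16; the function name to_fp16 is hypothetical.

```python
import struct


def to_fp16(value):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


if __name__ == "__main__":
    for x in (1.0, 0.1, 1234.5678):
        print(x, "->", to_fp16(x))
```

Half precision keeps roughly three significant decimal digits and its largest finite value is 65504, which is usually sufficient for inference but is why accuracy should be re-checked after conversion.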

Evaluating models

This command evaluates various models with the PyTorch or onnxruntime engine on the given tasks. For more detailed usage, please refer to the demo section.

OMP_NUM_THREADS=1 python3 examples/fastformers/run_superglue.py \
                          --model_type bert \
                          --model_name_or_path ${pruned_student_model} \
                          --task_name BoolQ --output_dir ${out_dir} --do_eval \
                          --data_dir ${data_dir} --per_instance_eval_batch_size 1 \
                          --do_lower_case --max_seq_length 512 --use_onnxrt \
                          --threads_per_instance 1 --no_cuda

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct.

License

This project is licensed under the MIT License.

Comments

run the distilled student with 8-bit quantization (onnxruntime)
🐛 Bug

Information

Model I am using (Bert, XLNet ...): student-4L-312

Language I am using the model on (English, Chinese ...):English

The problem arises when using:

[x] the official example scripts: (give details below)

[ ] my own modified scripts: (give details below)

The tasks I am working on is:

[x] an official GLUE/SQUaD task: (give the name)

[ ] my own task or dataset: (give details below)

To reproduce

Steps to reproduce the behavior:

run the command below

python examples/fastformers/run_superglue.py --model_type bert --model_name_or_path ../model/fastformers/student_model/student-4L-312 --task_name BoolQ --output_dir ./out --do_eval --data_dir ../dataset/fastformers/BoolQ --per_instance_eval_batch_size 1 --do_lower_case --max_seq_length 512 --use_onnxrt --no_cuda

error message:

Traceback (most recent call last):
  File "examples/fastformers/run_superglue.py", line 1901, in <module>
    main()
  File "examples/fastformers/run_superglue.py", line 1840, in main
    from onnxruntime import ExecutionMode, InferenceSession, SessionOptions
  File "/home/username/.local/lib/python3.6/site-packages/onnxruntime/__init__.py", line 13, in <module>
    from onnxruntime.capi._pybind_state import get_all_providers, get_available_providers, get_device, set_seed, \
ImportError: cannot import name 'get_all_providers'

Expected behavior

Environment info

transformers version: transformers (2.11.0)

Platform: ubuntu16.04

Python version: 3.6

PyTorch version (GPU?): torch (1.7.0+cpu)

Tensorflow version (GPU?): null

Using GPU in script?: No

Using distributed or parallel set-up in script?: No

onnxruntime (1.4.0)
opened by baiziyuandyufei 6
Fastformers/Transformers question

Hello!

I was reading your paper and was looking in the HF repo and found https://github.com/huggingface/transformers/issues/8083 where it appeared that you were discussing adding your functionality to their library, however that never happened, so I am curious if you discovered something that prohibited adding this functionality. Thanks!

opened by jmwoloso 4
trying to train roberta-large model for question-answering task

❓ Questions & Help

I am trying to convert the Roberta-large model to Fastformers. I am facing this issue with the data files after preprocessing:

Details

runcate_sequences assert len(ids) > num_tokens_to_remove AssertionError

What led me to this error?

opened by skiran252 4
pruned error

The model is bert_base. When I prune it with num_heads set to 8, the shape of the key (value) weight changes from [768, 768] before pruning to [512, 768] after pruning,

so the saved weights can't be loaded by transformers.

opened by renmada 3
Integrate ZORB as an opt-in optimization

Fastformers is already impressive as is, but a new paper has just been released: ZORB. It aims to become an alternative to... backpropagation (you read that right).

Such a breakthrough allows ~300X additional speedup to fastformers (even more in theory) while only diminishing accuracy by a few percent in many cases, and apparently even outperforming BP with a lower error count? Anyway, it's a huge performance breakthrough and should be popularized by fastformers and huggingface.

https://paperswithcode.com/paper/zorb-a-derivative-free-backpropagation
Notes: Extensive testing of the variable accuracy loss would be welcome. Some activation functions need to be adapted? (e.g. Mish?)

opened by LifeIsStrange 3
Optimize fine-tuned model from HuggingFace

How to optimize an already fine-tuned model from Hugging Face?

Congratulations on the work, it looks amazing 😊

Details

If there is an already fine-tuned model from Hugging Face for, let's say, generating question-answer pairs such as valhalla/t5-base-qa-qg-hl, how could it be further optimized for inference using your method? I'm a bit lost

Thank you in advance!

opened by ugm2 3
AMD CPUs should work just fine

The README mentions that Intel CPUs are required because of the necessity of AVX 256 support.

First of all, AMD CPUs have supported AVX 256 for a long time (since Jaguar, which predates Zen). True AVX 256 support (not only being compatible but being twice as fast as AVX 128) came one year ago with Zen 2 CPUs.

Zen 3 CPUs are now being released and are the fastest CPUs in the world, and any deep learning researcher should have them and be compatible with them, period.

opened by LifeIsStrange 3
integrate with Lightning ecosystem CI
Hello and so happy to see you use Pytorch-Lightning! 🎉

Just wondering if you already heard about quite the new Pytorch Lightning (PL) ecosystem CI where we would like to invite you to... You can check out our blog post about it: Stay Ahead of Breaking Changes with the New Lightning Ecosystem CI ⚡

As you use PL framework for your cool project, we would like to enhance your experience and offer you safe updates to our future releases. At this moment, you run tests with a particular PL version, but it may accidentally happen that the next version will be incompatible with your project... 😕 We do not intend to change anything on our project side, but still here we have a solution - ecosystem CI with testing both - your and our latest development head we can find it very early and prevent releasing eventually bad version... 👍

What is needed to do?

have some tests, including PL integration

add config to ecosystem CI - https://github.com/PyTorchLightning/ecosystem-ci

What will you get?

scheduled nightly testing configured for development/stable versions

slack notification if something went wrong to investigate

testing also on multi-GPU machine as our gift to you 🐰

cc: @borda
opened by pl-ghost 2
Which TinyBERT models used for student initialisation?
❓ Questions & Help

Details

Hello again @ykim362,

I'm trying to reproduce your distillation results from Section 2 of the FastFormers paper and I have a few questions I was hoping you could help with:

1. Did you use the weights provided in the TinyBERT repo (link) or those provided by Huawei in the HuggingFace model hub (link)?

2. Did you use General_TinyBERT(Nlayer-Ddim) or General_TinyBERT_v2(Nlayer-Ddim)?

3. I noticed that the Huawei models on the HuggingFace hub do not appear to be compatible with the Transformers library, so e.g. I get errors like the following:

>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("huawei-noah/TinyBERT_General_4L_312D")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lewtun/git/transformers/src/transformers/models/auto/tokenization_auto.py", line 345, in from_pretrained
    config = AutoConfig.from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/Users/lewtun/git/transformers/src/transformers/models/auto/configuration_auto.py", line 360, in from_pretrained
    raise ValueError(
ValueError: Unrecognized model in huawei-noah/TinyBERT_General_4L_312D. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: retribert, mt5, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, mpnet, bart, blenderbot, reformer, longformer, roberta, deberta, flaubert, fsmt, squeezebert, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm-prophetnet, prophetnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag, tapas

Did you have to do something special to load TinyBERT in your FastFormers experiments? Looking at your source code (link) it seems you use the standard from_pretrained methods of the Transformers library, so I'm curious whether you encountered the same problem.

4. Did you use the data augmentation technique from TinyBERT (i.e. combine BERT with GloVe word embeddings) in your experiments? Looking at your codebase, I could not see this being used, but just want to double-check since it appears to play an important role in the TinyBERT paper.

5. Finally, what values of state_loss_ratio and att_loss_ratio did you use to generate the distilled model in Table 3 of your paper?

For reference, I am not working directly from the fastformers repo, so have the following dependencies:

- `transformers` version: 4.0.0-rc-1
- Platform: Linux-4.15.0-72-generic-x86_64-with-Ubuntu-18.04-bionic
- Python version: 3.6.9
- PyTorch version (GPU?): 1.6.0 (True)
- Tensorflow version (GPU?): 2.3.0 (True)
- Using GPU in script?: (True)
- Using distributed or parallel set-up in script?: None

Thank you!
opened by lewtun 2
Task-agnostic or task-specific distillation used for CPU inference results?

❓ Questions & Help

Details

Hello,

First of all, thank you very much for open-sourcing this research - I expect it will have a large impact on helping bring Transformers to production!

I have a question about the results in Table 3 of your paper.

Is the distilled model with (4L, 312) obtained from task-agnostic or task-specific distillation? In Section 2 you state

Since we are experimenting with various NLU tasks, the capacity of the optimal student model that preserves accuracy may vary with varying level of task’s difficulty. Therefore, we experiment with distilling various sized student models; then, we pick the smaller model among the distilled models that can offer higher accuracy than the original BERT model for each task.

and I could not tell from the codebase which approach you used to generate the numbers on Table 3.

Thank you!

opened by lewtun 2
Run the teacher model (BERT-base) baseline Error
🐛 Bug

Information

Model I am using (Bert, XLNet ...): BERT

Language I am using the model on (English, Chinese ...): English

The problem arises when using:

[x] the official example scripts: (give details below)

The tasks I am working on is:

[x] an official GLUE/SQUaD task: (give the name)

To reproduce

Steps to reproduce the behavior:

run the command below

python examples\fastformers\run_superglue.py --model_type bert --model_name_or_path ..\model\fastformers\teacher_model\teacher-bert-base --task_name BoolQ --output_dir .\out --do_eval --data_dir ..\dataset\fastformers\BoolQ --per_instance_eval_batch_size 1 --use_fixed_seq_length --do_lower_case --max_seq_length 512 --no_cuda

Expected behavior

get error below:

FileNotFoundError: [Errno 2] No such file or directory: '..\dataset\fastformers\BoolQ\tensors_dev_..\model\fastformers\teacher_model\teacher-bert-base_512_boolq_True'

Environment info

transformers version: 2.11.0

Platform: windows

Python version: 3.6

PyTorch version (GPU?): 1.7.0+cpu

Tensorflow version (GPU?): None

Using GPU in script?: No

Using distributed or parallel set-up in script?: No
opened by baiziyuandyufei 2

Releases(v0.2-model)

v0.2-model(Jun 3, 2021)

This updates the example models created by public onnxruntime v1.8.0.
Source code(tar.gz)
Source code(zip)
student-4L-312.tar.gz(93.36 MB)
student-pruned-8h-600.tar.gz(36.60 MB)
student-pruned-9h-900.tar.gz(37.28 MB)
v0.1-model(Oct 27, 2020)
Initial release of FastFormers demo models.

Source code(tar.gz)
Source code(zip)
student-4L-312.tar.gz(106.20 MB)
student-pruned-8h-600.tar.gz(43.85 MB)
student-pruned-9h-900.tar.gz(46.97 MB)
teacher-bert-base.tar.gz(413.69 MB)

Owner

Microsoft

Open source projects and samples from Microsoft

