Fastseq

A text generation acceleration framework based on ONNX Runtime

1. Environment setup

# Create the onnx conda environment
conda create -n onnx_py38 python=3.8
conda activate onnx_py38
conda install pytorch cudatoolkit=10.2 -c pytorch

# Install onnxruntime-gpu (so far only version 1.5.2 has been tested successfully)
pip install onnxruntime-gpu==1.5.2

# Install transformers version 3.1.0
pip install transformers==3.1.0
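
Since only onnxruntime-gpu 1.5.2 is reported as tested, it can help to confirm the pins before going further. The following is a minimal sketch that uses only the Python standard library; the helper name find_mismatches and the REQUIRED table are illustrative and not part of Fastseq.

```python
from importlib import metadata

# Pins taken from the setup steps above.
REQUIRED = {"onnxruntime-gpu": "1.5.2", "transformers": "3.1.0"}


def find_mismatches(required):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed in this environment
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(REQUIRED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Run inside the activated onnx_py38 environment, the script prints nothing when both pins match.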

2. ONNX conversion

# Convert a model/checkpoint saved with Hugging Face to ONNX format, using the conversion tool that ships with onnxruntime.
python -m onnxruntime.transformers.convert_to_onnx \
    -m "path_to_checkpoint/model_name(gpt2)" \
    --model_class GPT2LMHeadModel \
    --output gpt2_fp32.onnx \
    -p fp32
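
When several checkpoints need converting, the same command can be driven from Python. This is a sketch, not part of Fastseq: build_convert_command and convert are hypothetical helper names, and the sketch passes only the flags that appear in the command above.

```python
import subprocess
import sys


def build_convert_command(checkpoint, output_path, precision="fp32"):
    """Assemble the convert_to_onnx invocation using only the flags shown above."""
    return [
        sys.executable, "-m", "onnxruntime.transformers.convert_to_onnx",
        "-m", checkpoint,
        "--model_class", "GPT2LMHeadModel",
        "--output", output_path,
        "-p", precision,
    ]


def convert(checkpoint, output_path, precision="fp32"):
    """Run the conversion and raise CalledProcessError if the tool exits non-zero."""
    subprocess.run(build_convert_command(checkpoint, output_path, precision), check=True)


if __name__ == "__main__":
    # "path_to_checkpoint" is the placeholder from the command above.
    print(" ".join(build_convert_command("path_to_checkpoint", "gpt2_fp32.onnx")))
```

Using sys.executable keeps the conversion in the same conda environment as the calling script.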

3. Demo test

CUDA_VISIBLE_DEVICES=3 python demo.py \
    --onnx_model_path "./gpt2_fp32.onnx" \
    --model_name_or_path "path_to_checkpoint" \
    --prompt_text "here is an example of gpt2 model" \
    --do_sample_top_k 5
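
The flag --do_sample_top_k 5 presumably selects top-k sampling with k = 5: at each step the next token is drawn only from the five highest-scoring candidates. demo.py is not shown here, so the sketch below illustrates the general technique in plain Python under that assumption and is not the demo's actual implementation; sample_top_k and the toy logits are made up for illustration.

```python
import math
import random


def sample_top_k(logits, k, rng=random):
    """Sample a token id from the k highest-scoring entries of `logits`."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")
    # Indices of the k largest logits, best first.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to those k entries (shifted by the max for stability).
    peak = logits[top[0]]
    weights = [math.exp(logits[i] - peak) for i in top]
    # random.choices accepts unnormalized weights.
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [2.0, 0.5, 1.5, -1.0, 0.1, 3.0, -0.3]  # made-up scores for a 7-token vocabulary
    print(sample_top_k(toy_logits, k=5))
```

With k = 1 this reduces to greedy decoding; larger k trades determinism for more varied output.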

Owner

Jun Gao
