CCF BDCI BERT System Tuning Challenge Baseline (PyTorch Version)

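This version is implemented with Hugging Face transformers on the PyTorch backend. Because it reads data through OneFlow's dataloader, OneFlow must also be installed. To read the data from another framework, refer to the implementation of the OneflowDataloaderToPytorchDataset class.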

Usage

Install dependencies (prerequisite: PyTorch and OneFlow are already installed in the environment)

pip install transformers pandas
git clone https://github.com/tea321000/hugging_face_competition
cd hugging_face_competition

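Run train_BERT_base.sh and train_BERT_large.sh to start the single-machine, single-GPU baselines. Keeping all other parameters unchanged, adjust the hidden_size parameter in the shell files to see how GPU memory usage changes with hidden_size (you can watch it live with watch -n 0.1 nvidia-smi).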

# hidden_size is the parameter to adjust
python train.py \
--ofrecord_path sample_seq_len_512_example \
--lr 1e-4 --epochs 10 \
--train_batch_size 2 \
--seq_length=512 \
--max_predictions_per_seq=80 \
--num_hidden_layers=24 \
--num_attention_heads=16 \
--hidden_size=1024 \
--vocab_size=30522
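
To get a feel for why hidden_size has such a large effect on memory, the following is a rough sketch that counts the parameters of a standard BERT encoder as a function of hidden_size. It is not part of this repository. The function name bert_param_count is hypothetical, and the defaults intermediate_size = 4 * hidden_size, max_position_embeddings = 512 and type_vocab_size = 2 are assumptions taken from the usual BERT configuration; train.py may use different values. The count covers the embeddings, the encoder layers and the pooler only, not the pretraining heads.

```python
def bert_param_count(hidden_size, num_hidden_layers, vocab_size=30522,
                     max_position_embeddings=512, type_vocab_size=2,
                     intermediate_size=None):
    """Estimate the parameter count of a standard BERT encoder with pooler.

    Assumption: the feed-forward width defaults to 4 * hidden_size,
    as in the usual BERT configuration.
    """
    if intermediate_size is None:
        intermediate_size = 4 * hidden_size
    h, i = hidden_size, intermediate_size
    layer_norm = 2 * h  # one weight vector and one bias vector

    # Word, position and token-type embeddings, followed by LayerNorm.
    embeddings = (vocab_size + max_position_embeddings + type_vocab_size) * h + layer_norm
    # Query, key, value and output projections (weights + biases), then LayerNorm.
    attention = 4 * (h * h + h) + layer_norm
    # Two dense layers of the feed-forward block, then LayerNorm.
    feed_forward = (h * i + i) + (i * h + h) + layer_norm
    # Dense layer applied to the first token's hidden state.
    pooler = h * h + h

    return embeddings + num_hidden_layers * (attention + feed_forward) + pooler


# Sweep hidden_size with the layer count of the command above.
for hidden_size in (256, 512, 768, 1024):
    n = bert_param_count(hidden_size, num_hidden_layers=24)
    fp32_mib = n * 4 / 2**20  # 4 bytes per fp32 weight
    print(f"hidden_size={hidden_size}: {n:,} parameters, "
          f"about {fp32_mib:,.0f} MiB of fp32 weights")
```

The per-layer term grows roughly with the square of hidden_size, so doubling hidden_size approximately quadruples the encoder weights. The memory reported by nvidia-smi will be noticeably higher than the weight size alone, since training also stores gradients, optimizer state and activations. When picking values to try, note that Hugging Face's BERT implementation expects hidden_size to be divisible by num_attention_heads (16 in the command above).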

Owner

Ziqi Zhou
