GLM (General Language Model)

Related tags

Deep LearningGLM
Overview

GLM

GLM is a General Language Model pretrained with an autoregressive blank-filling objective and can be finetuned on various natural language understanding and generation tasks.

Please refer to our paper for a detailed description of GLM:

All NLP Tasks Are Generation Tasks: A General Pretraining Framework

Zhengxiao Du*, Yujie Qian*, Xiao Liu, Ming Ding, Jiezhong Qiu, Zhilin Yang, Jie Tang (*: equal contribution)

Part of the code is based on Megatron-LM and PET.

Pretrained Models

You can download the pretrained models used in the paper here.

Name Params File Config
GLM-Base 110M glm-base-blank.tar.bz2 model_blocklm_base.sh
GLM-Large 335M glm-large-blank.tar.bz2 model_blocklm_large.sh
GLM-Large (multi-task) 335M glm-large-generation.tar.bz2 model_blocklm_large_generation.sh
GLM-410M (multi-task) 410M glm-1.25-generation.tar.bz2 model_blocklm_1.25_generation.sh
GLM-515M (multi-task) 515M glm-1.5-generation.tar.bz2 model_blocklm_1.5_generation.sh
GLM-RoBERTa 335M glm-roberta-large-blank.tar.bz2 model_blocklm_roberta_large.sh

Installation

Clone this repo

git clone https://github.com/THUDM/GLM
cd GLM

Please first install PyTorch (we use 1.7.0) and apex, and then install other dependencies by

pip install -r requirements.txt

Usage

We provide scripts for finetuning GLM on some downstream tasks.

SuperGLUE

  • Download the SuperGlue data and check the experiment setup in scripts/finetune_superglue.sh. Note that DATA_ROOT, CHECKPOINT_PATH, SAVE_PATH need to be changed to your local path. You may also change the batch-size and nproc_per_node according to your available hardware. We suggest to use aggregated batch size 64 for MultiRC and ReCORD and 16 for other tasks.

  • Run the following script (use the COPA dataset as an example)

bash scripts/finetune_superglue.sh \
     config_tasks/model_blocklm_roberta_large.sh \
     config_tasks/task_copa.sh
  • To apply GLM to a new NLU dataset with cloze-filling finetuning, implement a DataProcessor in tasks/superglue/dataset.py for data loading and add a PVP in tasks/superglue/pvp.py for the cloze question. More details can be found here.

  • The cloze questions (prompts) used in this work are written by human. We are also studying a P-tuning (prompt tuning) approach to search for the optimal continuous prompt. Please refer to our paper and code.

Text Summarization

  • Download the Gigaword dataset and check the experiment setup in scripts/finetune_seq2seq.sh. Change DATA_ROOT, CHECKPOINT_PATH, SAVE_PATH to your local path.

  • Run the following script

bash scripts/finetune_seq2seq.sh \ 
     config_tasks/model_blocklm_large_generation.sh \ 
     config_tasks/seq_gigaword.sh
  • For calculating rouge, install file2rouge from here and run bash scripts/evaluate_seq2seq.sh

Language Modeling

LAMBADA Cloze Accuracy

bash scripts/evaluate_lm.sh \ 
     config_tasks/model_blocklm_large_generation.sh \
     config_tasks/zero_lambada.sh 

LM Perplexity

  • Download our test set of wikibook (or any dataset following the same format) and change DATA_ROOT, CHECKPOINT_PATH in scripts/evaluate_lm.sh
  • Run the following script
    bash scripts/evaluate_lm.sh \ 
       config_tasks/model_blocklm_large_generation.sh \
       config_tasks/zero_lm.sh 

Blank Language Model

  • Download the Yahoo dataset and check the experiment setup in scripts/finetune_blank.sh. Change DATA_ROOT, CHECKPOINT_PATH, SAVE_PATH to your local path.

  • Run the following script

bash scripts/finetune_blank.sh \ 
     config_tasks/model_blocklm_large.sh \ 
     config_tasks/seq_blank.sh

Blank Filling (Interactive)

  • Change CHECKPOINT_PATH to your local path. Run the following script
bash scripts/generate_block.sh \
     config_tasks/model_blocklm_large.sh

Example:

Context: Ng is an adjunct professor at [MASK] (formerly associate professor and Director of its Stanford AI Lab or SAIL ). Also a pioneer in online education, Ng co-founded Coursera and deeplearning.ai.

GLM: [CLS] ng is an adjunct professor at [MASK] ( formerly associate professor and director of its stanford ai lab or sail ) . also a pioneer in online education , ng co - founded coursera and deeplearning . ai . [PAD] <|startofpiece|> the stanford university

Citation

Please cite our paper if you find this code useful for your research:

@article{DBLP:journals/corr/abs-2103-10360,
  author    = {Zhengxiao Du and
               Yujie Qian and
               Xiao Liu and
               Ming Ding and
               Jiezhong Qiu and
               Zhilin Yang and
               Jie Tang},
  title     = {All {NLP} Tasks Are Generation Tasks: {A} General Pretraining Framework},
  journal   = {CoRR},
  volume    = {abs/2103.10360},
  year      = {2021},
  url       = {https://arxiv.org/abs/2103.10360}
}
Owner
THUDM
Data Mining Research Group at Tsinghua University
THUDM
Neural Dynamic Policies for End-to-End Sensorimotor Learning

This is a PyTorch based implementation for our NeurIPS 2020 paper on Neural Dynamic Policies for end-to-end sensorimotor learning.

Shikhar Bahl 47 Dec 11, 2022
CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images

Code and result about CCAFNet(IEEE TMM) 'CCAFNet: Crossflow and Cross-scale Adaptive Fusion Network for Detecting Salient Objects in RGB-D Images' IEE

zyrant丶 14 Dec 29, 2021
A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization

University1652-Baseline [Paper] [Slide] [Explore Drone-view Data] [Explore Satellite-view Data] [Explore Street-view Data] [Video Sample] [中文介绍] This

Zhedong Zheng 335 Jan 06, 2023
BMN: Boundary-Matching Network

BMN: Boundary-Matching Network A pytorch-version implementation codes of paper: "BMN: Boundary-Matching Network for Temporal Action Proposal Generatio

qinxin 260 Dec 06, 2022
A toolkit for developing and comparing reinforcement learning algorithms.

Status: Maintenance (expect bug fixes and minor updates) OpenAI Gym OpenAI Gym is a toolkit for developing and comparing reinforcement learning algori

OpenAI 29.6k Jan 08, 2023
Multi-Content GAN for Few-Shot Font Style Transfer at CVPR 2018

MC-GAN in PyTorch This is the implementation of the Multi-Content GAN for Few-Shot Font Style Transfer. The code was written by Samaneh Azadi. If you

Samaneh Azadi 422 Dec 04, 2022
Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis

Practical Blind Denoising via Swin-Conv-UNet and Data Synthesis [Paper] [Online Demo] The following results are obtained by our SCUNet with purely syn

Kai Zhang 312 Jan 07, 2023
Physics-informed Neural Operator for Learning Partial Differential Equation

PINO Physics-informed Neural Operator for Learning Partial Differential Equation Abstract: Machine learning methods have recently shown promise in sol

107 Jan 02, 2023
《LXMERT: Learning Cross-Modality Encoder Representations from Transformers》(EMNLP 2020)

The Most Important Thing. Our code is developed based on: LXMERT: Learning Cross-Modality Encoder Representations from Transformers

53 Dec 16, 2022
Pseudo lidar - (CVPR 2019) Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

Pseudo-LiDAR from Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving This paper has been accpeted by Conference o

Yan Wang 881 Dec 27, 2022
UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering

UnsupervisedR&R: Unsupervised Pointcloud Registration via Differentiable Rendering This repository holds all the code and data for our recent work on

Mohamed El Banani 118 Dec 06, 2022
一个免费开源一键搭建的通用验证码识别平台,大部分常见的中英数验证码识别都没啥问题。

captcha_server 一个免费开源一键搭建的通用验证码识别平台,大部分常见的中英数验证码识别都没啥问题。 使用方法 python = 3.8 以上环境 pip install -r requirements.txt -i https://pypi.douban.com/simple gun

Sml2h3 189 Dec 02, 2022
LRBoost is a scikit-learn compatible approach to performing linear residual based stacking/boosting.

LRBoost is a sckit-learn compatible package for linear residual boosting. LRBoost combines a linear estimator and a non-linear estimator to leverage t

Andrew Patton 5 Nov 23, 2022
SCNet: Learning Semantic Correspondence

SCNet Code Region matching code is contributed by Kai Han ([email protected]). Dense

Kai Han 34 Sep 06, 2022
Speed-Test - You can check your intenet speed using this tool

Speed-Test Tool By Hez_X AVAILABLE ON : Termux & Kali linux & Ubuntu (Linux E

Hez-X 3 Feb 17, 2022
Draw like Bob Ross using the power of Neural Networks (With PyTorch)!

Draw like Bob Ross using the power of Neural Networks! (+ Pytorch) Learning Process Visualization Getting started Install dependecies Requires python3

Kendrick Tan 116 Mar 07, 2022
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Pytorch Performace Tuning, WandB, AMP, Multi-GPU, TensorRT, Triton

Plant Pathology 2020 FGVC7 Introduction A deep learning model pipeline for training, experimentaiton and deployment for the Kaggle Competition, Plant

Bharat Giddwani 0 Feb 25, 2022
EASY - Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients.

EASY - Ensemble Augmented-Shot Y-shaped Learning: State-Of-The-Art Few-Shot Classification with Simple Ingredients. This repository is the official im

Yassir BENDOU 57 Dec 26, 2022
Neural Message Passing for Computer Vision

Neural Message Passing for Quantum Chemistry Implementation of different models of Neural Networks on graphs as explained in the article proposed by G

Pau Riba 310 Nov 07, 2022