SAS: Self-Augmentation Strategy for Language Model Pre-training

This repository contains the official PyTorch implementation of the paper "SAS: Self-Augmentation Strategy for Language Model Pre-training", based on Hugging Face transformers version 4.3.0.

For now, only SAS without the disentangled attention mechanism is released. The repository will be updated.

File structure

train.py: The pre-training script.
run_glue.py: The fine-tuning script.
models
- modeling_sas.py: The main algorithm for SAS.
- trainer_sas.py: Inherited from Hugging Face transformers and modified mainly for data processing.
utils: All utilities.
- data_collator_sas.py: The details of the self-augmentations.
The rest of the code is supporting code.

How to

Download and Install

Clone this repository.
Download the wiki-corpus dataset and store it in the data folder. Currently, we provide only trial data with 1 million sentences. The full dataset can be pre-processed following BERT; details will be released.

(Optional) Create a conda environment from the provided environment.yml.
- You can also install the packages manually:
  - Python==3.9, pytorch==1.10.0, transformers==4.3.0, etc.
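After a manual install, the pinned versions can be checked from Python. The snippet below is a minimal sketch, not part of the repository: `check_pins` and `PINNED` are hypothetical names, and it assumes the packages were installed under their PyPI distribution names (`torch` for PyTorch).

```python
from importlib import metadata

# Pins taken from the README; the PyPI distribution for PyTorch is "torch".
PINNED = {"torch": "1.10.0", "transformers": "4.3.0"}


def check_pins(pins, get_version=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed at all
        # Local build tags such as "1.10.0+cu113" still count as a match.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `check_pins(PINNED)` inside the activated environment returns an empty dict when both pins are satisfied.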

    # Clone package
    git clone git@github.com:fei960922/SAS-Self-Augmentation-Strategy.git
    cd SAS-Self-Augmentation-Strategy

    # Establish the environment.
    conda env create -f environment.yml 
    conda activate cssl

    # Download the trial dataset (store it in the data folder)
    wget http://www.stat.ucla.edu/~yifeixu/sas/wiki_corpus_1M.npy
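The download can also be scripted so that the file lands in the data folder and is not fetched twice. This is a minimal sketch using only the standard library; `target_path` and `fetch` are hypothetical helpers, and the folder name `data` relative to the repository root is an assumption based on the instructions above.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

# URL taken from the README; "data" as the folder name is an assumption.
CORPUS_URL = "http://www.stat.ucla.edu/~yifeixu/sas/wiki_corpus_1M.npy"


def target_path(url, data_dir="data"):
    """Map a download URL to a file path inside data_dir."""
    return Path(data_dir) / Path(urlparse(url).path).name


def fetch(url, data_dir="data"):
    """Download url into data_dir unless the file is already there."""
    dest = target_path(url, data_dir)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, dest)
    return dest
```

Running `fetch(CORPUS_URL)` from the repository root would store the trial corpus as `data/wiki_corpus_1M.npy` and return that path.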

Train from scratch

    # Run default setting 
    bash script/pretrain.sh

    # Run custom setting
    python train.py

    # Start from a checkpoint
    python train.py --start_from_checkpoint 1 --pretrain_path {PATH_TO_CHECKPOINT}

Calculate GLUE scores

    # Running this script downloads the GLUE dataset automatically.
    bash finetune.sh MNLI 0 sas-base output_dir 5e-5 32 4 42
    bash finetune.sh MNLI 0 sas-small output_dir 1e-4 32 4 42
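To score several GLUE tasks, the same command line can be generated once per task. The sketch below is an illustration only: `finetune_commands` is a hypothetical helper, the trailing arguments are copied verbatim from the sas-base example above without being interpreted, and any task name other than MNLI is an assumption about what finetune.sh accepts.

```python
import shlex

# Trailing arguments copied verbatim from the README's sas-base example; this
# sketch does not interpret them, it only swaps the task name in front.
BASE_ARGS = ["0", "sas-base", "output_dir", "5e-5", "32", "4", "42"]


def finetune_commands(tasks, base_args=BASE_ARGS):
    """Return one 'bash finetune.sh ...' command line per task name."""
    return [
        shlex.join(["bash", "finetune.sh", task, *base_args])
        for task in tasks
    ]
```

For example, `finetune_commands(["MNLI"])` reproduces the first command shown above; each returned string can then be pasted into a shell or passed to a job scheduler.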

Owner

Alibaba
