K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (EMNLP Founding 2021)

Last update: Nov 16, 2022

Related tags

Overview

Introduction

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce.

Installation

PyTorch version >= 1.5.0
Python version >= 3.6

git clone https://github.com/pytorch/fairseq.git
cd fairseq 
pip install --editable ./

Pre-training

prepare data for pre-training train.sh

export CUDA_VISIBLE_DEVICES=0,1,2,3

function join_by { local IFS="$1"; shift; echo "$*"; }
DATA_DIR=$(join_by : data/kplug/bin/part*)

USER_DIR=src
TOKENS_PER_SAMPLE=512
WARMUP_UPDATES=10000
PEAK_LR=0.0005
TOTAL_UPDATES=125000
#MAX_SENTENCES=8
MAX_SENTENCES=16
UPDATE_FREQ=16   # batch_size=update_freq*max_sentences*nGPU = 16*16*4 = 1024

SUB_TASK=mlm_clm_sentcls_segcls_titlegen 
## ablation task
#SUB_TASK=clm_sentcls_segcls_titlegen
#SUB_TASK=mlm_sentcls_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_segcls
#SUB_TASK=mlm_clm_segcls_titlegen
#SUB_TASK=mlm_clm_sentcls_titlegen

fairseq-train $DATA_DIR \
    --user-dir $USER_DIR \
    --task multitask_lm \
    --sub-task $SUB_TASK \
    --arch transformer_pretrain_base \
    --min-loss-scale=0.000001 \
    --sample-break-mode none \
    --tokens-per-sample $TOKENS_PER_SAMPLE \
    --criterion multitask_lm \
    --apply-bert-init \
    --max-source-positions 512 --max-target-positions 512 \
    --optimizer adam --adam-betas '(0.9, 0.98)' --adam-eps 1e-6 --clip-norm 0.0 \
    --lr-scheduler polynomial_decay --lr $PEAK_LR \
    --warmup-updates $WARMUP_UPDATES --total-num-update $TOTAL_UPDATES \
    --dropout 0.1 --attention-dropout 0.1 --weight-decay 0.01 \
    --max-sentences $MAX_SENTENCES --update-freq $UPDATE_FREQ \
    --ddp-backend=no_c10d \
    --tensorboard-logdir tensorboard \
    --classification-head-name pretrain_head --num-classes 40 \
    --tagging-head-name pretrain_tag_head --tag-num-classes 2 \
    --fp16

Fine-tuning and Inference

Finetuning on JDDC (Response Generation)

Finetuning on ECD Corpus (Response Retrieval)

Finetuning on JD Product Dataset (Abstractive Summarization)

Finetuning on MEPAVE Dataset (Sequence Tagging)

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce (EMNLP Founding 2021)

Related tags

Overview

Introduction

Installation

Pre-training

Fine-tuning and Inference

Owner

Xu Song

Next-gen Rowhammer fuzzer that uses non-uniform, frequency-based patterns.

For auto aligning, cropping, and scaling HR and LR images for training image based neural networks

The Official Implementation of Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [NIPS 2021].

UniMoCo: Unsupervised, Semi-Supervised and Full-Supervised Visual Representation Learning

Iterative Training: Finding Binary Weight Deep Neural Networks with Layer Binarization

Implementation of ML models like Decision tree, Naive Bayes, Logistic Regression and many other

Related resources for our EMNLP 2021 paper

This is the official implement of paper "ActionCLIP: A New Paradigm for Action Recognition"

Answering Open-Domain Questions of Varying Reasoning Steps from Text

A selection of State Of The Art research papers (and code) on human locomotion (pose + trajectory) prediction (forecasting)

Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF shows significant improvements over baseline fine-tuning without data filtration.

Code for the paper "Controllable Video Captioning with an Exemplar Sentence"

A PyTorch Implementation of FaceBoxes

This repository contains the code used in the paper "Prompt-Based Multi-Modal Image Segmentation".

Comp445 project - Data Communications & Computer Networks

SPTAG: A library for fast approximate nearest neighbor search

Hydra: an Extensible Fuzzing Framework for Finding Semantic Bugs in File Systems

NaturalCC is a sequence modeling toolkit that allows researchers and developers to train custom models

Deep Learning for humans

[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans