Source code for "Pack Together: Entity and Relation Extraction with Levitated Marker"

Last update: Dec 30, 2022

PL-Marker

Source code for Pack Together: Entity and Relation Extraction with Levitated Marker.

Quick links

Overview
Setup
Training Script
Quick Start
Use TypeMarker
Citation

Overview

In this work, we present a novel span representation approach, named Packed Levitated Markers, to consider the dependencies between the spans (pairs) by strategically packing the markers in the encoder. Our approach is evaluated on two typical span (pair) representation tasks:

Named Entity Recognition (NER): We adopt a group packing strategy that lets the model process a large number of spans together, so it can consider their dependencies with limited resources.
Relation Extraction (RE): We adopt a subject-oriented packing strategy that packs each subject and all of its objects into one instance, so the model can capture the dependencies between span pairs that share a subject.
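The two packing strategies can be illustrated with a minimal sketch. This is not the repository's implementation: the function names, the span ordering, and the group size are assumptions made for illustration, and the sketch only shows how spans are grouped, not how markers are inserted into the encoder input.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def enumerate_spans(num_tokens: int, max_span_len: int) -> List[Span]:
    """List all candidate spans up to max_span_len tokens, ordered by start, then end."""
    return [
        (start, end)
        for start in range(num_tokens)
        for end in range(start, min(start + max_span_len, num_tokens))
    ]


def pack_into_groups(spans: List[Span], group_size: int) -> List[List[Span]]:
    """NER-style packing (assumed form): split the ordered spans into fixed-size groups.

    Each group would be encoded together, with one marker pair per span.
    """
    return [spans[i:i + group_size] for i in range(0, len(spans), group_size)]


def pack_by_subject(entities: List[Span]) -> List[Tuple[Span, List[Span]]]:
    """RE-style packing (assumed form): one instance per subject with all candidate objects."""
    return [
        (subj, [obj for obj in entities if obj != subj])
        for subj in entities
    ]


tokens = "PL-Marker packs levitated markers".split()
spans = enumerate_spans(len(tokens), max_span_len=2)
groups = pack_into_groups(spans, group_size=3)
instances = pack_by_subject([(0, 0), (2, 3)])

print(len(spans), [len(g) for g in groups])
print(instances)
```

Here the four-token sentence yields seven candidate spans, which are split into groups of at most three, and each of the two hypothetical entities becomes the subject of one RE instance.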

Please find more details of this work in our paper.

Setup

Install Dependencies

The code is based on huggingface's transformers.

Install dependencies and apex:

pip3 install -r requirement.txt
pip3 install --editable transformers

Download and preprocess the datasets

Our experiments are based on six datasets: CoNLL03, OntoNotes 5.0, Few-NERD, ACE04, ACE05, and SciERC. Please find the links and pre-processing below:

CoNLL03: We use the English part of CoNLL03.
OntoNotes: We use preprocess_ontonotes.py to preprocess OntoNotes 5.0.
Few-NERD: The dataset can be downloaded from their website.
ACE04/ACE05: We use the preprocessing code from the DyGIE repo. Please follow its instructions to preprocess the ACE05 and ACE04 datasets.
SciERC: The preprocessed SciERC dataset can be downloaded from their project website.
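After preprocessing, a quick sanity check on the data can help catch format problems early. The sketch below assumes the DyGIE-style layout: one JSON document per line, with "sentences", "ner", and "relations" keys, and the label as the last element of each entity or relation entry. The sample document is made up for illustration and is not taken from any dataset; summarize is a hypothetical helper, not part of this repository.

```python
import json
from collections import Counter

# A made-up one-document example in the assumed DyGIE-style layout.
sample_line = json.dumps({
    "doc_key": "example-doc",
    "sentences": [["We", "train", "a", "span", "model", "."]],
    "ner": [[[3, 4, "Method"]]],
    "relations": [[]],
})


def summarize(lines):
    """Count sentences, entity labels, and relation labels across JSON lines."""
    n_sentences = 0
    ent_labels, rel_labels = Counter(), Counter()
    for line in lines:
        doc = json.loads(line)
        n_sentences += len(doc["sentences"])
        # Each sentence has its own list of [start, end, label] entities.
        for sent_ents in doc.get("ner", []):
            ent_labels.update(label for _, _, label in sent_ents)
        # Each relation entry ends with its label.
        for sent_rels in doc.get("relations", []):
            rel_labels.update(rel[-1] for rel in sent_rels)
    return n_sentences, ent_labels, rel_labels


n_sentences, ent_labels, rel_labels = summarize([sample_line])
print(n_sentences, dict(ent_labels), dict(rel_labels))
```

To check a real file, pass an open file handle instead of the sample list, for example summarize(open("scierc/train.json")), assuming the data sits under the scierc directory used in the commands in this README.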

Pre-trained Models

We release our pre-trained NER models and RE models for ACE05 and SciERC datasets on Google Drive/Tsinghua Cloud.

Note: the performance of the pre-trained models might be slightly different from the reported numbers in the paper, since we reported the average numbers based on multiple runs.

Training Script

Train NER Models:

bash scripts/run_train_ner_PLMarker.sh
bash scripts/run_train_ner_BIO.sh
bash scripts/run_train_ner_TokenCat.sh

Train RE Models:

bash run_train_re.sh

Quick Start

The following commands can be used to run our pre-trained models on SciERC.

Evaluate the NER model:

CUDA_VISIBLE_DEVICES=0  python3  run_acener.py  --model_type bertspanmarker  \
    --model_name_or_path  ../bert_models/scibert-uncased  --do_lower_case  \
    --data_dir scierc  \
    --learning_rate 2e-5  --num_train_epochs 50  --per_gpu_train_batch_size  8  --per_gpu_eval_batch_size 16  --gradient_accumulation_steps 1  \
    --max_seq_length 512  --save_steps 2000  --max_pair_length 256  --max_mention_ori_length 8    \
    --do_eval  --evaluate_during_training   --eval_all_checkpoints  \
    --fp16  --seed 42  --onedropout  --lminit  \
    --train_file train.json --dev_file dev.json --test_file test.json  \
    --output_dir sciner_models/sciner-scibert  --overwrite_output_dir  --output_results

Evaluate the RE model:

CUDA_VISIBLE_DEVICES=0  python3  run_re.py  --model_type bertsub  \
    --model_name_or_path  ../bert_models/scibert-uncased  --do_lower_case  \
    --data_dir scierc  \
    --learning_rate 2e-5  --num_train_epochs 10  --per_gpu_train_batch_size  8  --per_gpu_eval_batch_size 16  --gradient_accumulation_steps 1  \
    --max_seq_length 256  --max_pair_length 16  --save_steps 2500  \
    --do_eval  --evaluate_during_training   --eval_all_checkpoints  --eval_logsoftmax  \
    --fp16  --lminit   \
    --test_file sciner_models/sciner-scibert/ent_pred_test.json  \
    --use_ner_results \
    --output_dir scire_models/scire-scibert

Here, --use_ner_results means that the RE model uses the original entity types predicted by the NER model.

TypeMarker

If we use the flag --use_typemarker for the RE models, the results are:

Model	Ent	Rel	Rel+
ACE05-UnTypeMarker (in paper)	89.7	68.8	66.3
ACE05-TypeMarker	89.7	67.5	65.2
SciERC-UnTypeMarker (in paper)	69.9	52.0	40.6
SciERC-TypeMarker	69.9	52.5	40.9

Since TypeMarker improves performance on SciERC but hurts performance on ACE05, we did not use it in the paper.
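The size of the effect can be read off the table by subtracting the UnTypeMarker scores from the TypeMarker scores. The short sketch below does this arithmetic; the numbers are copied from the table above, and the dictionary layout is just an illustrative choice.

```python
# Scores copied from the table above, in the order (Ent, Rel, Rel+).
results = {
    "ACE05": {"untyped": (89.7, 68.8, 66.3), "typed": (89.7, 67.5, 65.2)},
    "SciERC": {"untyped": (69.9, 52.0, 40.6), "typed": (69.9, 52.5, 40.9)},
}

# TypeMarker minus UnTypeMarker, rounded to one decimal like the table.
deltas = {
    dataset: tuple(round(t - u, 1) for t, u in zip(rows["typed"], rows["untyped"]))
    for dataset, rows in results.items()
}

for dataset, (ent, rel, rel_plus) in deltas.items():
    print(f"{dataset}: Ent {ent:+.1f}, Rel {rel:+.1f}, Rel+ {rel_plus:+.1f}")
```

The entity scores are unchanged because the flag only affects the RE model. On ACE05, Rel drops by 1.3 and Rel+ by 1.1, while on SciERC they rise by 0.5 and 0.3.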

Citation

If you use our code in your research, please cite our work:

@article{ye2021plmarker,
  author    = {Deming Ye and Yankai Lin and Maosong Sun},
  title     = {Pack Together: Entity and Relation Extraction with Levitated Marker},
  journal   = {arXiv Preprint},
  year      = {2021}
}


Owner

THUNLP
