Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

Last update: Oct 29, 2022

Overview

GDAP

Environment

Python (verified: v3.8)
CUDA (verified: v11.1)
Packages (see requirements.txt)

Usage

Preprocessing

We follow dygiepp for data preprocessing. The data is then converted into one target format per subtask:

text2et: Event Type Detection
ettext2tri: Trigger Extraction
etrttext2role: Argument Extraction

# data processed by dygiepp
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema 
├── test.json
├── train.json
└── val.json

# data processed by data_convert.convert_text_to_target
data/text2target/dyiepp_ace2005_ettext2tri_subtype
├── event.schema
├── test.json
├── train.json
└── val.json
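
Before training, it can help to confirm that a converted dataset directory contains the four files shown in the listing above. The following is a minimal sketch, not part of this repo: the helper name `missing_files` is hypothetical, and the file names are taken only from the listing.

```python
from pathlib import Path

# File names copied from the directory listing above.
EXPECTED_FILES = ("event.schema", "train.json", "val.json", "test.json")


def missing_files(dataset_dir):
    """Return the expected files that are absent from dataset_dir.

    Hypothetical helper for illustration; it only checks that the files
    exist, not that their contents are valid.
    """
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_files` on a directory under data/text2target returns an empty list when all four files are present.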

Useful commands:

python -m data_convert.convert_text_to_target # data/raw_data -> data/text2target
python convert_dyiepp_to_sentence.py data/raw_data/dyiepp_ace2005 # doc -> sentence, used in evaluation

Training

Relevant scripts:

run_seq2seq.py: Python entry point, adapted from transformers/examples/seq2seq/run_seq2seq.py.
run_seq2seq_span.bash: Model training script; writes its output to a log file.

Example (see the above two files for more details):

# ACE05 event type detection with t5-base; metric_format uses eval_trigger-F1
bash run_seq2seq_span.bash --data=dyiepp_ace2005_text2et_subtype --model=t5-base --format=et --metric_format=eval_trigger-F1

# ACE05 trigger extraction with t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_ettext2tri_subtype --model=t5-base --format=tri --metric_format=eval_trigger-F1

# ACE05 argument extraction with t5-base
bash run_seq2seq_span.bash --data=dyiepp_ace2005_etrttext2role_subtype --model=t5-base --format=role --metric_format=eval_role-F1

Trained models are saved in the models/ folder.
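
The three example commands above differ only in the dataset name, the --format value, and the --metric_format value. The sketch below collects those values in one table and rebuilds the same command lines. It is an illustration only: the names `STAGES` and `build_command` are hypothetical, and the flag values are copied from the examples rather than from the scripts themselves.

```python
# Flag values copied from the three example commands above.
STAGES = {
    "text2et": {"format": "et", "metric_format": "eval_trigger-F1"},
    "ettext2tri": {"format": "tri", "metric_format": "eval_trigger-F1"},
    "etrttext2role": {"format": "role", "metric_format": "eval_role-F1"},
}


def build_command(stage, model="t5-base", corpus="dyiepp_ace2005"):
    """Build the argument list for one training stage (hypothetical helper)."""
    options = STAGES[stage]
    return [
        "bash",
        "run_seq2seq_span.bash",
        f"--data={corpus}_{stage}_subtype",
        f"--model={model}",
        f"--format={options['format']}",
        f"--metric_format={options['metric_format']}",
    ]


# Print the command line for each stage without running it.
for stage in STAGES:
    print(" ".join(build_command(stage)))
```

To launch a stage instead of printing it, the returned list can be passed to `subprocess.run(..., check=True)` from the repository root.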

Evaluation

run_tri_predict.bash: Evaluation and inference script for trigger extraction.
run_arg_predict.bash: Evaluation and inference script for argument extraction.

Todo

We aim to expand the codebase to a wider range of tasks, including:

Named Entity Recognition
Keyword Generation
Event Relation Identification

If you find this repo helpful...

Please give us a ⭐ and cite our paper as:

@misc{si2021-GDAP,
      title={Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works}, 
      author={Jinghui Si and Xutan Peng and Chen Li and Haotian Xu and Jianxin Li},
      year={2021},
      eprint={2110.04525},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

This project borrows code from Text2Event.
