Code for coreference-aware machine reading comprehension

Last update: Sep 29, 2022

Overview

Data and code for the paper "Tracing Origins: Coreference-aware Machine Reading Comprehension", published at ACL 2022.

Dataset

The repository has three folders, one for each model described in the paper: Coref_additive_spacy for Coref_additive_attention, Coref_dgl_spacy for GNN, and Coref_multiplication_spacy for Coref_multiplication_attention. Each folder contains the training set and the dev set under its quoref folder.

Each sample contains the following fields (indentation shows nesting):

context: the paragraph text
context_id: the unique identifier of the context
qas: a group of questions
  question: the question text
  id: the unique identifier of the question
  answers: a group of answers to one question
    text: the answer text
    answer_start: the start position of one answer
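
The field layout above can be walked with a few lines of Python. This is a minimal sketch, not code from the repository: the helper names `iter_answers` and `mismatched_spans` and the toy sample are hypothetical, and it assumes that `answer_start` is a character offset into `context` (as in SQuAD-style data). How the samples are wrapped at the top level of the JSON file is not described here, so the functions take an already-extracted list of samples.

```python
def iter_answers(samples):
    """Yield one flat record per answer from a list of samples.

    Hypothetical helper: field names follow the list above.
    """
    for sample in samples:
        for qa in sample["qas"]:
            for answer in qa["answers"]:
                yield {
                    "context_id": sample["context_id"],
                    "question_id": qa["id"],
                    "question": qa["question"],
                    "text": answer["text"],
                    "answer_start": answer["answer_start"],
                }


def mismatched_spans(samples):
    """Return ids of questions whose answer text is not found at answer_start.

    Assumption: answer_start is a character offset into the context.
    """
    contexts = {s["context_id"]: s["context"] for s in samples}
    bad = []
    for row in iter_answers(samples):
        context = contexts[row["context_id"]]
        start = row["answer_start"]
        if context[start:start + len(row["text"])] != row["text"]:
            bad.append(row["question_id"])
    return bad


# Toy sample for illustration only; it is not taken from the dataset.
toy_samples = [{
    "context": "Anna met Ben. She gave him a book.",
    "context_id": "ctx-0",
    "qas": [{
        "question": "Who gave Ben a book?",
        "id": "q-0",
        "answers": [{"text": "Anna", "answer_start": 0}],
    }],
}]

for row in iter_answers(toy_samples):
    print(row["question_id"], row["text"], row["answer_start"])
```

Running `mismatched_spans` over the loaded samples is a quick sanity check that the offsets line up with the context text before training.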

Models

To use our trained model, download it from Google Drive.

Training

python run_quoref.py --train_file "quoref/train.json" --predict_file "quoref/dev.json" --model_type "roberta_multi" --model_name_or_path "roberta-large" --output_dir "out" --do_train --do_eval --eval_all_checkpoints --learning_rate 1e-5 --num_train_epochs 6 --overwrite_output_dir --per_gpu_train_batch_size 4 --save_steps 6000 --coref_weight 0.4
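
For scripted runs it can help to assemble the same command from named parameters. The sketch below only rebuilds the argument list shown above and prints it; the function name `build_train_command` and its keyword arguments are hypothetical, and no flags beyond those in the command above are assumed.

```python
import shlex


def build_train_command(
    train_file="quoref/train.json",
    predict_file="quoref/dev.json",
    model_type="roberta_multi",
    output_dir="out",
    coref_weight=0.4,
):
    """Return the training command above as an argument list.

    Hypothetical helper: it only mirrors the flags from the README command.
    """
    return [
        "python", "run_quoref.py",
        "--train_file", train_file,
        "--predict_file", predict_file,
        "--model_type", model_type,
        "--model_name_or_path", "roberta-large",
        "--output_dir", output_dir,
        "--do_train",
        "--do_eval",
        "--eval_all_checkpoints",
        "--learning_rate", "1e-5",
        "--num_train_epochs", "6",
        "--overwrite_output_dir",
        "--per_gpu_train_batch_size", "4",
        "--save_steps", "6000",
        "--coref_weight", str(coref_weight),
    ]


# Print the command for inspection; pass the list to subprocess.run to launch it.
print(shlex.join(build_train_command()))
```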

Note

There is an open issue regarding compatibility between NeuralCoref and spaCy 3.0. If you intend to use the latest spaCy models, please follow that issue for updates.

Cite

If you extend or use this work, please cite the paper where it was introduced:

@article{Huang2021TracingOC,
  title={Tracing Origins: Coreference-aware Machine Reading Comprehension},
  author={Baorong Huang and Zhuosheng Zhang and Hai Zhao},
  journal={ArXiv},
  year={2021},
  volume={abs/2110.07961}
}
