Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering

Last update: Jun 21, 2022

Disfl-QA is a targeted dataset for contextual disfluencies in an information-seeking setting, namely question answering over Wikipedia passages. Disfl-QA builds upon the SQuAD-v2 (Rajpurkar et al., 2018) dataset: each question in the dev set is annotated to add a contextual disfluency, using the paragraph as a source of distractors.

The final dataset consists of ~12k (disfluent question, answer) pairs. Over 90% of the disfluencies are corrections or restarts, making it a much harder test set for disfluency correction. Disfl-QA aims to fill a major gap between the speech and NLP research communities. We hope the dataset can serve as a benchmark for testing the robustness of models against disfluent inputs.

Our experiments reveal that state-of-the-art models are brittle when subjected to disfluent inputs from Disfl-QA. Detailed experiments and analyses can be found in our paper.

Dataset Description

Disfl-QA consists of ~12k disfluent questions with the following train/dev/test splits:

File	Questions
train.json	7182
dev.json	1000
test.json	3643

Each JSON file maps a SQuAD-v2 question ID to the original question (SQuAD-v2) and its disfluent version (Disfl-QA), in the following format:

{ 
  "squad_v2_id":
  {
    "original": Original question from SQuAD-v2,
    "disfluent": Disfluent question from Disfl-QA
  }, ...
}

Note: The squad_v2_id corresponds to the unique data.paragraphs.qas.id in the SQuAD-v2 development set.
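
As a minimal sketch of reading this format, the snippet below loads one split and flattens it into (squad_v2_id, original, disfluent) tuples. The helper name load_disfl_qa is hypothetical and not part of the dataset release; it only assumes the JSON layout described above.

```python
import json
from pathlib import Path


def load_disfl_qa(path):
    """Read one Disfl-QA split and return (squad_v2_id, original, disfluent) tuples.

    Hypothetical helper: assumes the file is a single JSON object keyed by
    squad_v2_id, with "original" and "disfluent" fields per entry.
    """
    with Path(path).open(encoding="utf-8") as f:
        records = json.load(f)
    return [
        (qid, pair["original"], pair["disfluent"])
        for qid, pair in records.items()
    ]
```

Calling load_disfl_qa("train.json") from a directory that contains the file should return one tuple per question; if the file matches the table above, that is 7182 entries.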

Here's an example from the dataset:

{
  "56ddde6b9a695914005b9628": {
    "original": "In what country is Normandy located?",
    "disfluent": "In what country is Norse found no wait Normandy not Norse?"
  },
  "56ddde6b9a695914005b9629": {
    "original": "When were the Normans in Normandy?",
    "disfluent": "From which countries no tell me when were the Normans in Normandy?"
  },
  "56ddde6b9a695914005b962a": {
    "original": "From which countries did the Norse originate?",
    "disfluent": "From which Norse leader I mean countries did the Norse originate?"
  },
  "56ddde6b9a695914005b962b": {
    "original": "Who was the Norse leader?",
    "disfluent": "When I mean Who was the Norse leader?"
  },
  "56ddde6b9a695914005b962c": {
    "original": "What century did the Normans first gain their separate identity?",
    "disfluent": "When no what century did the Normans first gain their separate identity?"
  }
}
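
Because the Disfl-QA files hold only questions, running a QA model on them requires looking up each passage and answer in SQuAD-v2 by ID. The sketch below shows one way to do that join. The function names are hypothetical, the field names (context, qas, answers, text, is_impossible) follow the public SQuAD-v2 JSON layout rather than anything in this repository, and the toy passage is a shortened stand-in, not the real SQuAD paragraph.

```python
def index_squad_by_id(squad):
    """Map each SQuAD question id to its context, answer texts, and answerability."""
    index = {}
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            for qa in paragraph["qas"]:
                index[qa["id"]] = {
                    "context": paragraph["context"],
                    "answers": [a["text"] for a in qa.get("answers", [])],
                    "is_impossible": qa.get("is_impossible", False),
                }
    return index


def attach_context(disfl, squad_index):
    """Yield one example per Disfl-QA entry, using the disfluent question."""
    for qid, pair in disfl.items():
        match = squad_index.get(qid)
        if match is None:
            continue  # id not present in the supplied SQuAD data
        yield {"id": qid, "question": pair["disfluent"], **match}


# Toy data for illustration only: the context is an invented one-line stand-in.
toy_context = "Normandy is a region in France."
toy_squad = {
    "data": [{
        "paragraphs": [{
            "context": toy_context,
            "qas": [{
                "id": "56ddde6b9a695914005b9628",
                "question": "In what country is Normandy located?",
                "answers": [{"text": "France",
                             "answer_start": toy_context.index("France")}],
                "is_impossible": False,
            }],
        }],
    }],
}
toy_disfl = {
    "56ddde6b9a695914005b9628": {
        "original": "In what country is Normandy located?",
        "disfluent": "In what country is Norse found no wait Normandy not Norse?",
    },
}

examples = list(attach_context(toy_disfl, index_squad_by_id(toy_squad)))
```

Swapping pair["disfluent"] for pair["original"] yields the fluent counterpart of each example, which allows a side-by-side comparison of model predictions on the two question forms.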

Citation

If you use or discuss this dataset in your work, please cite it as follows:

@inproceedings{gupta-etal-2021-disflqa,
    title = "{Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering}",
    author = "Gupta, Aditya and Xu, Jiacheng and Upadhyay, Shyam and Yang, Diyi and Faruqui, Manaal",
    booktitle = "Findings of ACL",
    year = "2021"
}

License

The Disfl-QA dataset is licensed under CC BY 4.0.

Contact

If you have a technical question regarding the dataset or publication, please create an issue in this repository.

Owner

Google Research Datasets
