INSPIRED: A Transparent Dialogue Dataset for Interactive Semantic Parsing

Existing studies on semantic parsing focus primarily on mapping a natural-language utterance to a corresponding logical form in one turn. However, natural language is often ambiguous and highly variable, which makes single-turn parsing difficult. In this work, we investigate an interactive semantic parsing framework that explains the predicted logical form step by step in natural language and lets the user correct individual steps through natural-language feedback. We focus on question answering over knowledge bases (KBQA) as an instantiation of our framework, aiming to increase the transparency of the parsing process and help the user appropriately trust the final answer. To do so, we construct INSPIRED, a crowdsourced dialogue dataset derived from the ComplexWebQuestions (CWQ) dataset.

This repository will contain the dataset and code for our paper "Towards Transparent Interactive Semantic Parsing via Step-by-Step Correction".

Data

Dataset Download

The dataset is available at this path: ./data/dataset.jsonl
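
As a minimal sketch, assuming the file follows the JSON Lines convention suggested by its .jsonl extension (one JSON object per line), it could be read with Python's standard library alone. The function name iter_examples is hypothetical and is not part of this repository.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_examples(path: str = "./data/dataset.jsonl") -> Iterator[dict]:
    """Yield one example (a dict) per non-empty line of a JSON Lines file.

    Hypothetical helper for illustration; the default path is the one
    given above and is assumed to be relative to the repository root.
    """
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines, if any
                yield json.loads(line)
```

Reading lazily avoids holding the whole file in memory; wrapping the call in list(...) loads everything at once if that is more convenient.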

Data Structure

In the dataset file, each line is a JSON object with the following keys:

{
    "id": "ID number",
    "cwq_question": "Original complex question in the CWQ dataset",
    "rephrased_question": "Complex question as rephrased by workers",
    "rephrased_question_label": "'Replacement' or 'Alternative'",
    "question": "Same as rephrased_question if rephrased_question_label is 'Replacement'; otherwise the same as cwq_question",
    "final_answer": "Final answer to the complex question",
    "gold_parse": "Gold SPARQL query for the complex question",
    "preprocessed_gold_parse": "Preprocessed gold parse with entities and prefix replaced",
    "predicted_parse": "SPARQL query predicted by the initial semantic parser",
    "gold_sub_lfs": "A list of gold sub-logical forms after decomposition",
    "pred_sub_lfs": "A list of predicted sub-logical forms after decomposition",
    "gold_sub_qs": [
        {
            "sub_id": "ID of the sub-question",
            "sub_question": "Rephrased sub-question",
            "temp_sub_question": "Templated sub-question for the gold sub-logical form",
            "answer": "Answer to the sub-question"
        }, "..."],
    "pred_sub_qs": [
        {
            "sub_id": "ID of the sub-question",
            "sub_question": "Rephrased sub-question",
            "temp_sub_question": "Templated sub-question for the predicted sub-logical form",
            "answer": "Answer to the sub-question"
        }, "..."],
    "feedback": "A list of human feedback"
}
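
The sketch below illustrates how these keys might be used once a record has been parsed into a Python dict. It assumes only the key names listed above. The helper names (resolve_question, list_steps) and the toy record are hypothetical: the toy values are invented placeholders, not taken from the dataset.

```python
from typing import List, Tuple


def resolve_question(example: dict) -> str:
    """Apply the rule stated for the "question" key: use the rephrased
    question when the label is 'Replacement', else the original CWQ one."""
    if example["rephrased_question_label"].strip() == "Replacement":
        return example["rephrased_question"]
    return example["cwq_question"]


def list_steps(example: dict, key: str = "pred_sub_qs") -> List[Tuple[str, str, str]]:
    """Return (sub_id, sub_question, answer) for each sub-question under
    `key`, which should be "gold_sub_qs" or "pred_sub_qs"."""
    return [(s["sub_id"], s["sub_question"], s["answer"]) for s in example[key]]


# Toy record with invented placeholder values, for illustration only.
toy = {
    "cwq_question": "original question",
    "rephrased_question": "rephrased question",
    "rephrased_question_label": "Alternative",
    "pred_sub_qs": [
        {
            "sub_id": "1",
            "sub_question": "first step",
            "temp_sub_question": "templated first step",
            "answer": "placeholder answer",
        }
    ],
}

print(resolve_question(toy))
print(list_steps(toy))
```

Comparing resolve_question(example) against the stored "question" value is one way to sanity-check a record, under the assumption that the stored field follows the stated rule exactly.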
