Pytorch implementation for the EMNLP 2020 (Findings) paper: Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering

Overview

Path-Generator-QA

This is a Pytorch implementation for the EMNLP 2020 (Findings) paper: Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering [arxiv][project page]

Code folders:

(1) learning-generator: conduct path sampling and then train the path generator.

(2) commonse-qa: use the generator to generate paths and then train the qa system on task dataset.

(3) A-Commonsense-Path-Generator-for-Connecting-Entities.ipynb: The notebook illustrating how to use our proposed generator to connect a pair of entities with a commonsense relational path.

Part of this code and instruction rely on our another project [code][arxiv]. Please cite both of our works if you use this code. Thanks!

@article{wang2020connecting,
  title={Connecting the Dots: A Knowledgeable Path Generator for Commonsense Question Answering},
  author={Wang, Peifeng and Peng, Nanyun and Szekely, Pedro and Ren, Xiang},
  journal={arXiv preprint arXiv:2005.00691},
  year={2020}
}

@article{feng2020scalable,
  title={Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering},
  author={Feng, Yanlin and Chen, Xinyue and Lin, Bill Yuchen and Wang, Peifeng and Yan, Jun and Ren, Xiang},
  journal={arXiv preprint arXiv:2005.00646},
  year={2020}
}

Dependencies

  • Python >= 3.6
  • PyTorch == 1.1
  • transformers == 2.8.0
  • dgl == 0.3 (GPU version)
  • networkx == 2.3

Run the following commands to create a conda environment:

conda create -n pgqa python=3.6
source activate pgqa
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
pip install dgl-cu100
pip install transformers==2.8.0 tqdm networkx==2.3 nltk spacy==2.1.6
python -m spacy download en

For training a path generator

cd learning-generator
cd data
unzip conceptnet.zip
cd ..
python sample_path_rw.py

After path sampling, shuffle the resulting data './data/sample_path/sample_path.txt' and then split them into train.txt, dev.txt and test.txt by ratio of 0.9:0.05:0.05 under './data/sample_path/'

Then you can start to train the path generator by running

# the first arg is for specifying which gpu to use
./run.sh $gpu_device

The checkpoint of the path generator would be stored in './checkpoints/model.ckpt'. Move it to '../commonsense-qa/saved_models/pretrain_generator'. So far, we are done with training the generator.

Alternatively, you can also download our well-trained path generator checkpoint.

For training a commonsense qa system

1. Download Data

First, you need to download all the necessary data in order to train the model:

cd commonsense-qa
bash scripts/download.sh

2. Preprocess

To preprocess the data, run:

python preprocess.py

3. Using the path generator to connect question-answer entities

(Modify ./config/path_generate.config to specify the dataset and gpu device)

./scripts/run_generate.sh

4. Commonsense QA system training

bash scripts/run_main.sh ./config/csqa.config

Training process and final evaluation results would be stored in './saved_models/'

Owner
Peifeng Wang
Peifeng Wang
[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

MosaicKD Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data" 1. Motivation Natural images share common l

ZJU-VIPA 37 Nov 10, 2022
A Benchmark For Measuring Systematic Generalization of Multi-Hierarchical Reasoning

Orchard Dataset This repository contains the code used for generating the Orchard Dataset, as seen in the Multi-Hierarchical Reasoning in Sequences: S

Bill Pung 1 Jun 05, 2022
Just-Now - This Is Just Now Login Friendlist Cloner Tools

JUST NOW LOGIN FRIENDLIST CLONER TOOLS Install $ apt update $ apt upgrade $ apt

MAHADI HASAN AFRIDI 21 Mar 09, 2022
PyTorch Implementations for DeeplabV3 and PSPNet

Pytorch-segmentation-toolbox DOC Pytorch code for semantic segmentation. This is a minimal code to run PSPnet and Deeplabv3 on Cityscape dataset. Shor

Zilong Huang 746 Dec 15, 2022
Modelisation on galaxy evolution using PEGASE-HR

model_galaxy Modelisation on galaxy evolution using PEGASE-HR This is a labwork done in internship at IAP directed by Damien Le Borgne (https://github

Adrien Anthore 1 Jan 14, 2022
A Python package for generating concise, high-quality summaries of a probability distribution

GoodPoints A Python package for generating concise, high-quality summaries of a probability distribution GoodPoints is a collection of tools for compr

Microsoft 28 Oct 10, 2022
Optimizes image files by converting them to webp while also updating all references.

About Optimizes images by (re-)saving them as webp. For every file it replaced it automatically updates all references. Works on single files as well

Watermelon Wolverine 18 Dec 23, 2022
Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes

Naive-Bayes Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes Downloading Data Set Use our Breast Cancer Wisconsin Data Set Also you can

Faeze Habibi 0 Apr 06, 2022
TuckER: Tensor Factorization for Knowledge Graph Completion

TuckER: Tensor Factorization for Knowledge Graph Completion This codebase contains PyTorch implementation of the paper: TuckER: Tensor Factorization f

Ivana Balazevic 296 Dec 06, 2022
Painting app using Python machine learning and vision technology.

AI Painting App We are making an app that will track our hand and helps us to draw from that. We will be using the advance knowledge of Machine Learni

Badsha Laskar 3 Oct 03, 2022
Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021) This repository is the official P

Jingyun Liang 159 Dec 30, 2022
Train Scene Graph Generation for Visual Genome and GQA in PyTorch >= 1.2 with improved zero and few-shot generalization.

Scene Graph Generation Object Detections Ground truth Scene Graph Generated Scene Graph In this visualization, woman sitting on rock is a zero-shot tr

Boris Knyazev 93 Dec 28, 2022
Image Data Augmentation in Keras

Image data augmentation is a technique that can be used to artificially expand the size of a training dataset by creating modified versions of images in the dataset.

Grace Ugochi Nneji 3 Feb 15, 2022
It is an open dataset for object detection in remote sensing images.

RSOD-Dataset It is an open dataset for object detection in remote sensing images. The dataset includes aircraft, oiltank, playground and overpass. The

136 Dec 08, 2022
Roger Labbe 13k Dec 29, 2022
High-Resolution Image Synthesis with Latent Diffusion Models

Latent Diffusion Models arXiv | BibTeX High-Resolution Image Synthesis with Latent Diffusion Models Robin Rombach*, Andreas Blattmann*, Dominik Lorenz

CompVis Heidelberg 5.6k Dec 30, 2022
Faster RCNN pytorch windows

Faster-RCNN-pytorch-windows Faster RCNN implementation with pytorch for windows Open cmd, compile this comands: cd lib python setup.py build develop T

Hwa-Rang Kim 1 Nov 11, 2022
The original weights of some Caffe models, ported to PyTorch.

pytorch-caffe-models This repo contains the original weights of some Caffe models, ported to PyTorch. Currently there are: GoogLeNet (Going Deeper wit

Katherine Crowson 9 Nov 04, 2022
Static Features Classifier - A static features classifier for Point-Could clusters using an Attention-RNN model

Static Features Classifier This is a static features classifier for Point-Could

ABDALKARIM MOHTASIB 1 Jan 25, 2022
Official repository for Hierarchical Opacity Propagation for Image Matting

HOP-Matting Official repository for Hierarchical Opacity Propagation for Image Matting 🚧 🚧 🚧 Under Construction 🚧 🚧 🚧 🚧 🚧 🚧   Coming Soon   

Li Yaoyi 54 Dec 30, 2021