Implementation for the EMNLP 2021 paper "Interactive Machine Comprehension with Dynamic Knowledge Graphs".

Overview

Interactive Machine Comprehension with Dynamic Knowledge Graphs


Implementation for the EMNLP 2021 paper.

Dependencies

apt-get -y update
apt-get install -y unzip zip parallel
conda create -p /tmp/imrc python=3.6 numpy scipy cython nltk
conda activate /tmp/imrc
pip install --upgrade pip
pip install numpy==1.16.2
pip install gym==0.15.4
pip install tqdm pipreqs pyyaml pytz visdom
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
pip install transformers
pip install allennlp

Data Preparation

Split SQuAD 1.1 and preprocess

The original SQuAD dataset does not provide its test set, we take 23 wiki articles from its training set as our validation set. We then use the SQuAD dev set as our test set.

# download SQuAD from official website, then
python utils/split_original_squad.py

To speed up training, we parse (tokenization and SRL) the dataset in advance.

python utils/preproc_squad.py

This will result squad_split/processed_squad.1.1.split.[train/valid/test].json, which are used in iMRC tasks.

Preprocess Wikipedia data for self-supervised learning

python utils/get_wiki_filter_squad.py
python utils/split_wiki_data.py

This will result wiki_without_squad/wiki_without_squad_[train/valid/test].json, which are used to pre-train the continuous belief graph generator.

Training

To train the agent equipped with different types of graphs, run:

# without graph
python main.py configs/imrc_none.yaml

# co-occurrence graph
python main.py configs/imrc_cooccur.yaml

# relative position graph
python main.py configs/imrc_rel_pos.yaml

# SRL graph
python main.py configs/imrc_srl.yaml

# continuous belief graph
# in this setting, we need a pre-trained graph generator.
# we provide our pre-trained graph generator at
# https://drive.google.com/drive/folders/1zZ7C_-xaYsfg2Ms7_BO5n3Qzx69UqMKD?usp=sharing

# one can choose to train their own version by:
python pretrain_observation_infomax.py configs/pretrain_cont_bnelief.yaml
# then using the downloaded/saved model checkpoint
python main.py configs/imrc_cont_belief.yaml

To change the task settings/configurations:

general:
  naozi_capacity: 1  # capacity of agent's external memory queue (1, 3, 5)
  generate_or_point: "point"  # "qmpoint": q+o_t, "point": q, "generate": vocab
  disable_prev_next: False  # False: Easy Mode, True: Hard Mode

model:
  recurrent: True  # recurrent component described in Section 3.3 and Section 4.Additional Results

Citation

@inproceedings{Yuan2021imrc_graph,
  title={Interactive Machine Comprehension with Dynamic Knowledge Graphs},
  author={Xingdi Yuan},
  year={2021},
  booktitle="EMNLP",
}
Owner
Xingdi (Eric) Yuan
Senior Research Engineer at Microsoft Research, Montréal
Xingdi (Eric) Yuan
Final project for machine learning (CSC 590). Detection of hepatitis C and progression through blood samples.

Hepatitis C Blood Based Detection Final project for machine learning (CSC 590). Dataset from Kaggle. Using data from previous hepatitis C blood panels

Jennefer Maldonado 1 Dec 28, 2021
Active learning for Mask R-CNN in Detectron2

MaskAL - Active learning for Mask R-CNN in Detectron2 Summary MaskAL is an active learning framework that automatically selects the most-informative i

49 Dec 20, 2022
NUANCED is a user-centric conversational recommendation dataset that contains 5.1k annotated dialogues and 26k high-quality user turns.

NUANCED: Natural Utterance Annotation for Nuanced Conversation with Estimated Distributions Overview NUANCED is a user-centric conversational recommen

Facebook Research 18 Dec 28, 2021
TensorFlow (Python) implementation of DeepTCN model for multivariate time series forecasting.

DeepTCN TensorFlow TensorFlow (Python) implementation of multivariate time series forecasting model introduced in Chen, Y., Kang, Y., Chen, Y., & Wang

Flavia Giammarino 21 Dec 19, 2022
[AAAI 2021] MVFNet: Multi-View Fusion Network for Efficient Video Recognition

MVFNet: Multi-View Fusion Network for Efficient Video Recognition (AAAI 2021) Overview We release the code of the MVFNet (Multi-View Fusion Network).

Wenhao Wu 114 Nov 27, 2022
A PyTorch Implementation of "SINE: Scalable Incomplete Network Embedding" (ICDM 2018).

Scalable Incomplete Network Embedding ⠀⠀ A PyTorch implementation of Scalable Incomplete Network Embedding (ICDM 2018). Abstract Attributed network em

Benedek Rozemberczki 69 Sep 22, 2022
View model summaries in PyTorch!

torchinfo (formerly torch-summary) Torchinfo provides information complementary to what is provided by print(your_model) in PyTorch, similar to Tensor

Tyler Yep 1.5k Jan 05, 2023
A PyTorch implementation of "Predict then Propagate: Graph Neural Networks meet Personalized PageRank" (ICLR 2019).

APPNP ⠀ A PyTorch implementation of Predict then Propagate: Graph Neural Networks meet Personalized PageRank (ICLR 2019). Abstract Neural message pass

Benedek Rozemberczki 329 Dec 30, 2022
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22) Paper Link | Project Page Abstract : Manual an

Mohamed Afham 152 Dec 23, 2022
Pytorch implementation of the DeepDream computer vision algorithm

deep-dream-in-pytorch Pytorch (https://github.com/pytorch/pytorch) implementation of the deep dream (https://en.wikipedia.org/wiki/DeepDream) computer

102 Dec 05, 2022
iNAS: Integral NAS for Device-Aware Salient Object Detection

iNAS: Integral NAS for Device-Aware Salient Object Detection Introduction Integral search design (jointly consider backbone/head structures, design/de

顾宇超 77 Dec 02, 2022
NVIDIA container runtime

nvidia-container-runtime A modified version of runc adding a custom pre-start hook to all containers. If environment variable NVIDIA_VISIBLE_DEVICES i

NVIDIA Corporation 938 Jan 06, 2023
[CVPR'21 Oral] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning [CVPR'21, Oral] By Zhicheng Huang*, Zhaoyang Zeng*, Yupan H

Multimedia Research 196 Dec 13, 2022
Original Implementation of Prompt Tuning from Lester, et al, 2021

Prompt Tuning This is the code to reproduce the experiments from the EMNLP 2021 paper "The Power of Scale for Parameter-Efficient Prompt Tuning" (Lest

Google Research 282 Dec 28, 2022
SymPy-powered, Wolfram|Alpha-like answer engine totally in your browser, without backend computation

SymPy Beta SymPy Beta is a fork of SymPy Gamma. The purpose of this project is to run a SymPy-powered, Wolfram|Alpha-like answer engine totally in you

Liumeo 25 Dec 21, 2022
Edge Restoration Quality Assessment

ERQA - Edge Restoration Quality Assessment ERQA - a full-reference quality metric designed to analyze how good image and video restoration methods (SR

MSU Video Group 27 Dec 17, 2022
Code for our CVPR 2021 Paper "Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes".

Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes (CVPR 2021) Project page | Paper | Colab | Colab for Drawing App Rethinking Style

CompVis Heidelberg 153 Jan 04, 2023
Algorithmic trading with deep learning experiments

Deep-Trading Algorithmic trading with deep learning experiments. Now released part one - simple time series forecasting. I plan to implement more soph

Alex Honchar 1.4k Jan 02, 2023
Code for Environment Dynamics Decomposition (ED2).

ED2 Code for Environment Dynamics Decomposition (ED2). Installation Follow the installation in MBPO and Dreamer. Usage First follow the SD2 method for

0 Aug 10, 2021
This project is based on RIFE and aims to make RIFE more practical for users by adding various features and design new models

CPM 项目描述 CPM(Chinese Pretrained Models)模型是北京智源人工智能研究院和清华大学发布的中文大规模预训练模型。官方发布了三种规模的模型,参数量分别为109M、334M、2.6B,用户需申请与通过审核,方可下载。 由于原项目需要考虑大模型的训练和使用,需要安装较为复杂

hzwer 190 Jan 08, 2023