This repo contains the code and data used in the paper "Wizard of Search Engine: Access to Information Through Conversations with Search Engines"

Overview

Wizard of Search Engine: Access to Information Through Conversations with Search Engines

by Pengjie Ren, Zhongkun Liu, Xiaomeng Song, Hongtao Tian, Zhumin Chen, Zhaochun Ren and Maarten de Rijke

@inproceedings{ren2021wizard,
title={Wizard of Search Engine: Access to Information Through Conversations with Search Engines},
author={Ren, Pengjie and Liu, Zhongkun and Song, Xiaomeng and Tian, Hongtao and Chen, Zhumin and Ren, Zhaochun and de Rijke, Maarten},
booktitle={Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year={2021}
}

Paper summary

task
Task pipeline for conversational information seeking (CIS)
model
Model pipeline for conversational information seeking (CIS)

In this work, we make efforts to facilitate research on conversational information seeking (CIS) from three angles: (1) We formulate a pipeline for CIS with six sub-tasks: intent detection, keyphrase extraction, action prediction, query selection, passage selection, and response generation. (2) We release a benchmark dataset, called wizard of search engine(WISE), which allows for comprehensive and in-depth research on all aspects of CIS. (3) We design a neural architecture capable of training and evaluating both jointly and separately on the six sub-tasks, and devise a pre-train/fine-tune learning scheme, that can reduce the requirements of WISE in scale by making full use of available data.

Running experiments

Requirements

This code is written in PyTorch. Any version later than 1.6 is expected to work with the provided code. Please refer to the official website for an installation guide.

We recommend to use conda for installing the requirements. If you haven't installed conda yet, you can find instructions here. The steps for installing the requirements are:

  • Create a new environment

    conda create env -n WISE
    

    In the environment, a python version >3.6 should be used.

  • Activate the environment

    conda activate WISE
    
  • Install the requirements within the environment via pip:

    pip install -r requirements.txt
    

Datasets

We use WebQA, DuReader, KdConv and DuConv datasets for pretraining. You can get them from the provided links and put them in the corresponding folders in ./data/. For example, WebQA datasets should be put in ./data/WebQA, and DuReader datasets in ./data/Dureader and so on. We use the WISE dataset to fine-tune the model, and this dataset is available in ./data/WISE. Details about the WISE dataset can be found here.

Training

  • Run the following scripts to automatically process the pretraining datasets into the required format:
python ./Run.py --mode='data'
  • Run the following scripts sequentially:
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='pretrain'
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='finetune'

Note that you should select the appropriate pretrain models from the folder ./output/pretrained, and put them into ./output/pretrained_ready which is newly created by yourself before finetuning. The hyperparameters are set to the default values used in our experiments. To see an overview of all hyperparameters, please refer to ./Run.py.

Evaluating

  • Run the following scripts:
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='infer-valid'
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='eval-valid'
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='infer-test'
python -m torch.distributed.launch --nproc_per_node=4 ./Run.py --mode='eval-test'
Code for the paper "Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness"

DU-VAE This is the pytorch implementation of the paper "Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness" Acknowledgement

Dazhong Shen 4 Oct 19, 2022
Simulator for FRC 2022 challenge: Rapid React

rrsim Simulator for FRC 2022 challenge: Rapid React out-1.mp4 Usage In order to run the simulator use the following: python3 rrsim.py [config_path] wh

1 Jan 18, 2022
PyTorch implementation of ENet

PyTorch-ENet PyTorch (v1.1.0) implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from the lua-torc

David Silva 333 Dec 29, 2022
Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021.

NL-CSNet-Pytorch Pytorch code for paper "Image Compressed Sensing Using Non-local Neural Network" TMM 2021. Note: this repo only shows the strategy of

WenxueCui 7 Nov 07, 2022
EfficientNetv2 TensorRT int8

EfficientNetv2_TensorRT_int8 EfficientNetv2模型实现来自https://github.com/d-li14/efficientnetv2.pytorch 环境配置 ubuntu:18.04 cuda:11.0 cudnn:8.0 tensorrt:7

34 Apr 24, 2022
Malware Bypass Research using Reinforcement Learning

Malware Bypass Research using Reinforcement Learning

Bobby Filar 76 Dec 26, 2022
Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition

Efficient Conformer: Progressive Downsampling and Grouped Attention for Automatic Speech Recognition Official implementation of the Efficient Conforme

Maxime Burchi 145 Dec 30, 2022
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

Phil Wang 12.6k Jan 09, 2023
An Active Automata Learning Library Written in Python

AALpy An Active Automata Learning Library AALpy is a light-weight active automata learning library written in pure Python. You can start learning auto

TU Graz - SAL Dependable Embedded Systems Lab (DES Lab) 78 Dec 30, 2022
Implementation of ProteinBERT in Pytorch

ProteinBERT - Pytorch (wip) Implementation of ProteinBERT in Pytorch. Original Repository Install $ pip install protein-bert-pytorch Usage import torc

Phil Wang 92 Dec 25, 2022
NBEATSx: Neural basis expansion analysis with exogenous variables

NBEATSx: Neural basis expansion analysis with exogenous variables We extend the NBEATS model to incorporate exogenous factors. The resulting method, c

Cristian Challu 100 Dec 31, 2022
Deeper insights into graph convolutional networks for semi-supervised learning

deeper_insights_into_GCNs Deeper insights into graph convolutional networks for semi-supervised learning References data and utils.py come from Implem

Davidham3 17 Dec 16, 2022
Code for Greedy Gradient Ensemble for Visual Question Answering (ICCV 2021, Oral)

Greedy Gradient Ensemble for De-biased VQA Code release for "Greedy Gradient Ensemble for Robust Visual Question Answering" (ICCV 2021, Oral). GGE can

21 Jun 29, 2022
Direct design of biquad filter cascades with deep learning by sampling random polynomials.

IIRNet Direct design of biquad filter cascades with deep learning by sampling random polynomials. Usage git clone https://github.com/csteinmetz1/IIRNe

Christian J. Steinmetz 55 Nov 02, 2022
Explainer for black box models that predict molecule properties

Explaining why that molecule exmol is a package to explain black-box predictions of molecules. The package uses model agnostic explanations to help us

White Laboratory 172 Dec 19, 2022
The official codes for the ICCV2021 presentation "Uniformity in Heterogeneity: Diving Deep into Count Interval Partition for Crowd Counting"

UEPNet (ICCV2021 Poster Presentation) This repository contains codes for the official implementation in PyTorch of UEPNet as described in Uniformity i

Tencent YouTu Research 15 Dec 14, 2022
Manim is an engine for precise programmatic animations, designed for creating explanatory math videos

Manim is an engine for precise programmatic animations, designed for creating explanatory math videos. Note, there are two versions of manim. This rep

Grant Sanderson 49k Jan 09, 2023
Source code of our BMVC 2021 paper: AniFormer: Data-driven 3D Animation with Transformer

AniFormer This is the PyTorch implementation of our BMVC 2021 paper AniFormer: Data-driven 3D Animation with Transformer. Haoyu Chen, Hao Tang, Nicu S

24 Nov 02, 2022
A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

A minimal yet resourceful implementation of diffusion models (along with pretrained models + synthetic images for nine datasets)

Vikash Sehwag 65 Dec 19, 2022
A 2D Visual Localization Framework based on Essential Matrices [ICRA2020]

A 2D Visual Localization Framework based on Essential Matrices This repository provides implementation of our paper accepted at ICRA: To Learn or Not

Qunjie Zhou 27 Nov 07, 2022