NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation (ACL-IJCNLP 2021)

Overview

NeuralWOZ

This code is official implementation of "NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation".

Sungdong Kim, Minsuk Chang, Sang-woo Lee
In ACL 2021.

Citation

@inproceedings{kim2021neuralwoz,
  title={NeuralWOZ: Learning to Collect Task-Oriented Dialogue via Model-based Simulation},
  author={Kim, Sungdong and Chang, Minsuk and Lee, Sang-woo},
  booktitle={ACL},
  year={2021}
}

Requirements

python3.6
torch==1.4.0
transformers==2.11.0

Please install apex for the mixed precision training.
See details in requirements.txt

Data Download and Preprocessing

1. Download dataset

Please run this script at first. It will create data repository, and save and preprocess MultiWOZ 2.1 dataset.

python3 create_data.py

2. Preprocessing

To train NeuralWOZ under various settings, you should create each training instances with running below script.

python3 neuralwoz/preprocess.py --exceptd $TARGET_DOMAIN --fewshot_ratio $FEWSHOT_RATIO
  • exceptd: Specify "target domain" to exclude from training dataset for leave-one-out scheme. It is one of the (hotel|restaurant|attraction|train|taxi).
  • fewshot_ratio: Choose proportion of examples in the target domain to include. Default is 0. which means zero-shot. It is one of the (0.|0.01|0.05|0.1). You can check the fewshot examples in the assets/fewshot_key.json.

This script will create "$TARGET_DOMAIN_$FEWSHOT_RATIO_collector_(train|dev).json" and "$TARGET_DOMAIN_$FEWSHOT_RATIO_labeler_train.h5".

Training NeuralWOZ

You should specify output_path to save the trained model.
Each output consists of the below four files after the training.

  • pytorch_model.bin
  • config.json
  • vocab.json
  • merges.txt

For each zero/few-shot settings, you should set the TRAIN_DATA and DEV_DATA from the preprocessing. For example, hotel_0.0_collector_(train|dev).json should be used for the Collector training when the target domain is hotel in the zero-shot domain transfer task.

We use N_GPU=4 and N_ACCUM=2 for Collector training and N_GPU=2 and N_ACCUM=2 for Labeler training to fit 32 for batch size based on V100 32GB GPU.

1. Collector

python3 neuralwoz/train_collector.py \
  --dataset_dir data \
  --output_path $OUTPUT_PATH \
  --model_name_or_path facebook/bart-large \
  --train_data $TRAIN_DATA \
  --dev_data $DEV_DATA \
  --n_gpu $N_GPU \
  --per_gpu_train_batch_size 4 \
  --num_train_epochs 30 \
  --learning_rate 1e-5 \
  --gradient_accumulation_steps $N_ACCUM \
  --warmup_steps 1000 \
  --fp16

2. Labeler

python3 neuralwoz/train_labeler.py \
  --dataset_dir data \
  --output_path $OUTPUT_PATH \
  --model_name_or_path roberta-base-dream \
  --train_data $TRAIN_DATA \
  --dev_data labeler_dev_data.json \
  --n_gpu $N_GPU \
  --per_gpu_train_batch_size 8 \
  --num_train_epochs 10 \
  --learning_rate 1e-5 \
  --gradient_accumulation_steps $N_ACCUM \
  --warmup_steps 1000 \
  --beta 5. \
  --fp16

Download Synthetic Dialogues from NeuralWOZ

Please download synthetic dialogues from here

  • The naming convention is nwoz_{target_domain}_{fewshot_proportion}.json
  • Each dataset contains synthesized dialogues from our NeuralWOZ
  • Specifically, It contains synthetic dialogues for the target_domain while excluding original dialogues for the target domain (leave-one-out setup)
  • You can check the i-th synthesized dialogue in each files with aug_{target_domain}_{fewshot_proprotion}_{i} for dialogue_idx key.
  • You can use the json file to directly train zero/few-shot learner for DST task
  • Please see readme for training TRADE and readme for training SUMBT using the dataset
  • If you want to synthesize your own dialogues, please see below sections.

Download Pretrained Models

Pretrained models are available in this link. The naming convention is like below

  • NEURALWOZ: (Collector|Labeler)_{target_domain}_{fewshot_proportion}.tar.gz
  • TRADE: nwoz_TRADE_{target_domain}_{fewshot_proportion}.tar.gz
  • SUMBT: nwoz_SUMBT_{target_domain}_{fewshot_proportion}.tar.gz

To synthesize your own dialogues, please download and unzip both of Collector and Labeler in same target domain and fewshot_proportion at $COLLECTOR_PATH and $LABELER_PATH, repectively.

Please use tar -zxvf MODEL.tar.gz for the unzipping.

Generate Synthetic Dialogues using NeuralWOZ

python3 neuralwoz/run_neuralwoz.py \
  --dataset_dir data \
  --output_dir data \
  --output_file_name neuralwoz-output.json \
  --target_data collector_dev_data.json \
  --include_domain $TARGET_DOMAIN \
  --collector_path $COLLECTOR_PATH \
  --labeler_path $LABELER_PATH \
  --num_dialogues $NUM_DIALOGUES \
  --batch_size 16 \
  --num_beams 1 \
  --top_k 0 \
  --top_p 0.98 \
  --temperature 0.9 \
  --include_missing_dontcare

License

Copyright 2021-present NAVER Corp.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Owner
NAVER AI
Official account of NAVER AI, Korea No.1 Industrial AI Research Group
NAVER AI
Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning

Beta Shapley: a Unified and Noise-reduced Data Valuation Framework for Machine Learning This repository provides an implementation of the paper Beta S

Yongchan Kwon 28 Nov 10, 2022
An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters

CNN-Filter-DB An Empirical Investigation of Model-to-Model Distribution Shifts in Trained Convolutional Filters Paul Gavrikov, Janis Keuper Paper: htt

Paul Gavrikov 18 Dec 30, 2022
Split your patch similarly to `git add -p` but supporting multiple buckets

split-patch.py This is git add -p on steroids for patches. Given a my.patch you can run ./split-patch.py my.patch You can choose in which bucket to p

102 Oct 06, 2022
Time Series Cross-Validation -- an extension for scikit-learn

TSCV: Time Series Cross-Validation This repository is a scikit-learn extension for time series cross-validation. It introduces gaps between the traini

Wenjie Zheng 222 Jan 01, 2023
YOLOV4运行在嵌入式设备上

在嵌入式设备上实现YOLO V4 tiny 在嵌入式设备上实现YOLO V4 tiny 目录结构 目录结构 |-- YOLO V4 tiny |-- .gitignore |-- LICENSE |-- README.md |-- test.txt |-- t

Liu-Wei 6 Sep 09, 2021
Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective

Unofficial pytorch implementation of the paper "Unsupervised Real-World Super-Resolution: A Domain Adaptation Perspective"

16 Nov 21, 2022
Huawei Hackathon 2021 - Sweden (Stockholm)

huawei-hackathon-2021 Contributors DrakeAxelrod Challenge Requirements: python=3.8.10 Standard libraries (no importing) Important factors: Data depend

Drake Axelrod 32 Nov 08, 2022
Learning What and Where to Draw

###Learning What and Where to Draw Scott Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee This is the code for our NIPS 201

Scott Ellison Reed 337 Nov 18, 2022
PyTorch3D is FAIR's library of reusable components for deep learning with 3D data

Introduction PyTorch3D provides efficient, reusable components for 3D Computer Vision research with PyTorch. Key features include: Data structure for

Facebook Research 6.8k Jan 01, 2023
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"

Hierarchical Token Semantic Audio Transformer Introduction The Code Repository for "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound

Knut(Ke) Chen 134 Jan 01, 2023
MMdnn is a set of tools to help users inter-operate among different deep learning frameworks. E.g. model conversion and visualization. Convert models between Caffe, Keras, MXNet, Tensorflow, CNTK, PyTorch Onnx and CoreML.

MMdnn MMdnn is a comprehensive and cross-framework tool to convert, visualize and diagnose deep learning (DL) models. The "MM" stands for model manage

Microsoft 5.7k Jan 09, 2023
Reaction SMILES-AA mapping via language modelling

rxn-aa-mapper Reactions SMILES-AA sequence mapping setup conda env create -f conda.yml conda activate rxn_aa_mapper In the following we consider on ex

16 Dec 13, 2022
Bayesian optimisation library developped by Huawei Noah's Ark Library

Bayesian Optimisation Research This directory contains official implementations for Bayesian optimisation works developped by Huawei R&D, Noah's Ark L

HUAWEI Noah's Ark Lab 395 Dec 30, 2022
A flexible and extensible framework for gait recognition.

A flexible and extensible framework for gait recognition. You can focus on designing your own models and comparing with state-of-the-arts easily with the help of OpenGait.

Shiqi Yu 335 Dec 22, 2022
Perform zero-order Hankel Transform for an 1D array (float or real valued).

perform zero-order Hankel Transform for an 1D array (float or real valued). An discrete form of Parseval theorem is guaranteed. Suit for iterative problems.

1 Jan 17, 2022
(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

(Preprint) Official PyTorch implementation of "How Do Vision Transformers Work?"

xxxnell 656 Dec 30, 2022
The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop.

AICITY2021_Track2_DMT The 1st place solution of track2 (Vehicle Re-Identification) in the NVIDIA AI City Challenge at CVPR 2021 Workshop. Introduction

Hao Luo 91 Dec 21, 2022
Code for weakly supervised segmentation of a single class

SingleClassRL Implementation of weak single object segmentation from paper "Regularized Loss for Weakly Supervised Single Class Semantic Segmentation"

16 Nov 14, 2022
Motion planning environment for Sampling-based Planners

Sampling-Based Motion Planners' Testing Environment Sampling-based motion planners' testing environment (sbp-env) is a full feature framework to quick

Soraxas 23 Aug 23, 2022
😮The official implementation of "CoNeRF: Controllable Neural Radiance Fields" 😮

CoNeRF: Controllable Neural Radiance Fields This is the official implementation for "CoNeRF: Controllable Neural Radiance Fields" Project Page Paper V

Kacper Kania 61 Dec 24, 2022