Few-shot Natural Language Generation for Task-Oriented Dialog

This repository contains the dataset, source code and trained model for the following paper:

Few-shot Natural Language Generation for Task-Oriented Dialog
Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng and Jianfeng Gao

arXiv paper: https://arxiv.org/abs/2002.12328

This repository is based on the Hugging Face Transformers package and OpenAI GPT-2, and contains the model training code and a pretrained medium-size model checkpoint. Some evaluation scripts are adapted from RNNLG. The results indicate that, given only a minimal number of training examples, SC-GPT generates natural and adequate responses for the input dialog acts, so it can be used to train an NLG model for new domains with very limited data.
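
For a quick sense of how the checkpoint is meant to be used, here is a minimal generation sketch (not the repository's own generate.py): it assumes the scgpt folder fetched in the Setup section below loads with the standard Hugging Face GPT-2 classes and takes a linearized dialog act as the prompt.

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# "scgpt" is the unpacked checkpoint folder; paths and format are illustrative
tokenizer = GPT2Tokenizer.from_pretrained("scgpt")
model = GPT2LMHeadModel.from_pretrained("scgpt")
model.eval()

# a linearized dialog act followed by the "&" separator used in the data files
prompt = "inform ( name = hakka restaurant ; pricerange = moderate ) &"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model.generate(input_ids, do_sample=True, top_k=5, max_length=80,
                             pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))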

The included scripts can be used to reproduce the results reported in the paper.

Project and demo webpage: https://aka.ms/scgpt

Dataset: FewShotWoz

FewShotWoz is constructed from the RNNLG and MultiWoz datasets.

Data files include:

{domain}/train.json: training set in JSON format, used for evaluation; other packages such as RNNLG also need this format.
{domain}/train.txt: linearized training set for GPT-2 models.
{domain}/test.json: test set in JSON format.
{domain}/test.txt: linearized test set for GPT-2 models.

Data format

[
"inform(name='hakka restaurant';pricerange=moderate)", 
"hakka restaurant is moderate -ly priced", 
"hakka restaurant is moderate -ly priced" 
]

First item: dialog act
Second item: corresponding natural language description
Third item: the same description repeated, as required by the evaluation script

Linearized as:
inform ( name = hakka restaurant ; pricerange = moderate ) & hakka restaurant is moderate -ly priced
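
A rough sketch of this linearization step (illustrative only; the released .txt files are already linearized, and the exact tokenization may differ from this approximation):

import re

def linearize(dialog_act, response):
    """Join a dialog-act string and its response into one GPT-2 training line."""
    act = dialog_act.replace("'", "")          # drop quotes around slot values
    act = re.sub(r"([()=;])", r" \1 ", act)    # space out punctuation
    act = re.sub(r"\s+", " ", act).strip()
    return f"{act} & {response}"

print(linearize("inform(name='hakka restaurant';pricerange=moderate)",
                "hakka restaurant is moderate -ly priced"))
# inform ( name = hakka restaurant ; pricerange = moderate ) & hakka restaurant is moderate -ly priced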

Pipeline

The code is still under cleanup; more details on usage will be added soon.

Setup

Please use the commands below to clone the repository and install the requirements.

git clone https://github.com/pengbaolin/SC-GPT.git
cd SC-GPT
pip install -r requirements.txt

Fetch and unzip the checkpoint

wget https://bapengstorage.blob.core.windows.net/fileshare/scgpt.tar.gz
tar -xvf scgpt.tar.gz

Training

export CUDA_VISIBLE_DEVICES=0
python train.py --output_dir=MODEL_SAVE_PATH --model_type=gpt2 --model_name_or_path=PRE_TRAINED_MODEL_PATH --do_train --do_eval --eval_data_file=data/restaurant/train.txt --per_gpu_train_batch_size 1 --num_train_epochs EPOCH --learning_rate LR --overwrite_cache --use_tokenize --train_data_file=data/restaurant/train.txt --overwrite_output_dir

MODEL_SAVE_PATH: path where the fine-tuned model will be saved.

PRE_TRAINED_MODEL_PATH: initial checkpoint; you can start from gpt2, gpt2-medium, or the provided scgpt folder.

EPOCH: number of training epochs; 5 is usually enough for reasonable performance.

LR: learning rate; try 5e-5, 1e-5, or 1e-4.
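
For example, a concrete invocation for the restaurant domain, starting from the provided scgpt checkpoint (the output path models/restaurant and the epoch/learning-rate values are illustrative choices):

export CUDA_VISIBLE_DEVICES=0
python train.py --output_dir=models/restaurant --model_type=gpt2 --model_name_or_path=scgpt --do_train --do_eval --eval_data_file=data/restaurant/train.txt --per_gpu_train_batch_size 1 --num_train_epochs 5 --learning_rate 5e-5 --overwrite_cache --use_tokenize --train_data_file=data/restaurant/train.txt --overwrite_output_dir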

Decoding

export CUDA_VISIBLE_DEVICES=0
python generate.py --model_type=gpt2 --model_name_or_path=MODEL_SAVE_PATH --num_samples 5 --input_file=data/restaurant/test.txt --top_k 5 --output_file=results.json --length 80
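
To take a quick look at the decoded outputs (the exact layout of results.json is determined by generate.py, so treat this as a sketch and adjust it to the actual structure):

import json

# print the first entry, whatever shape generate.py wrote
with open("results.json") as f:
    results = json.load(f)
print(results[0] if isinstance(results, list) else results)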

Evaluate

python evaluator.py --domain restaurant results.json

Evaluation scripts for the attraction/train/taxi domains will be provided soon.

Interact

python interact.py --model_type=gpt2 --model_name_or_path=MODEL_SAVE_PATH --length 50 --num_samples 5

Try our demo

The live demo is at https://aka.ms/scgpt. Please refer to the examples at the top of the page for how to format input dialog acts.

Disclaimer

This repository aims to facilitate research in large-scale pretraining for NLG in the context of dialog systems. This toolkit contains only part of the modeling machinery needed to actually produce a model weight file in a running dialog. On its own, this model provides only information about the weights of various text spans; in order for a researcher to actually use it, they will need to bring conversational data of their own and decode the response generation from the pretrained system. Microsoft is not responsible for any generation from third-party utilization of the pretrained system.

Citation

If you use this code and data in your research, please cite our arXiv paper:

@misc{peng2020scgpt,
      title={Few-shot Natural Language Generation for Task-Oriented Dialog},
      author={Baolin Peng and Chenguang Zhu and Chunyuan Li and Xiujun Li and Jinchao Li and Michael Zeng and Jianfeng Gao},
      archivePrefix={arXiv},
      year={2020},
      eprint={2002.12328},
      primaryClass={cs.CL}
}