Few-shot Natural Language Generation for Task-Oriented Dialog

Related tags

Text Data & NLPSC-GPT
Overview

Few-shot Natural Language Generation for Task-Oriented Dialog

This repository contains the dataset, source code and trained model for the following paper:

Few-shot Natural Language Generation for Task-Oriented Dialog Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng and Jianfeng Gao

ArXiv paper: https://arxiv.org/abs/2002.12328

This repository is based on hugginface transformer package and OpenAI GPT-2, containing model training code and pretrained medium model checkpoint. Some evaluation scripts are adapted from RNNLG. The results indicate that with minimal training examples, SC-GPT is able to generate natural language response given dialog acts naturally and adequately. It can be used to train an NLG model in new domains with very limited examples.

The include scripts can be used to reproduce the results reported in the paper.

Project and demo webpage: https://aka.ms/scgpt

Dataset: FewShotWoz

FewShotWoz is constructed using dataset from RNNLG and MultiWoz.

Data files includes

{domain}/train.json: training set in json format used for evaluation, other package like RNNLG also need this format. {domain}/train.txt: linearized training set for GPT-2 models. {domain}/test.json: testing set in json format. {domain}/test.txt: linearized testing set for GPT-2 models.

Data format

[
"inform(name='hakka restaurant';pricerange=moderate)", 
"hakka restaurant is moderate -ly priced", 
"hakka restaurant is moderate -ly priced" 
]

First item: dialog act
Second item: corresponding natural language description
Thrid item: repeated for evaluation script

Linearized as:
inform ( name = hakka restaurant ; pricerange = moderate ) & hakka restaurant is moderate -ly priced

Pipeline

The code is still under cleanup. More details of code usage will be added soon

Setup

Please use the below command to clone and install the requirements.

git clone https://github.com/pengbaolin/SC-GPT.git
cd SC-GPT
pip install -r requirements.txt

Fetch and unzip the checkpoint

wget https://bapengstorage.blob.core.windows.net/fileshare/scgpt.tar.gz
tar -xvf scgpt.tar.gz

Training

export CUDA_VISIBLE_DEVICES=0
python train.py --output_dir=MODEL_SAVE_PATH --model_type=gpt2 --model_name_or_path=PRE_TRINED_MODEL_PATH --do_train --do_eval --eval_data_file=data/restaurant/train.txt --per_gpu_train_batch_size 1 --num_train_epochs EPOCH --learning_rate LR --overwrite_cache --use_tokenize --train_data_file=data/restaurant/train.txt --overwrite_output_dir

MODEL_SAVE_PATH : Path of the saving model .

PRE_TRAINED_MODEL_PATH : Initial checkpoint; Could start from gpt2, gpt2-meidum or our provided scgpt folder.

EPOCH : Number of training epochs; 5 is enough for a reasonable performance

LR : Learning rate; 5e-5, 1e-5, or 1e-4

Decoding

export CUDA_VISIBLE_DEVICES=0
python generate.py --model_type=gpt2 --model_name_or_path=MODEL_SAVE_PATH --num_samples 5 --input_file=data/restaurant/test.txt --top_k 5 --output_file=results.json --length 80

Evaluate

python evaluator.py --domain restaurant results.json

script for attraction/train/taxi will be provided soon

Interact

python interact.py --model_type=gpt2 --model_name_or_path=MODEL_SAVE_PATH --length 50 --num_samples 5

Try our demo

The live demo is at https://aka.ms/scgpt. Please refer the examples on top to input dialog acts.

Disclaimer

This repository aims to facilitate research in large-scale pretraining for NLG in the context of dialog systems. This toolkit contains only part of the modeling machinery needed to actually produce a model weight file in a running dialog. On its own, this model provides only information about the weights of various text spans; in order for a researcher to actually use it, they will need to bring conversational data of their own and decode the response generation from the pretrained system. Microsoft is not responsible for any generation from the 3rd party utilization of the pretrained system.

Citation

if you use this code and data in your research, please cite our arxiv paper:

@misc{peng2020scgpt,
      title={Few-shot Natural Language Generation for Task-Oriented Dialog},
      author={Baolin Peng, Chenguang Zhu, Chunyuan Li, Xiujun Li, Jinchao Li, Michael Zeng, Jianfeng Gao},
      archivePrefix={arXiv},
      year={2020},
      eprint={2002.12328},
      primaryClass={cs.CL}
}
Code for PED: DETR For (Crowd) Pedestrian Detection

Code for PED: DETR For (Crowd) Pedestrian Detection

36 Sep 13, 2022
This code extends the neural style transfer image processing technique to video by generating smooth transitions between several reference style images

Neural Style Transfer Transition Video Processing By Brycen Westgarth and Tristan Jogminas Description This code extends the neural style transfer ima

Brycen Westgarth 110 Jan 07, 2023
📔️ Generate a text-based journal from a template file.

JGen 📔️ Generate a text-based journal from a template file. Contents Getting Started Example Overview Usage Details Reserved Keywords Gotchas Getting

Harrison Broadbent 21 Sep 25, 2022
Diaformer: Automatic Diagnosis via Symptoms Sequence Generation

Diaformer Diaformer: Automatic Diagnosis via Symptoms Sequence Generation (AAAI 2022) Diaformer is an efficient model for automatic diagnosis via symp

Junying Chen 20 Dec 13, 2022
simpleT5 is built on top of PyTorch-lightning⚡️ and Transformers🤗 that lets you quickly train your T5 models.

Quickly train T5 models in just 3 lines of code + ONNX support simpleT5 is built on top of PyTorch-lightning ⚡️ and Transformers 🤗 that lets you quic

Shivanand Roy 220 Dec 30, 2022
Create a machine learning model which will predict if the mortgage will be approved or not based on 5 variables

Mortgage-Application-Analysis Create a machine learning model which will predict if the mortgage will be approved or not based on 5 variables: age, in

1 Jan 29, 2022
MEDIALpy: MEDIcal Abbreviations Lookup in Python

A small python package that allows the user to look up common medical abbreviations.

Aberystwyth Systems Biology 7 Nov 09, 2022
OpenChat: Opensource chatting framework for generative models

OpenChat is opensource chatting framework for generative models.

Hyunwoong Ko 427 Jan 06, 2023
Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation

Unsupervised Document Expansion for Information Retrieval with Stochastic Text Generation Official Code Repository for the paper "Unsupervised Documen

NLP*CL Laboratory 2 Oct 26, 2021
ConferencingSpeech2022; Non-intrusive Objective Speech Quality Assessment (NISQA) Challenge

ConferencingSpeech 2022 challenge This repository contains the datasets list and scripts required for the ConferencingSpeech 2022 challenge. For more

21 Dec 02, 2022
Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022)

SyntaxGen Syntax-aware Multi-spans Generation for Reading Comprehension (TASLP 2022) In this repo, we upload all the scripts for this work. Due to siz

Zhuosheng Zhang 3 Jun 13, 2022
Mednlp - Medical natural language parsing and utility library

Medical natural language parsing and utility library A natural language medical

Paul Landes 3 Aug 24, 2022
Data manipulation and transformation for audio signal processing, powered by PyTorch

torchaudio: an audio library for PyTorch The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the

1.9k Jan 08, 2023
Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

Code for the paper "A Simple but Tough-to-Beat Baseline for Sentence Embeddings".

1.1k Dec 27, 2022
Official source for spanish Language Models and resources made @ BSC-TEMU within the "Plan de las Tecnologías del Lenguaje" (Plan-TL).

Spanish Language Models 💃🏻 Corpora 📃 Corpora Number of documents Size (GB) BNE 201,080,084 570GB Models 🤖 RoBERTa-base BNE: https://huggingface.co

PlanTL-SANIDAD 203 Dec 20, 2022
A minimal Conformer ASR implementation adapted from ESPnet.

Conformer ASR A minimal Conformer ASR implementation adapted from ESPnet. Introduction I want to use the pre-trained English ASR model provided by ESP

Niu Zhe 3 Jan 24, 2022
A machine learning model for analyzing text for user sentiment and determine whether its a positive, neutral, or negative review.

Sentiment Analysis on Yelp's Dataset Author: Roberto Sanchez, Talent Path: D1 Group Docker Deployment: Deployment of this application can be found her

Roberto Sanchez 0 Aug 04, 2021
Conversational-AI-ChatBot - Intelligent ChatBot built with Microsoft's DialoGPT transformer to make conversations with human users!

Conversational AI ChatBot Intelligent ChatBot built with Microsoft's DialoGPT transformer to make conversations with human users! In this project? Thi

Rajkumar Lakshmanamoorthy 6 Nov 30, 2022
A linter to manage all your python exceptions and try/except blocks (limited only for those who like dinosaurs).

Manage your exceptions in Python like a PRO Currently in BETA. Inspired by this blog post. I shared the building process of this tool here. “For those

Guilherme Latrova 353 Dec 31, 2022
Pipeline for chemical image-to-text competition

BMS-Molecular-Translation Introduction This is a pipeline for Bristol-Myers Squibb – Molecular Translation by Vadim Timakin and Maksim Zhdanov. We got

Maksim Zhdanov 7 Sep 20, 2022