Well-formed Limericks and Haikus with GPT2

📜 GPT-2 Rhyming Limerick and Haiku models using data augmentation

In collaboration with Matthew Korahais & Daniel Korsunsky

Abstract

We explore the capabilities and limits of GPT-2 in generating well-formed poems, specifically limericks and haikus. We hypothesized that GPT-2 trained without phonetic annotations would be unable to systematically learn and generate syllabic patterns and rhyme schemes, since these features are grounded in real-world acoustic representations. Our model trained with list-of-rhymes annotations outperformed baselines, generating perfect-scoring limericks 33% of the time. Our best haiku model generated valid haikus in 29% of cases, with an average syllable error rate of <0.4. Our work invites further research into methods of combining text and phonetic data for more convincing text generation.

Limericks Colab here ->

Haiku Colab Here ->

Evaluation Data Here: https://docs.google.com/spreadsheets/d/1rd1qCbCcTX1zHa0Dvh1q8OJ2iidxxrifTJlYWg3MMes

Examples (Find more in the repo):

Limericks

To the one grading our research, I'd say,
that a lot of work's been done today.
our paper's been checked,
And our work is all correct.
We're not mired in conjecture today.

The Indians' chief deity, they say,
Was a god of the earth all day.
But the gods he made
Were the ones who would fade
As they were replaced by a new way.

A large, thick, thick, and thickly cut tree
(A weeping cedar) will please me.
It's a tree that's known
As a cedar it's own,
And it's named for a bird that I see.
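
A rough sketch of how the rhyme scheme of a generated limerick could be checked against the AABBA pattern. This is not the project's evaluation code: the `RHYME_KEY` table, its key strings, and the function names are hypothetical, and a real check would look up line-final words in a pronunciation dictionary instead of a hand-written table. The project's "perfect-scoring" criterion may also cover more than rhyme.

```python
import re

# Hypothetical rhyme keys for the line-final words of the first example above.
# A real evaluation would derive these from a pronunciation dictionary.
RHYME_KEY = {
    "say": "EY",
    "today": "EY",
    "checked": "EHKT",
    "correct": "EHKT",
}


def last_word(line):
    """Return the final word of a line, lowercased and without punctuation."""
    words = re.findall(r"[a-z']+", line.lower())
    return words[-1] if words else ""


def rhyme_scheme(lines, keys):
    """Label each line by the rhyme key of its last word (A, B, ...).

    Lines whose last word is missing from `keys` are labeled '?'.
    """
    labels = {}
    scheme = []
    for line in lines:
        key = keys.get(last_word(line))
        if key is None:
            scheme.append("?")
            continue
        if key not in labels:
            labels[key] = chr(ord("A") + len(labels))
        scheme.append(labels[key])
    return "".join(scheme)


def is_limerick_scheme(lines, keys):
    """True when the five lines follow the AABBA pattern."""
    return rhyme_scheme(lines, keys) == "AABBA"


limerick = [
    "To the one grading our research, I'd say,",
    "that a lot of work's been done today.",
    "our paper's been checked,",
    "And our work is all correct.",
    "We're not mired in conjecture today.",
]
print(rhyme_scheme(limerick, RHYME_KEY))
```

Note that this sketch counts a repeated word ("today" / "today") as a rhyme; a stricter scorer might penalize that.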

Haiku

The only thing that
gets me going is you So
let's keep this going

Saw a duck come in
from the woods and now i know
what a duck is lol

the only thing I
wanna say to you is good
bye don't disappoint
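
A minimal sketch of a 5-7-5 validity check like the one the abstract refers to. The `SYLLABLES` table and the function names are hypothetical, and the per-line error used here (absolute difference from the target count) is an assumption; the project's own definition of syllable error rate may differ.

```python
import re

# Hypothetical syllable counts for the words of the first haiku above.
# A real evaluation would use a pronunciation dictionary or similar resource.
SYLLABLES = {
    "the": 1, "only": 2, "thing": 1, "that": 1,
    "gets": 1, "me": 1, "going": 2, "is": 1, "you": 1, "so": 1,
    "let's": 1, "keep": 1, "this": 1,
}

TARGET = (5, 7, 5)  # syllables per line in a haiku


def line_syllables(line, table):
    """Sum the syllable counts of every word in a line."""
    words = re.findall(r"[a-z']+", line.lower())
    return sum(table[w] for w in words)


def syllable_errors(lines, table, target=TARGET):
    """Absolute deviation from the target syllable count, per line."""
    if len(lines) != len(target):
        raise ValueError("expected %d lines, got %d" % (len(target), len(lines)))
    return [abs(line_syllables(l, table) - t) for l, t in zip(lines, target)]


def is_valid_haiku(lines, table):
    """True when there are three lines and each matches 5-7-5 exactly."""
    return len(lines) == 3 and all(e == 0 for e in syllable_errors(lines, table))


haiku = [
    "The only thing that",
    "gets me going is you So",
    "let's keep this going",
]
print(syllable_errors(haiku, SYLLABLES), is_valid_haiku(haiku, SYLLABLES))
```

Averaging the per-line errors over many generated poems gives one possible aggregate error figure under these assumptions.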

Owner

Bardia Shahrestani
