Honors thesis project analyzing whether the GPT-2 model can more effectively generate free-verse or structured poetry.

Last update: Jan 09, 2022

gpt2-poetry

The following code is for my senior honors thesis project, under the guidance of Dr. Keith Holyoak at the University of California, Los Angeles.

I am currently analyzing whether the GPT-2 model can more effectively generate free-verse or structured poetry. The code originated from "Language Models are Unsupervised Multitask Learners" by Radford et al. (paper at this link: https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf). I use the GPT-2 architecture to generate poetry after training on two different corpora: a corpus of sonnets (fourteen-line, rhymed poems) and a corpus of free-verse poems of ten to eighteen lines selected from Poetry Magazine's issues from January 2012 to December 2021. I plan to compare the quality of the generated poems to randomly selected human-written poems from each of the training sets through a participant survey on the different characteristics of poetry.
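As a rough illustration of the selection criteria described above, the sketch below filters poems by line count and draws a random sample of human-written poems for comparison. It is not the code used in this project: the function names (`count_lines`, `select_by_length`, `sample_human_poems`), the fixed seed, and the choice to ignore blank lines (stanza breaks) when counting are all assumptions made for the example.

```python
import random


def count_lines(poem: str) -> int:
    """Count non-blank lines. Assumption: stanza breaks do not count as lines."""
    return sum(1 for line in poem.splitlines() if line.strip())


def select_by_length(poems, min_lines, max_lines):
    """Keep poems whose line count falls within [min_lines, max_lines]."""
    return [p for p in poems if min_lines <= count_lines(p) <= max_lines]


def sample_human_poems(poems, k, seed=0):
    """Draw k distinct poems at random; the seed is fixed only for repeatability."""
    rng = random.Random(seed)
    return rng.sample(poems, k)


# Hypothetical usage: free verse of 10 to 18 lines, sonnets of exactly 14 lines.
# free_verse = select_by_length(all_free_verse, 10, 18)
# sonnets = select_by_length(all_sonnets, 14, 14)
# survey_poems = sample_human_poems(free_verse, k=5)
```

Counting only non-blank lines keeps a poem with stanza breaks from being excluded for its layout rather than its length; a different counting rule would need a different `count_lines`.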

To run: install Python 3.9.8, as well as the following modules: Fire 0.1.3, Regex 2017.4.5, Requests 2.21.0, tqdm 4.31.1, and toposort 1.5.
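One way to confirm that an environment matches the versions listed above is a small check script. The following is a minimal sketch, not part of this repository: the helper name `find_mismatches` is hypothetical, the lowercase distribution names are assumed to match the names on PyPI, and versions are compared as exact strings, so a version reported in an equivalent but differently formatted form would be flagged.

```python
from importlib import metadata

# Pins copied from the install note above; distribution names are assumed.
PINNED = {
    "fire": "0.1.3",
    "regex": "2017.4.5",
    "requests": "2.21.0",
    "tqdm": "4.31.1",
    "toposort": "1.5",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {name: (wanted, found)} for packages that are missing or differ."""
    mismatches = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

Running the script prints one line per package that is missing or does not match; no output means every listed pin was found as written.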

This project is in progress, and only the free-verse portion of the data is currently uploaded to GitHub. The sonnets generated by the GPT-2 model will be uploaded soon!

Last updated: 1/5/2021

Owner

Ashley Kim

This project is based on NLTK. It picks a random word from a predefined list of words and reads out the word, its meaning with part of speech, its antonyms, and its synonyms.

Course project of [email protected]

MEDIALpy: MEDIcal Abbreviations Lookup in Python

Train 🤗transformers with DeepSpeed: ZeRO-2, ZeRO-3

Implementation of TF-IDF algorithm to find documents similarity with cosine similarity

Built for cleaning purposes in military institutions

Code for EmBERT, a transformer model for embodied, language-guided visual task completion.

Index different CKAN entities in Solr, not just datasets

CPC-big and k-means clustering for zero-resource speech processing

BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters allowing for easily interpretable topics whilst keeping important words in the topic descriptions

This repository contains the code for "Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference"

BookNLP, a natural language processing pipeline for books

Twitter-Sentiment-Analysis - Analysis of twitter posts' positive and negative score.

Textlesslib - Library for Textless Spoken Language Processing

A simple Streamlit App to classify swahili news into different categories.

Official PyTorch implementation of SegFormer

Code for our paper "Transfer Learning for Sequence Generation: from Single-source to Multi-source" in ACL 2021.

Code to use Augmented Shapiro Wilks Stopping, as well as code for the paper "Statistically Significant Stopping of Neural Network Training"

Blackstone is a spaCy model and library for processing long-form, unstructured legal text

Python library for processing Chinese text