Natural language processing summarizer using three state-of-the-art Transformer models: BERT, GPT2, and T5

Last update: Feb 07, 2022

Overview

NLP-Summarizer

This project aimed to explain the current limitations of Natural Language Processing (NLP) models by exploring the Transformer, the latest state-of-the-art NLP architecture, and to discuss possible use cases for such tools in domestic and workplace environments. It gives an in-depth explanation of the architecture, the limitations it aims to overcome, and how it can be applied to various tasks. It also explores numerous NLP use cases and how tools such as this one can have a significant impact, both at home and in the workplace. Three specific Transformer models were implemented behind a GUI to evaluate their effectiveness. The final artefact lets a user interact with the models to perform document summarisation with variable output lengths.
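As a rough illustration of summarisation with a variable output length, the sketch below uses the Hugging Face `transformers` summarisation pipeline. It is not taken from this project's code: the helper names `length_bounds` and `summarise`, the `ratio` parameter, and the `t5-small` checkpoint are all assumptions chosen for the example. Note that the pipeline's `min_length` and `max_length` count tokens rather than words, so the word-based bounds here are only approximate, and inputs longer than the model's limit are truncated.

```python
from typing import Tuple


def length_bounds(text: str, ratio: float = 0.2, floor: int = 10) -> Tuple[int, int]:
    """Return (min_length, max_length) for a summary, scaled to the input size.

    Hypothetical helper: `ratio` is the desired summary size as a fraction of
    the input word count, and `floor` keeps very short inputs usable.
    """
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    n_words = len(text.split())
    max_len = max(floor, int(n_words * ratio))
    min_len = max(1, max_len // 2)
    return min_len, max_len


def summarise(text: str, ratio: float = 0.2, model_name: str = "t5-small") -> str:
    """Summarise `text` with a pre-trained model (assumed checkpoint: t5-small)."""
    # Imported lazily so length_bounds works without transformers installed.
    from transformers import pipeline

    min_len, max_len = length_bounds(text, ratio)
    summariser = pipeline("summarization", model=model_name)
    result = summariser(
        text,
        min_length=min_len,
        max_length=max_len,
        do_sample=False,   # deterministic output
        truncation=True,   # cut inputs that exceed the model's limit
    )
    return result[0]["summary_text"]
```

Calling `summarise(document, ratio=0.1)` would request a shorter summary than `ratio=0.3`; the first call downloads the model weights.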

Working Example

The following example was created using another student's project introduction; the original word count was ~1000.

Initial GUI

After Summarization

Getting Started

All code was run using Python version 3.8.8.
Running the artefact in its entirety requires ~20 GB of available disk space for downloading the pre-trained models.

!pip install transformers
!pip install spacy==2.0.12
!pip install torch
!pip install tk
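Before downloading the models, it may help to confirm that the environment matches the requirements above. The following is a minimal sketch using only the standard library; the function names `free_space_gb` and `preflight` are hypothetical and not part of this project, and the check should point at whichever drive holds the model cache.

```python
import shutil
import sys

REQUIRED_GB = 20  # approximate figure stated in this README


def free_space_gb(path: str = ".") -> float:
    """Free disk space, in GiB, on the drive containing `path`."""
    return shutil.disk_usage(path).free / 1024 ** 3


def preflight(path: str = ".", required_gb: float = REQUIRED_GB) -> bool:
    """Return True if there appears to be enough free space for the models."""
    if sys.version_info[:2] != (3, 8):
        # Only a notice: other Python versions are untested, not known to fail.
        print(f"Note: tested on Python 3.8.8, running {sys.version.split()[0]}")
    available = free_space_gb(path)
    if available < required_gb:
        print(f"Only {available:.1f} GB free; about {required_gb} GB is needed.")
        return False
    return True
```

Running `preflight()` once before the first launch gives an early warning rather than a failed download partway through.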

The runtime of each summarisation is displayed as output in the console.
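One way such a runtime report could be produced is sketched below. This is an assumption about the general approach, not the project's actual code: `timed_summary` is a hypothetical wrapper, and `first_sentence` is a stand-in for a real model call so the example runs without any downloads.

```python
import time
from typing import Callable, Tuple


def timed_summary(summarise_fn: Callable[[str], str], text: str) -> Tuple[str, float]:
    """Run `summarise_fn` on `text`, print the elapsed time, and return both."""
    start = time.perf_counter()
    summary = summarise_fn(text)
    elapsed = time.perf_counter() - start
    print(f"Runtime: {elapsed:.2f} s")
    return summary, elapsed


def first_sentence(text: str) -> str:
    """Placeholder summariser: returns only the first sentence of the input."""
    return text.split(".")[0].strip() + "."


summary, seconds = timed_summary(first_sentence, "Hello world. More text follows.")
```

Any function that takes a string and returns a string can be passed in place of `first_sentence`, which makes it straightforward to compare the runtime of different models.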

Owner

Samuel Sharkey
