A Transformer Implementation that is easy to understand and customizable.

Last update: Jan 20, 2022

Overview

Simple Transformer

I've written a series of articles on the transformer architecture and language models on Medium.

This repository contains an implementation of the Transformer architecture presented in the paper Attention Is All You Need by Ashish Vaswani et al.

My goal is to write an implementation that is easy to understand and that digs into the nitty-gritty details, which is where the devil is.

Python environment

You can use any Python virtual environment tool, such as venv or conda.

For example, with venv:

python3 -m venv venv
source venv/bin/activate

pip install --upgrade pip
pip install -e .

Spacy Tokenizer Data Preparation

To use spaCy's tokenizer, first download the language models you need.

For example, the English and German tokenizers can be downloaded as follows:

python -m spacy download en_core_web_sm
python -m spacy download de_core_news_sm

Text Data from Torchtext

This project uses text datasets from Torchtext.

from torchtext import datasets

The default configuration uses the Multi30k dataset.

Training

python train.py config_path

The default config path is config/config.yaml.

You can also resume training from a checkpoint:

python train.py --checkpoint_path runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt
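
When a run folder holds several checkpoints, a small helper can pick the one to resume from. The sketch below is not part of this repository. It assumes that checkpoint files follow the pattern checkpoint-<epoch>-<loss>.pt, which is only inferred from the example path above, and the name latest_checkpoint is hypothetical.

```python
import re
from pathlib import Path
from typing import Optional

# Assumed naming pattern, inferred from the example path: checkpoint-<epoch>-<loss>.pt
CHECKPOINT_RE = re.compile(r"checkpoint-(\d+)-(\d+\.\d+)\.pt$")


def latest_checkpoint(run_dir: str) -> Optional[Path]:
    """Return the checkpoint with the highest epoch number in run_dir, or None."""
    best = None
    best_epoch = -1
    for path in Path(run_dir).glob("checkpoint-*.pt"):
        match = CHECKPOINT_RE.search(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed pattern
        epoch = int(match.group(1))
        if epoch > best_epoch:
            best, best_epoch = path, epoch
    return best
```

The returned path can then be passed to train.py through --checkpoint_path.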

You can run TensorBoard to monitor training progress.

tensorboard --logdir=runs

The logs are created under runs.

Test

python test.py checkpoint_path

For example:

python test.py runs/20220108-164720-Multi30k-Transformer/checkpoint-010-2.3343.pt

When training starts, config.yaml is copied to the model folder. test.py expects to find that config YAML file there.
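
As an illustration of that convention, a script could locate the config from the checkpoint path alone. This is a minimal sketch, not the code used by test.py. It assumes the copied config keeps a .yaml extension and sits in the same folder as the checkpoint, and the name find_config is hypothetical.

```python
from pathlib import Path


def find_config(checkpoint_path: str) -> Path:
    """Return the YAML config stored in the same folder as the checkpoint."""
    model_dir = Path(checkpoint_path).parent
    # Assumption: the config copied at training start is a *.yaml file in model_dir.
    candidates = sorted(model_dir.glob("*.yaml"))
    if not candidates:
        raise FileNotFoundError(f"No YAML config found in {model_dir}")
    return candidates[0]
```

Raising FileNotFoundError makes a missing config fail early with a clear message, before any model weights are loaded.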

Unit tests

There are some unit tests in the tests folder.

pytest tests


Owner

Naoki Shibuya
