Neural machine translation between the writings of Shakespeare and modern English using TensorFlow

Overview

Shakespeare translations using TensorFlow

This is an example of using the new Google's TensorFlow library on monolingual translation going from modern English to Shakespeare based on research from Wei Xu.

Prepare

First download the TensorFlow library depending on your platform:

pip install https://storage.googleapis.com/tensorflow/mac/tensorflow-0.5.0-py2-none-any.whl # for mac
pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.5.0-cp27-none-linux_x86_64.whl # for ubuntu
  1. Grabs parallel data.
  2. Gets train, dev split.
  3. Builds vocabulary
  4. Converts parallel data into ids

From the root directory:

python -m tensorshake.get_data
python -m tensorshake.prepare_corpus

Delete /cache to start anew.

Train

Use the example BASH script to train the model. This saves the check points in the --train_dir directory. If you run it again, the training process continues from the check point. To restart with fresh parameters, simply delete/rename the check points.

./run.sh

Results

Benchmarks from original paper. (Shakespeare -> Modern English)

Input Output
i will bite thee by the ear for that jest . i ’ ll bite you by the ear for that joke .
what further woe conspires against mine age ? what ’ s true despair conspires against my old age ?
how doth my lady ? how is my lady ?
hast thou slain tybalt ? have you killed tybalt ?
an i might live to see thee married once , i have my wish . if i could live to see you married, i ’ ve my wish .
benvolio , who began this bloody fray ? benvolio , who started this bloody fight itself ?
what is your will ? what do you want ?
call her forth to me . bring her out to me .

Cherrypicked examples from this repo (Modern English -> Shakespeare)

Input Output
but you’re not listening to me. but you do not hear me .
Gregory, on my word, we will not be humiliated, like carrying coal. regory , we 'll not carry coals .
but he got the promotion. he is the friend .
i can hit quickly, if i'm motivated. i strike , i am moved .
Did you just give us the finger, sir? have you leave the thumb , sir ?
You don’t know what you’re doing! you do not what you know you .
have you killed Tybalt? hast thou slain tybalt ?
Why, Romeo, are you crazy? why , art thou mad , mad ?

Pre-Trained Models

Here is a link for an example model: https://s3-us-west-2.amazonaws.com/foxtype-nlp/tensorshake/model_cache.zip

Possible improvements

  • word embeddings
  • beam search
  • language model reranking
Owner
Motoki Wu
AI @cresta. Formerly @chegg, @writelab, @foxtype, @lingonautics
Motoki Wu
Semantic segmentation task for ADE20k & cityscapse dataset, based on several models.

semantic-segmentation-tensorflow This is a Tensorflow implementation of semantic segmentation models on MIT ADE20K scene parsing dataset and Cityscape

HsuanKung Yang 83 Oct 13, 2022
Matlab Python Heuristic Battery Opt - SMOP conversion and manual conversion

SMOP is Small Matlab and Octave to Python compiler. SMOP translates matlab to py

Tom Xu 1 Jan 12, 2022
Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement

Decompose to Adapt: Cross-domain Object Detection via Feature Disentanglement In this project, we proposed a Domain Disentanglement Faster-RCNN (DDF)

19 Nov 24, 2022
Spatial Single-Cell Analysis Toolkit

Single-Cell Image Analysis Package Scimap is a scalable toolkit for analyzing spatial molecular data. The underlying framework is generalizable to spa

Laboratory of Systems Pharmacology @ Harvard 30 Nov 08, 2022
Code of our paper "Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning"

CCOP Code of our paper Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning Requirement Install OpenSelfSup Install Detectron2

Chenhongyi Yang 21 Dec 13, 2022
Road Crack Detection Using Deep Learning Methods

Road-Crack-Detection-Using-Deep-Learning-Methods This is my Diploma Thesis ¨Road Crack Detection Using Deep Learning Methods¨ under the supervision of

Aggelos Katsaliros 3 May 03, 2022
MetaShift: A Dataset of Datasets for Evaluating Contextual Distribution Shifts and Training Conflicts (ICLR 2022)

MetaShift: A Dataset of Datasets for Evaluating Distribution Shifts and Training Conflicts This repo provides the PyTorch source code of our paper: Me

88 Jan 04, 2023
Our VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks.

VMAgent is a platform for exploiting Reinforcement Learning (RL) on Virtual Machine (VM) scheduling tasks. VMAgent is constructed based on one month r

56 Dec 12, 2022
This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing.

Feedback Prize - Evaluating Student Writing This is the solution for 2nd rank in Kaggle competition: Feedback Prize - Evaluating Student Writing. The

Udbhav Bamba 41 Dec 14, 2022
Codebase for "ProtoAttend: Attention-Based Prototypical Learning."

Codebase for "ProtoAttend: Attention-Based Prototypical Learning." Authors: Sercan O. Arik and Tomas Pfister Paper: Sercan O. Arik and Tomas Pfister,

47 2 May 17, 2022
Massively parallel Monte Carlo diffusion MR simulator written in Python.

Disimpy Disimpy is a Python package for generating simulated diffusion-weighted MR signals that can be useful in the development and validation of dat

Leevi 16 Nov 11, 2022
Assessing syntactic abilities of BERT

BERT-Syntax Assesing the syntactic abilities of BERT. What Evaluate Google's BERT-Base and BERT-Large models on the syntactic agreement datasets from

Yoav Goldberg 147 Aug 02, 2022
GANSketchingJittor - Implementation of Sketch Your Own GAN in Jittor

GANSketching in Jittor Implementation of (Sketch Your Own GAN) in Jittor(计图). Or

Bernard Tan 10 Jul 02, 2022
An open-source online reverse dictionary.

An open-source online reverse dictionary.

THUNLP 6.3k Jan 09, 2023
Release of SPLASH: Dataset for semantic parse correction with natural language feedback in the context of text-to-SQL parsing

SPLASH: Semantic Parsing with Language Assistance from Humans SPLASH is dataset for the task of semantic parse correction with natural language feedba

Microsoft Research - Language and Information Technologies (MSR LIT) 35 Oct 31, 2022
D²Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos

D²Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos This repository contains the implementation for "D²Conv3D: Dynamic Dilated Co

17 Oct 20, 2022
JudeasRx - graphical app for doing personalized causal medicine using the methods invented by Judea Pearl et al.

JudeasRX Instructions Read the references given in the Theory and Notation section below Fire up the Jupyter Notebook judeas-rx.ipynb The notebook dra

Robert R. Tucci 19 Nov 07, 2022
Deep Residual Learning for Image Recognition

Deep Residual Learning for Image Recognition This is a Torch implementation of "Deep Residual Learning for Image Recognition",Kaiming He, Xiangyu Zhan

Kimmy 561 Dec 01, 2022
Official repository of OFA. Paper: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

Paper | Blog OFA is a unified multimodal pretrained model that unifies modalities (i.e., cross-modality, vision, language) and tasks (e.g., image gene

OFA Sys 1.4k Jan 08, 2023
The project page of paper: Architecture disentanglement for deep neural networks [ICCV 2021, oral]

This is the project page for the paper: Architecture Disentanglement for Deep Neural Networks, Jie Hu, Liujuan Cao, Tong Tong, Ye Qixiang, ShengChuan

Jie Hu 15 Aug 30, 2022