An application that maps an image of a LaTeX math equation to LaTeX code.

Overview

Image to LaTeX

Code style: black pre-commit License

An application that maps an image of a LaTeX math equation to LaTeX code.

Image to Latex streamlit app

Introduction

The problem of image-to-markup generation has been attempted by Deng et al. (2016). They provide the raw and preprocessed versions of im2latex-100K, a dataset consisting of about 100K LaTeX math equation images. Using their dataset, I trained a model that uses ResNet-18 as encoder (up to layer3) and a Transformer as decoder with cross-entropy loss.

Initially, I used the preprocessed dataset to train my model, but the preprocessing turned out to be a huge limitation. Although the model can achieve a reasonable performance on the test set, it performs poorly if the image quality, padding, or font size is different from the images in the dataset. This phenomenon has also been observed by others who have attempted the same problem using the same dataset (e.g., this project, this issue and this issue). This is most likely due to the rigid preprocessing for the dataset (e.g. heavy downsampling).

To this end, I used the raw dataset and included image augmentation (e.g. random scaling, small rotation) in my data processing pipeline to increase the diversity of the samples. Moreover, unlike Deng et al. (2016), I did not group images by size. Rather, I sampled them uniformly and padded them to the size of the largest image in the batch, to increase the generalizability of the model.

Additional problems that I found in the dataset:

  • Some latex code produces visually identical outputs (e.g. \left( and \right) look the same as ( and )), so I normalized them.
  • Some latex code is used to add space (e.g. \vspace{2px} and \hspace{0.3mm}). However, the length of the space is diffcult to judge. Also, I don't want the model generates code on blank images, so I removed them.

The best run has a character error rate (CER) of 0.17 in test set. Most errors seem to come from unnecessary horizontal spacing, e.g., \;, \, and \qquad. (I only removed \vspace and \hspace during preprocessing. I did not know that LaTeX has so many horizontal spacing commands.)

Possible improvements include:

  • Do a better job cleaning the data (e.g., removing spacing commands)
  • Train the model for more epochs (for the sake of time, I only trained the model for 15 epochs, but the validation loss is still going down)
  • Use beam search (I only implemented greedy search)
  • Use a larger model (e.g., use ResNet-34 instead of ResNet-18)
  • Do some hyperparameter tuning

I didn't do any of these, because I had limited computational resources (I was using Google Colab).

How To Use

Setup

Clone the repository to your computer and position your command line inside the repository folder:

git clone https://github.com/kingyiusuen/image-to-latex.git
cd image-to-latex

Then, create a virtual environment named venv and install required packages:

make venv
make install-dev

Data Preprocessing

Run the following command to download the im2latex-100k dataset and do all the preprocessing. (The image cropping step may take over an hour.)

python scripts/prepare_data.py

Model Training and Experiment Tracking

Model Training

An example command to start a training session:

python scripts/run_experiment.py trainer.gpus=1 data.batch_size=32

Configurations can be modified in conf/config.yaml or in command line. See Hydra's documentation to learn more.

Experiment Tracking using Weights & Biases

The best model checkpoint will be uploaded to Weights & Biases (W&B) automatically (you will be asked to register or login to W&B before the training starts). Here is an example command to download a trained model checkpoint from W&B:

python scripts/download_checkpoint.py RUN_PATH

Replace RUN_PATH with the path of your run. The run path should be in the format of //. To find the run path for a particular experiment run, go to the Overview tab in the dashboard.

For example, you can use the following command to download my best run

python scripts/download_checkpoint.py kingyiusuen/image-to-latex/1w1abmg1

The checkpoint will be downloaded to a folder named artifacts under the project directory.

Testing and Continuous Integration

The following tools are used to lint the codebase:

isort: Sorts and formats import statements in Python scripts.

black: A code formatter that adheres to PEP8.

flake8: A code linter that reports stylistic problems in Python scripts.

mypy: Performs static type checking in Python scripts.

Use the following command to run all the checkers and formatters:

make lint

See pyproject.toml and setup.cfg at the root directory for their configurations.

Similar checks are done automatically by the pre-commit framework when a commit is made. Check out .pre-commit-config.yaml for the configurations.

Deployment

An API is created to make predictions using the trained model. Use the following command to get the server up and running:

make api

You can explore the API via the generated documentation at http://0.0.0.0:8000/docs.

To run the Streamlit app, create a new terminal window and use the following command:

make streamlit

The app should be opened in your browser automatically. You can also open it by visiting http://localhost:8501. For the app to work, you need to download the artifacts of an experiment run (see above) and have the API up and running.

To create a Docker image for the API:

make docker

Acknowledgement

Nutrify - take a photo of food and learn about it

Nutrify - take a photo of food and learn about it Work in progress. To make this a thing, we're going to need lots of food images... Start uploading y

Daniel Bourke 93 Dec 30, 2022
Digital image process Basic algorithm

These are some basic algorithms that I have implemented by my hands in the process of learning digital image processing, such as mean and median filtering, sharpening algorithms, interpolation scalin

JingYu 2 Nov 03, 2022
Plots is a graph plotting app for GNOME.

Plots is a graph plotting app for GNOME. Plots makes it easy to visualise mathematical formulae. In addition to basic arithmetic operations, it supports trigonometric, hyperbolic, exponential and log

Alex Huntley 138 Dec 14, 2022
Panel Competition Image Generator

Panel Competition Image Generator This project was build by a member of the NFH community and is open for everyone who wants to try it. Relevant links

Juliano Mendieta 1 Oct 22, 2021
Image manipulation package used for EpicBot.

Image manipulation package used for EpicBot.

Nirlep_5252_ 7 May 26, 2022
Generate meme GIFs in which an image you choose can be viewed by the user only after they wait a whole hour.

Generate meme GIFs in which an image you choose can be viewed by the user only after they wait a whole hour.

Feliks Maak 1 Jan 31, 2022
Javascript image annotation tool based on image segmentation.

JS Segment Annotator Javascript image annotation tool based on image segmentation. Label image regions with mouse. Written in vanilla Javascript, with

Kota Yamaguchi 513 Nov 15, 2022
Python Image Optimizer Script

Image-Optimizer Download and Install git clone https://github.com/stefankumpan/Image-Optimizer-Script.git cd Image-Optimizer-Script pip install -r req

Stefan Kumpan 0 Jul 15, 2021
Paper backup of files using QR codes

Generate paper backups for Linux. Currently command-linux Linux only. Takes any file, and outputs a "paper backup": a printable black-and-white pdf fu

Zachary Vance 27 Dec 28, 2022
A python based library to help you create unique generative images based on Rarity for your next NFT Project

Generative-NFT Generate Unique Images based on Rarity A python based library to help you create unique generative images based on Rarity for your next

Kartikay Bhutani 8 Sep 21, 2022
This is the official source code of FreeCAD, a free and opensource multiplatform 3D parametric modeler.

Freedom to build what you want FreeCAD is an open-source parametric 3D modeler made primarily to design real-life objects of any size. Parametric modeling allows you to easily modify your design by g

FreeCAD 12.9k Jan 07, 2023
🎨 Generate and change color-schemes on the fly.

Generate and change color-schemes on the fly. Pywal is a tool that generates a color palette from the dominant colors in an image. It then applies the

dylan 6.9k Jan 03, 2023
display: a browser-based graphics server

display: a browser-based graphics server Installation Quick Start Usage Development A very lightweight display server for Torch. Best used as a remote

Szymon Jakubczak 205 Oct 17, 2022
A 3D structural engineering finite element library for Python.

An easy to use elastic 3D structural engineering finite element analysis library for Python.

Craig 220 Dec 27, 2022
image-processing exercises.

image_processing Assignment 21 Checkered Board Create a chess table using numpy and opencv. view: Color Correction Reverse black and white colors with

Benyamin Zojaji 25 Dec 15, 2022
A sketch like(?) effect for images

lineArt A sketch like(?) effect for images How to run main.py [filename] [option {1,2}] option 1 retains colour option 2 gives gray image #results ori

1 Oct 28, 2021
This Github Action automatically creates a GIF from a given web page to display on your project README

This Github Action automatically creates a GIF from a given web page to display on your project README

Pablo Lecolinet 28 Dec 15, 2022
This will help to read QR codes using Raspberry Pi and Pi Camera

Raspberry-Pi-Generate-and-Read-QR-code This will help to read QR codes using Raspberry Pi and Pi Camera Install the required libraries first in your T

Raspberry_Pi Pakistan 2 Nov 06, 2021
Python library that finds the size / type of an image given its URI by fetching as little as needed

FastImage This is an implementation of the excellent Ruby library FastImage - but for Python. FastImage finds the size or type of an image given its u

Brian Muller 28 Mar 01, 2022
Multiparametric Image Analysis

Documentation The documentation is available on populse_mia's website here Installation From PyPI, for users By cloning the package, for developers Fr

Populse 9 Dec 14, 2022