Handwritten_Text_Recognition

Overview

Deep Learning framework for Line-level Handwritten Text Recognition

Short presentation of our project

  1. Introduction

  2. Installation
    2.a Install conda environment
    2.b Download databases

    • IAM dataset
    • ICFHR 2014 dataset
  3. How to use
    3.a Make predictions on unlabelled data using our best networks
    3.b Train and test a network from scratch
    3.c Test a model without retraining it

  4. References

  5. Contact

1. Introduction

This work was an internship project under Mathieu Aubry's supervision, at the LIGM lab, located in Paris.

In HTR, the task is to predict a transcript from an image of a handwritten text. A commonly used structure for this task is Convolutional Recurrent Neural Networks (CRNN). One CRNN network consists of a feature extractor (often with convolutional layers), followed by a recurrent network (LSTM).

This github provides a framework to train and test CRNN networks on handwritten grayscale line-level datasets. This github also provides code to generate predictions on an unlabelled, line-level, grayscale line-level dataset. There are several options for the structure of the CRNN used, image preprocessing, dataset used, data augmentation.

alt text

2. Installation

Prerequisites

Make sure you have Anaconda installed (version >= to 4.7.10, you may not be able to install correct dependencies if older). If not, follow the installation instructions provided at https://docs.anaconda.com/anaconda/install/.

Also pull the git.

2.a Download and activate conda environment

Once in the git folder on your machine, run the command lines :

conda env create -f HTR_environment.yml
conda activate HTR 

2.b Download databases

You will only need to download these databases if you want to train your own network from scratch. The framework is built to train a network on one of these 2 datasets : IAM and ICFHR2014 HTR competition. [ADD REF TO SLIDES]

  • Before downloading IAM dataset, you need to register on this website. Once that's done, you need to download :

    • The 'lines' folder at this link.
    • The 'split' folder at this link.
    • The 'lines.txt' file at this link.
  • For ICFHR2014 dataset, you need to download the 'BenthamDatasetR0-GT' folder at this link.

Make sure to download the two databases in the same folder. Structure must be

Your data folder / 
    IAM/
        lines.txt
        lines/
        split/
            trainset.txt
            testset.txt
            validationset1.txt
            validationset2.txt
            
    ICFHR2014/
        BenthamDatasetR0-GT/ 

    Your own dataset/

3. How to use

3.a Make predictions on your own unlabelled dataset

Running this code will use model stored at model_path to make predictions on images stored in data_path. The predictions will be stored in predictions.txt in data_path folder.

python lines_predictor.py --data_path datapath  --model_path ./trained_networks/IAM_model_imgH64.pth --imgH 64

/!\ Make sure that each image in the data folder has a unique file name and all images are in .jpg form. When you use our trained model with imgH as 64 (i.e. IAM_model_imgH64.pth), you have to set the argument --imgH as 64.

3.b Train a network from scratch

python train.py --dataset dataset  --tr_data_path data_dir --save_model_path path

Before running the code, make sure that you change ROOT_PATH variable at the beginning of params.py to the path of the folder you want to save your models in. Main arguments :

  • --dataset: name of the dataset to train and test on. Supported values are ICFHR2014 and IAM.
  • --tr_data_path: location of the train dataset folder on local machine. See section [??] for downloading datasets.
  • --save_model_path: path of the folder where model will be saved if params.save is set to True.

Main learning arguments :

  • --data_aug: If set to True, will apply random affine data transformation to the training images.

  • --optimizer: Which optimizer to use. Supported values are rmsprop, adam, adadelta, and sgd. We recommend using RMSprop, which got best results in our experiments. See params.py for optimizer-specific parameters.

  • --epochs : Number of training epochs

  • --lr: Learning rate at the beginning of training.

  • --milestones: List of the epochs at which the learning rate will be divided by 10.

  • feat_extractor: Structure to use for the feature extractor. Supported values are resnet18, custom_resnet, and conv.

    • resnet18 : standard structure of resnet18.
    • custom_resnet: variant of resnet18 that we tuned for our experiments.
    • conv: Use this option if you want to use a purely convolutional feature extractor and not a residual one. See conv parameters in params.py to choose conv structure.

3.c Test a model without retraining it

Running this code will compute the average CER and WER of model stored at pretrained_model path on the testing set of chosen dataset.

python train.py --train '' --save '' --pretrained_model model_path --dataset dataset --tr_data_path data_path 

Main arguments :

  • --pretrained_model: path to state_dict of pretrained model.
  • --dataset: Which dataset to test on. Supported values are ICFHR2014 and IAM.
  • --tr_data_path: path to the dataset folder (see section [??])

4. References

Graves et al. Connectionist Temporal Classification: Labelling Unsegmented Sequence Data with Recurrent Neural Networks
Sánchez et al. A set of benchmarks for Handwritten Text Recognition on historical documents
Dutta et al. Improving CNN-RNN Hybrid Networks for Handwriting Recognition

U.-V. Marti, H. Bunke The IAM-database: an English sentence database for offline handwriting recognition

https://github.com/Holmeyoung/crnn-pytorch
https://github.com/georgeretsi/HTR-ctc
Synthetic line generator : https://github.com/monniert/docExtractor (see paper for more information)

5. Contact

If you have questions or remarks about this project, please email us at [email protected] and [email protected].

Memory tests solver with using OpenCV

Human Benchmark project This project is OpenCV based programs which are puzzle solvers for 7 different games for https://humanbenchmark.com/. made as

Bahadır Araz 24 Dec 27, 2022
Pixie - A full-featured 2D graphics library for Python

Pixie - A full-featured 2D graphics library for Python Pixie is a 2D graphics library similar to Cairo and Skia. pip install pixie-python Features: Ty

treeform 65 Dec 30, 2022
OCR, Scene-Text-Understanding, Text Recognition

Scene-Text-Understanding Survey [2015-PAMI] Text Detection and Recognition in Imagery: A Survey paper [2014-Front.Comput.Sci] Scene Text Detection and

Alan Tang 354 Dec 12, 2022
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
Code for AAAI 2021 paper: Sequential End-to-end Network for Efficient Person Search

This repository hosts the source code of our paper: [AAAI 2021]Sequential End-to-end Network for Efficient Person Search. SeqNet achieves the state-of

Zj Li 218 Dec 31, 2022
2 telegram-bots: for image recognition and for text generation

💻 📱 Telegram_Bots 🔎 & 📖 2 telegram-bots: for image recognition and for text generation. About Image recognition bot: User sends a photo and bot de

Marina Polukoshko 1 Jan 27, 2022
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

OCRopus 3.2k Dec 31, 2022
A dataset handling library for computer vision datasets in LOST-fromat

A dataset handling library for computer vision datasets in LOST-fromat

8 Dec 15, 2022
A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

Weverton Marques 4 Aug 06, 2021
Text to QR-CODE

QR CODE GENERATO USING PYTHON Author : RAFIK BOUDALIA. Installation Use the package manager pip to install foobar. pip install pyqrcode Usage from tki

Rafik Boudalia 2 Oct 13, 2021
CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering" official PyTorch implementation.

LED2-Net This is PyTorch implementation of our CVPR 2021 Oral paper "LED2-Net: Monocular 360˚ Layout Estimation via Differentiable Depth Rendering". Y

Fu-En Wang 83 Jan 04, 2023
An interactive interface for using OpenCV's GrabCut algorithm for image segmentation.

Interactive GrabCut An interactive interface for using OpenCV's GrabCut algorithm for image segmentation. Setup Install dependencies: pip install nump

Jason Y. Zhang 16 Oct 10, 2022
It is a image ocr tool using the Tesseract-OCR engine with the pytesseract package and has a GUI.

OCR-Tool It is a image ocr tool made in Python using the Tesseract-OCR engine with the pytesseract package and has a GUI. This is my second ever pytho

Khant Htet Aung 4 Jul 11, 2022
A simple Digits Recogniser made in Python

⭐ Python Digit Recogniser A simple digit Recogniser made in Python Demo Run Locally Clone the project git clone https://github.com/yashraj-n/python-

Yashraj narke 4 Nov 29, 2021
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
Handwritten Character Recognition using CNN

Handwritten Character Recognition using CNN Problem Definition The main objective of this project is to solve the problem of handwritten character rec

Mohit Kaushik 4 Mar 02, 2022
CNN+Attention+Seq2Seq

Attention_OCR CNN+Attention+Seq2Seq The model and its tensor transformation are shown in the figure below It is necessary ch_ train and ch_ test the p

Tsukinousag1 2 Jul 14, 2022
A bot that plays TFT using OCR. Keeps track of bench, board, items, and plays the user defined team comp.

NOTES: To ensure best results, make sure you are running this on a computer that has decent specs. 1920x1080 fullscreen is required in League, game mu

francis 125 Dec 30, 2022
A python script based on opencv and paddleocr, which can automatically pick up tasks, make cookies, and receive rewards in the Destiny 2 Dawning Oven

A python script based on opencv and paddleocr, which can automatically pick up tasks, make cookies, and receive rewards in the Destiny 2 Dawning Oven

1 Dec 22, 2021
Go package for OCR (Optical Character Recognition), by using Tesseract C++ library

gosseract OCR Golang OCR package, by using Tesseract C++ library. OCR Server Do you just want OCR server, or see the working example of this package?

Hiromu OCHIAI 1.9k Dec 28, 2022