OCR software for recognition of handwritten text

Last update: Jan 03, 2023

Overview

Handwriting OCR

This project aims to create software that recognizes handwritten text in photos, including text in Czech. It uses computer vision and machine learning, and experiments with different approaches to the problem. It started as a school project, which I had the chance to present at Intel ISEF 2018.

Program Structure

The recognition process is divided into 4 steps. The initial input is a photo of a page with text.

1. Detection of the page and removal of the background
2. Detection and separation of words
3. Normalization of words
4. Separation and recognition of characters (recognition of words)
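
To make the word-separation step above more concrete, here is a minimal sketch of one classic approach: a column projection profile over a binarized text line. This is an illustration only and not necessarily how the project's notebooks do it. The function name `split_words`, the `min_gap` parameter, and the list-of-lists image format are assumptions made for this example.

```python
def split_words(line, min_gap=2):
    """Return (start, end) column ranges of words in a binarized text line.

    Hypothetical helper for illustration, not part of the project.
    `line` is a list of rows; each row is a list of 0 (background) / 1 (ink).
    A run of at least `min_gap` empty columns separates two words.
    The returned `end` is exclusive.
    """
    width = len(line[0])
    # Column projection: number of ink pixels in every column.
    profile = [sum(row[x] for row in line) for x in range(width)]

    words = []
    start = None  # first column of the word currently being scanned
    gap = 0       # length of the current run of empty columns
    for x, ink in enumerate(profile):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The word ended where the run of empty columns began.
                words.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The last word runs to the edge (minus any short trailing gap).
        words.append((start, width - gap))
    return words


# Tiny two-row "image": two words separated by three empty columns.
line = [
    [1, 1, 0, 1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0],
]
words = split_words(line)
```

With `min_gap=2` the single empty column inside the first word is treated as a gap between characters, not between words. On real photos the threshold would have to be tuned to the handwriting, and a learned detector may work better than this heuristic.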

The main notebooks that combine all the steps are OCR.ipynb and OCR-Evaluator.ipynb. Other files are named after the step they represent, followed by the name of the machine learning model.

Getting Started

1. Clone the repository

git clone https://github.com/Breta01/handwriting-ocr.git

After cloning the repo, download the datasets and models (see the data and models folders for more information).

2. Requirements

The project was created using Python 3.6 with Jupyter Notebook. I recommend using Anaconda. If you have it, you can set up the environment with:

conda env create --name ocr-env --file environment.yml
conda activate ocr-env

Main libraries (all required libraries are in environment.yml):

NumPy (1.13)
TensorFlow (1.4)
OpenCV (3.1)
Pandas (0.21)
Matplotlib (2.1)
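
To confirm that the activated environment matches the versions listed above, a small check such as the following can help. It is only a sketch: the helper names `version_matches` and `check_libraries` are made up for this example, and it assumes each library exposes a `__version__` attribute under the import names shown (OpenCV is imported as `cv2`).

```python
import importlib

# Versions listed above; keys are import names (OpenCV imports as cv2).
EXPECTED = {
    "numpy": "1.13",
    "tensorflow": "1.4",
    "cv2": "3.1",
    "pandas": "0.21",
    "matplotlib": "2.1",
}


def version_matches(installed, expected):
    """True when `installed` has the same leading components as `expected`."""
    want = expected.split(".")
    return installed.split(".")[:len(want)] == want


def check_libraries(expected=EXPECTED):
    """Map each import name to 'ok', 'missing', or the mismatching version."""
    report = {}
    for name, wanted in expected.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            report[name] = "missing"
            continue
        # Fall back to 'unknown' if the module has no __version__ attribute.
        installed = getattr(module, "__version__", "unknown")
        report[name] = "ok" if version_matches(installed, wanted) else installed
    return report
```

Calling `check_libraries()` inside the activated environment returns a dictionary with one entry per library. Any value other than `ok` means the library is missing or its version differs from the one listed here.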

Run

With the repo cloned and all required libraries installed, run jupyter notebook in the project directory. Then open the notebook you want to work with.

Contributing

The best way to get involved is to create GitHub issues or solve one! If there aren't any issues, you can contact me directly by email.

License

MIT

Support the project

If this project helped you, if you want to support quick answers to questions and issues, or if you just think it is an interesting project, please consider a small donation.

Owner

Břetislav Hájek

