A pure pytorch implemented ocr project including text detection and recognition

Last update: Dec 30, 2022

Overview

ocr.pytorch

An OCR project implemented purely in PyTorch.
Text detection is based on CTPN and text recognition is based on CRNN.
More detection and recognition methods will be supported!

Prerequisite

python-3.5+
pytorch-0.4.1+
torchvision-0.2.1
opencv-3.4.0.14
numpy-1.14.3

All of these can be installed through pip except pytorch and torchvision. The right builds of those two depend on your CUDA version, so follow the installation instructions on PyTorch's official site.
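To see what is already installed before setting things up, a small check like the following may help. It is only a sketch: it assumes Python 3.8+ (for `importlib.metadata`) and assumes the pip distribution names `torch` and `opencv-python`, which are not spelled out in the list above. It reports versions only and does not compare them.

```python
from importlib import metadata

# Assumed pip distribution names mapped to the versions listed above.
REQUIRED = {
    "torch": "0.4.1",
    "torchvision": "0.2.1",
    "opencv-python": "3.4.0.14",
    "numpy": "1.14.3",
}


def installed_versions(packages=REQUIRED):
    """Return {name: installed version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: listed {REQUIRED[name]}, installed {version or 'missing'}")
```

If OpenCV was installed under a different distribution name (for example a headless build), it will show up as missing here even though `import cv2` works.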

Detection

Detection is based on CTPN; some code is borrowed from pytorch_ctpn.

Recognition

Recognition is based on CRNN; some code is borrowed from crnn.pytorch.
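CRNN models are typically trained with CTC loss, so the per-frame predictions have to be collapsed into a string. The sketch below shows plain greedy CTC decoding: merge consecutive repeats, then drop blanks. It is an illustration, not this project's decoder. The function name, the blank index 0, and the offset-by-one alphabet mapping are assumptions.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, alphabet, blank=0):
    """Decode per-frame label ids into text.

    Assumes label 0 is the CTC blank and label k (k >= 1) maps to
    alphabet[k - 1]. Both conventions are assumptions for illustration.
    """
    # Step 1: merge runs of identical consecutive labels.
    collapsed = [label for label, _ in groupby(frame_ids)]
    # Step 2: drop blanks and map the remaining labels to characters.
    return "".join(alphabet[label - 1] for label in collapsed if label != blank)


if __name__ == "__main__":
    # With alphabet "helo": h=1, e=2, l=3, o=4. The blank between the two
    # runs of 3 is what lets the double "l" survive the merge step.
    print(ctc_greedy_decode([1, 1, 0, 2, 3, 3, 0, 3, 4, 4], "helo"))
```

In a real pipeline, `frame_ids` would come from taking the argmax over the class dimension of the network output at each time step.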

Test

Download the pretrained models from Baidu Netdisk (extract code: u2ff) or Google Drive and put the files into the checkpoints directory. Then run

python3 demo.py

The image files in ./test_images will be run through text detection and recognition, and the results will be stored in ./test_result.
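The folder-in, folder-out flow described above can be sketched as follows. This is not the code in demo.py: `process_folder`, the `process_image` callback, the suffix list, and the one-.txt-per-image output format are all hypothetical, chosen only to show the shape of such a loop.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your data.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def process_folder(process_image, input_dir="./test_images", output_dir="./test_result"):
    """Call process_image(path) on every image in input_dir and save the
    returned text as <image stem>.txt in output_dir.

    process_image is a hypothetical callback standing in for the real
    detection and recognition step.
    """
    in_dir, out_dir = Path(input_dir), Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for image_path in sorted(in_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # skip anything that is not an image file
        text = process_image(image_path)
        target = out_dir / (image_path.stem + ".txt")
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written
```

A call such as `process_folder(my_ocr_function)` would then mirror the default directories named above, with `my_ocr_function` being whatever takes an image path and returns the recognized text.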

If you want to test a single image, run

python3 test_one.py [filename]

Train

Training code is in the train_code directory.
Train CTPN
Train CRNN

Licence

MIT License

Owner

coura
