This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

Last update: Dec 30, 2022

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.

Thanks for the author's (@whai362) awesome work!

Installation

Any version of tensorflow version > 1.0 should be ok.
python 2 or 3 will be ok.

Download

trained on ICDAR 2015 (training set) + ICDAR2017 MLT (training set):

baiduyun extract code: pffd

google drive

This model is not as good as article's, it's just a reference. You can finetune on it or you can do a lot of optimization based on this code.

Database	Precision (%)	Recall (%)	F-measure (%)
ICDAR 2015(val)	74.61	80.93	77.64

Train

If you want to train the model, you should provide the dataset path, in the dataset path, a separate gt text file should be provided for each image, and make sure that gt text and image file have the same names.

Then run train.py like:

python train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 --checkpoint_path=./resnet_v1_50/ \
--training_data_path=./data/ocr/icdar2015/

If you have more than one gpu, you can pass gpu ids to gpu_list(like --gpu_list=0,1,2,3)

Note:

right now , only support icdar2017 data format input, like (116,1179,206,1179,206,1207,116,1207,"###"), but you can modify data_provider.py to support polygon format input
Already support polygon shrink by using pyclipper module
this re-implementation is just for fun, but I'll continue to improve this code.
re-implementation pse algorithm by using c++ (if you use python2, just run it, if python3, please replace python-config with python3-config in makefile)

Test

run eval.py like:

python eval.py --test_data_path=./tmp/images/ --gpu_list=0 --checkpoint_path=./resnet_v1_50/ \
--output_dir=./tmp/

a text file and result image will be then written to the output path.

Examples

About issues

If you encounter any issue check issues first, or you can open a new issue.

Reference

Acknowledge

@rkshuai found a bug about concat features in model.py.

If this repository helps you，please star it. Thanks.

This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

Related tags

Overview

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network

Introduction

Installation

Download

Train

Test

Examples

About issues

Reference

Acknowledge

Owner

Michael liu

TextBoxes re-implement using tensorflow

pyntcloud is a Python library for working with 3D point clouds.

Basic functions manipulating images using the OpenCV library

Random maze generator and solver

Multi-choice answer sheet correction system using computer vision with opencv & python.

Distilling Knowledge via Knowledge Review, CVPR 2021

Automatically resolve RidderMaster based on TensorFlow & OpenCV

Web interface for browsing arXiv papers

YOLOv5 in DOTA with CSL_label.(Oriented Object Detection)（Rotation Detection）（Rotated BBox）

Deskewing images with slanted content

The first open-source library that detects the font of a text in a image.

fishington.io bot with OpenCV and NumPy

OCR software for recognition of handwritten text

A Python script to capture images from multiple webcams at once and save them into your local machine

A small C++ implementation of LSTM networks, focused on OCR.

Some Boring Research About Products Recognition 、Duplicate Img Detection、Img Stitch、OCR

Textboxes implementation with Tensorflow (python)

基于Paddle框架的PSENet复现

Repository for Scene Text Detection with Supervised Pyramid Context Network with tensorflow.

Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications