Page to PAGE Layout Analysis Tool

Overview

P2PaLA


Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

💥 Try our new DEMO for online baseline detection. ❗❗

If you find this toolkit useful in your research, please cite:

@misc{p2pala2017,
  author = {Lorenzo Quirós},
  title = {P2PaLA: Page to PAGE Layout Analysis toolkit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{https://github.com/lquirosd/P2PaLA}},
}

See the paper on arXiv for more details.

Requirements

  • Linux (OSX may work, but is untested).
  • Python (2.7 or 3.6; a conda virtual environment is recommended)
  • Numpy
  • PyTorch (1.0). PyTorch 0.3.1 compatible on this branch
  • OpenCV (3.4.5.20).
  • NVIDIA GPU + CUDA + cuDNN (CPU mode and CUDA without cuDNN work, but are not recommended for training).
  • tensorboard-pytorch (v0.9) [Optional]: pip install tensorboardX. A different conda env is recommended to keep TensorFlow separate from PyTorch.
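
Before installing, it can help to see which of the packages listed above are already importable. The snippet below is a minimal sketch and not part of P2PaLA: it assumes the usual import names (numpy, torch, cv2, tensorboardX), and the helper missing_modules is a hypothetical name introduced here for illustration. It does not check versions.

```python
import importlib.util

# Import names assumed for the packages listed above; tensorboardX is optional.
REQUIRED = ["numpy", "torch", "cv2"]
OPTIONAL = ["tensorboardX"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    for name in missing_modules(REQUIRED):
        print("missing required module:", name)
    for name in missing_modules(OPTIONAL):
        print("missing optional module:", name)
```

The check only looks modules up without importing them, so it runs quickly even when PyTorch is installed.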

Install

python setup.py install

To install only the Python dependencies, use the conda requirements file:
conda env create --file conda_requirements.yml

Usage

  1. Input data must follow the folder structure data_tag/page: images go in the data_tag folder and their XML files in its page subfolder. For example:
mkdir -p data/{train,val,test,prod}/page;
tree data;
data
├── prod
│   ├── page
│   │   ├── prod_0.xml
│   │   └── prod_1.xml
│   ├── prod_0.jpg
│   └── prod_1.jpg
├── test
│   ├── page
│   │   ├── test_0.xml
│   │   └── test_1.xml
│   ├── test_0.jpg
│   └── test_1.jpg
├── train
│   ├── page
│   │   ├── train_0.xml
│   │   └── train_1.xml
│   ├── train_0.jpg
│   └── train_1.jpg
└── val
    ├── page
    │   ├── val_0.xml
    │   └── val_1.xml
    ├── val_0.jpg
    └── val_1.jpg
  2. Run the tool.
python P2PaLA.py --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"

โ— Pre-trained models available here

  3. Use TensorBoard to visualize training status:
tensorboard --logdir ./work/runs
  4. The resulting PAGE-XML files should be in "./work/results/test/".

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-XML files.

  5. For details about the arguments and the config file, see docs or run python P2PaLA.py -h.
  6. For more detailed examples, see egs:
    • Bozen dataset
    • cBAD complex competition dataset
    • OHG dataset
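
The folder layout from the first usage step can be checked before a run. The following is a minimal sketch, not part of P2PaLA: it assumes that each image and its XML file share a base name (as in the example tree, train_0.jpg and page/train_0.xml), and the extension list and the helper name unmatched_files are assumptions made for illustration.

```python
from pathlib import Path

# Assumption: image extensions to look for; the example tree only shows .jpg.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def unmatched_files(data_dir):
    """Return (images without XML, XML files without image) for one data_tag folder."""
    data_dir = Path(data_dir)
    # Base names of the images stored directly in the data_tag folder.
    images = {p.stem for p in data_dir.iterdir()
              if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS}
    # Base names of the XML files stored in the page subfolder, if it exists.
    page_dir = data_dir / "page"
    xmls = {p.stem for p in page_dir.glob("*.xml")} if page_dir.is_dir() else set()
    return sorted(images - xmls), sorted(xmls - images)


if __name__ == "__main__":
    for tag in ("train", "val", "test", "prod"):
        folder = Path("data") / tag
        if folder.is_dir():
            print(tag, unmatched_files(folder))
```

Two empty lists for a folder mean every image has a matching XML file and vice versa. Whether a mismatch is actually a problem depends on how the folder is used, so treat the output as a hint rather than a validation of the toolkit's requirements.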

License

GNU General Public License v3.0. See LICENSE for the full text.

Acknowledgments

The code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix.

Owner
Lorenzo Quirós Díaz