Page to PAGE Layout Analysis Tool

Last update: Nov 24, 2022

Overview

P2PaLA

Page to PAGE Layout Analysis (P2PaLA) is a toolkit for Document Layout Analysis based on Neural Networks.

💥 Try our new DEMO for online baseline detection. ❗ ❗

If you find this toolkit useful in your research, please cite:

@misc{p2pala2017,
  author = {Lorenzo Quirós},
  title = {P2PaLA: Page to PAGE Layout Analysis toolkit},
  year = {2017},
  publisher = {GitHub},
  note = {GitHub repository},
  howpublished = {\url{https://github.com/lquirosd/P2PaLA}},
}

See the arXiv paper for more details.

Requirements

Linux (OSX may work, but is untested).
Python (2.7 or 3.6; a conda virtual environment is recommended).
Numpy
PyTorch (1.0). PyTorch 0.3.1 is compatible on this branch.
OpenCV (3.4.5.20).
NVIDIA GPU + CUDA CuDNN (CPU mode and CUDA without CuDNN work, but are not recommended for training).
tensorboard-pytorch (v0.9) [Optional]: pip install tensorboardX. A different conda env is recommended to keep TensorFlow separated from PyTorch.
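
Before installing, it can help to confirm which of the packages listed above are visible to the active Python environment. The following is a minimal sketch, not part of the toolkit: the import names (numpy, torch, cv2, tensorboardX) are the usual module names for these packages and are an assumption here, and the helper names are hypothetical.

```python
import importlib.util
import sys

# Assumption: the usual import names for the packages listed in Requirements.
REQUIRED = {"numpy": "Numpy", "torch": "PyTorch", "cv2": "OpenCV"}
OPTIONAL = {"tensorboardX": "tensorboard-pytorch"}


def missing_modules(modules):
    """Return the display names of modules that cannot be found."""
    return [
        label
        for name, label in modules.items()
        if importlib.util.find_spec(name) is None
    ]


def report():
    """Print a short summary of the current environment."""
    print("Python", sys.version.split()[0])
    for label in missing_modules(REQUIRED):
        print("missing (required):", label)
    for label in missing_modules(OPTIONAL):
        print("missing (optional):", label)
    # Only import torch if it is installed, then report GPU visibility.
    if importlib.util.find_spec("torch") is not None:
        import torch

        print("PyTorch", torch.__version__,
              "CUDA available:", torch.cuda.is_available())


if __name__ == "__main__":
    report()
```

The check only tests that each module can be located; it does not verify the specific versions listed above.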

Install

python setup.py install

To install only the Python dependencies, use the requirements file:

conda env create --file conda_requirements.yml

Usage

Input data must follow the folder structure data_tag/page, where images go in the data_tag folder and XML files in its page subfolder. For example:

mkdir -p data/{train,val,test,prod}/page;
tree data;

data
├── prod
│   ├── page
│   │   ├── prod_0.xml
│   │   └── prod_1.xml
│   ├── prod_0.jpg
│   └── prod_1.jpg
├── test
│   ├── page
│   │   ├── test_0.xml
│   │   └── test_1.xml
│   ├── test_0.jpg
│   └── test_1.jpg
├── train
│   ├── page
│   │   ├── train_0.xml
│   │   └── train_1.xml
│   ├── train_0.jpg
│   └── train_1.jpg
└── val
    ├── page
    │   ├── val_0.xml
    │   └── val_1.xml
    ├── val_0.jpg
    └── val_1.jpg
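
A quick way to catch layout mistakes before running the tool is to check that images and XML files pair up. The sketch below is not part of P2PaLA; it assumes, as in the tree above, that each image shares its base name with one XML file in the page subfolder. The function name and the list of image extensions are hypothetical choices for illustration.

```python
from pathlib import Path

# Assumption: accepted image extensions; the example above only uses .jpg.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".tif", ".tiff"}


def unpaired_files(data_dir):
    """Return (images without XML, XML files without image) as sorted base names."""
    data_dir = Path(data_dir)
    images = {
        p.stem
        for p in data_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    }
    page_dir = data_dir / "page"
    xmls = {p.stem for p in page_dir.glob("*.xml")} if page_dir.is_dir() else set()
    return sorted(images - xmls), sorted(xmls - images)


if __name__ == "__main__":
    for split in ("train", "val", "test", "prod"):
        split_dir = Path("data") / split
        if not split_dir.is_dir():
            continue
        no_xml, no_image = unpaired_files(split_dir)
        print(split, "images without XML:", no_xml, "XML without image:", no_image)
```

Two empty lists for a folder mean every image has a matching XML file and vice versa.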

Run the tool.

python P2PaLA.py --config config.txt --tr_data ./data/train --te_data ./data/test --log_comment "_foo"

❗ Pre-trained models available here

Use TensorBoard to visualize train status:

tensorboard --logdir ./work/runs

The resulting PAGE-XML files should be at "./work/results/test/"
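
To inspect a result file without opening an editor, the PAGE-XML can be read with Python's standard library. This is a minimal sketch, not part of the toolkit: it counts the TextRegion, TextLine and Baseline elements defined by the PAGE schema and ignores the XML namespace so it does not depend on a particular schema version. Which of these elements appear in a given result depends on how the tool was configured; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_page(xml_path):
    """Count layout elements in one PAGE-XML file."""
    root = ET.parse(xml_path).getroot()
    counts = {"TextRegion": 0, "TextLine": 0, "Baseline": 0}
    for elem in root.iter():
        name = local_name(elem.tag)
        if name in counts:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Path taken from the text above; adjust it if results are stored elsewhere.
    for xml_file in sorted(Path("./work/results/test/").glob("*.xml")):
        print(xml_file.name, summarize_page(xml_file))
```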

We recommend Transkribus or nw-page-editor to visualize and edit PAGE-XML files.

For details about arguments and the config file, see docs or run python P2PaLA.py -h.
For more detailed examples, see egs:
- Bozen dataset
- cBAD complex competition dataset
- OHG dataset

License

GNU General Public License v3.0. See LICENSE for the full text.

Acknowledgments

The code is inspired by pix2pix and pytorch-CycleGAN-and-pix2pix.


Owner

Lorenzo Quirós Díaz
