Layout Analysis Evaluator for the ICDAR 2017 competition on Layout Analysis for Challenging Medieval Manuscripts

Overview

LayoutAnalysisEvaluator

Layout Analysis Evaluator for:

Minimal usage: java -jar LayoutAnalysisEvaluator.jar -gt gt_image.png -p prediction_image.png

Parameters list: utility-name

 -gt,--groundTruth <arg>      Ground Truth image 
 -p,--prediction <arg>        Prediction image 
 -o,--original <arg>          (Optional) Original image, to be overlapped with the results visualization
 -j,--json <arg>              (Optional) Json Path, for the DIVAServices JSON output
 -out,--outputPath <arg>      (Optional) Output path (relative to prediction input path)                            
 -dv,--disableVisualization   (Optional)(Flag) Vsualizing the evaluation as image is NOT desired

Note: this also outputs a human-friendly visualization of the results next to the prediction_image.png which can be overlapped to the original image if provided with the parameter -overlap to enable deeper analysis.

Visualization of the results

Along with the numerical results (such as the Intersection over Union (IU), precision, recall,F1) the tool provides a human friendly visualization of the results. Additionally, when desired one can provide the original image and it will be overlapped with the visualization of the results. This is particularly helpful to understand why certain artifacts are created. The three images below represent the three steps: the original image, the visualization of the result and the two overlapped.

Alt text Alt text Alt text

Interpreting the colors

Pixel colors are assigned as follows:

  • GREEN: Foreground predicted correctly
  • YELLOW: Foreground predicted - but the wrong class (e.g. Text instead of Comment)
  • BLACK: Background predicted correctly
  • RED: Background mis-predicted as Foreground
  • BLUE: Foreground mis-predicted as Background

Example of problem hunting

Below there is an example supporting the usefulness of overlapping the prediction quality visualization with the original image. Focus on the red pixels pointed at by the white arrow: they are background pixels mis-classified as foreground. In the normal visualization (left) its not possible to know why would an algorithm decide that in that spot there is something belonging to foreground, as it is clearly far from regular text. However, when overlapped with the original image (right) one can clearly see that in this area there is an ink stain which could explain why the classification algorithm is deceived into thinking these pixel were foreground. This kind of interpretation is obviously not possible without the information provided by the original image like in (right).

Alt text Alt text

Ground Truth Format

The ground truth information needs to be a pixel-label image where the class information is encoded in the blue channel. Red and green channels should be set to 0 with the exception of the boundaries pixels used in the two competitions mentioned above.

For example, in the DIVA-HisDB dataset there are four different annotated classes which might overlap: main text body, decorations, comments and background.

In the pixel-label images the classes are encoded by RGB values as follows:

Red = 0 everywhere (except boundaries)
Green = 0 everywhere

Blue = 0b00...1000 = 0x000008: main text body
Blue = 0b00...0100 = 0x000004: decoration
Blue = 0b00...0010 = 0x000002: comment
Blue = 0b00...0001 = 0x000001: background (out of page)

Note that the GT might contain multi-class labeled pixels, for all classes except for the background. For example:

Blue = 0b...1000 | 0b...0010 = 0b...1010 = 0x00000A : main text body + comment  
Blue = 0b...1000 | 0b...0100 = 0b...1100 = 0x00000C : main text body + decoration
Blue = 0b...0010 | 0b...0100 = 0b...0110 = 0x000006 : comment + decoration

Citing us

If you use our software, please cite our paper as:

@inproceedings{alberti2017evaluation,
    address = {Kyoto, Japan},
    archivePrefix = {arXiv},
    arxivId = {1712.01656},
    author = {Alberti, Michele and Bouillon, Manuel and Ingold, Rolf and Liwicki, Marcus},
    booktitle = {2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
    doi = {10.1109/ICDAR.2017.311},
    eprint = {1712.01656},
    isbn = {978-1-5386-3586-5},
    month = {nov},
    pages = {43--47},
    title = {{Open Evaluation Tool for Layout Analysis of Document Images}},
    year = {2017}
}
You might also like...
Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

CellProfiler is a open-source application for biological image analysis
CellProfiler is a open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automatically.

Python-based tools for document analysis and OCR

ocropy OCRopus is a collection of document analysis programs, not a turn-key OCR system. In order to apply it to your documents, you may need to do so

Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled -
Pre-trained BERT Models for Ancient and Medieval Greek, and associated code for LaTeCH 2021 paper titled - "A Pilot Study for BERT Language Modelling and Morphological Analysis for Ancient and Medieval Greek"

Ancient Greek BERT The first and only available Ancient Greek sub-word BERT model! State-of-the-art post fine-tuning on Part-of-Speech Tagging and Mor

2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.
2nd solution of ICDAR 2021 Competition on Scientific Literature Parsing, Task B.

TableMASTER-mmocr Contents About The Project Method Description Dependency Getting Started Prerequisites Installation Usage Data preprocess Train Infe

1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection
1st Solution For ICDAR 2021 Competition on Mathematical Formula Detection

This project releases our 1st place solution on ICDAR 2021 Competition on Mathematical Formula Detection. We implement our solution based on MMDetection, which is an open source object detection toolbox based on PyTorch.

Code for the DH project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval Muslim World"

Damast This repository contains code developed for the digital humanities project "Dhimmis & Muslims – Analysing Multireligious Spaces in the Medieval

Layout Parser is a deep learning based tool for document image layout analysis tasks.
G-Research-Crypto-Competition - Project for passing the ML exam. Dataset took from the competition on the kaggle

G-Research-Crypto-Competition Project for passing the ML exam. Dataset took from

PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)
PyTorch code of my ICDAR 2021 paper Vision Transformer for Fast and Efficient Scene Text Recognition (ViTSTR)

Vision Transformer for Fast and Efficient Scene Text Recognition (ICDAR 2021) ViTSTR is a simple single-stage model that uses a pre-trained Vision Tra

Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021
Official implementation of SynthTIGER (Synthetic Text Image GEneratoR) ICDAR 2021

🐯 SynthTIGER: Synthetic Text Image GEneratoR Official implementation of SynthTIGER | Paper | Datasets Moonbin Yim1, Yoonsik Kim1, Han-cheol Cho1, Sun

Official implementation for ICDAR 2021 paper "Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer"

Handwritten Mathematical Expression Recognition with Bidirectionally Trained Transformer Description Convert offline handwritten mathematical expressi

A mathematica expression evaluator with PokemonTypes

A simple mathematical expression evaluator that uses Pokemon types to replace symbols.

The evaluator covering all of the metrics required by tasks within the DUE Benchmark.

DUE Evaluator The repository contains the evaluator covering all of the metrics required by tasks within the DUE Benchmark, i.e., set-based F1 (for KI

Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.
Excel-report-evaluator - A simple Python GUI application to aid with bulk evaluation of Microsoft Excel reports.

Excel Report Evaluator Simple Python GUI with Tkinter for evaluating Microsoft Excel reports (.xlsx-Files). Usage Start main.py and choose one of the

 Binance Smart Chain Contract Scraper + Contract Evaluator
Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy

Binance Smart Chain Contract Scraper + Contract Evaluator
Binance Smart Chain Contract Scraper + Contract Evaluator

Pulls Binance Smart Chain feed of newly-verified contracts every 30 seconds, then checks their contract code for links to socials.Returns only those with socials information included, and then submits the contract address to TokenSniffer to evaluate contract legitimacy

Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

IMGUR5K handwriting set. It is a handwritten in-the-wild dataset, which contains challenging real world handwritten samples from different writers.The dataset is shared as a set of image urls with annotations. This code downloads the images and verifies the hash to the image to avoid data contamination.
Releases(v1.0.0)
CellProfiler is a open-source application for biological image analysis

CellProfiler is a free open-source software designed to enable biologists without training in computer vision or programming to quantitatively measure phenotypes from thousands of images automaticall

CellProfiler 732 Dec 23, 2022
QED-C: The Quantum Economic Development Consortium provides these computer programs and software for use in the fields of quantum science and engineering.

Application-Oriented Performance Benchmarks for Quantum Computing This repository contains a collection of prototypical application- or algorithm-cent

SRI International 67 Nov 30, 2022
Detect textlines in document images

Textline Detection Detect textlines in document images Introduction This tool performs border, region and textline detection from document image data

QURATOR-SPK 70 Jun 30, 2022
A Python wrapper for the tesseract-ocr API

tesserocr A simple, Pillow-friendly, wrapper around the tesseract-ocr API for Optical Character Recognition (OCR). tesserocr integrates directly with

Fayez 1.7k Dec 31, 2022
Code release for our paper, "SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo"

SimNet: Enabling Robust Unknown Object Manipulation from Pure Synthetic Data via Stereo Thomas Kollar, Michael Laskey, Kevin Stone, Brijen Thananjeyan

68 Dec 14, 2022
かの有名なあの東方二次創作ソング、「bad apple!」のMVをPythonでやってみたって話

bad apple!! 内容 このプログラムは、bad apple!(feat. nomico)のPVをPythonを用いて再現しよう!という内容です。 実はYoutube並びにGithub上に似たようなプログラムがあったしなんならそっちの方が結構良かったりするんですが、一応公開しますw 使い方 こ

赤紫 8 Jan 05, 2023
Erosion and dialation using structure element in OpenCV python

Erosion and dialation using structure element in OpenCV python

Tamzid hasan 2 Nov 11, 2021
This is a implementation of CRAFT OCR method

This is a implementation of CRAFT OCR method

Esaka 0 Nov 01, 2021
1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge

SIIM-COVID19-Detection Source code of the 1st place solution for SIIM-FISABIO-RSNA COVID-19 Detection Challenge. 1.INSTALLATION Ubuntu 18.04.5 LTS CUD

Nguyen Ba Dung 170 Dec 21, 2022
Corner-based Region Proposal Network

Corner-based Region Proposal Network CRPN is a two-stage detection framework for multi-oriented scene text. It employs corners to estimate the possibl

xhzdeng 140 Nov 04, 2022
make a better chinese character recognition OCR than tesseract

deep ocr See README_en.md for English installation documentation. 只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整: git clone https://github.com/JinpengLI/deep

Jinpeng 1.5k Dec 28, 2022
Demo for the paper "Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation"

Streaming speaker diarization Overlap-aware low-latency online speaker diarization based on end-to-end local segmentation by Juan Manuel Coria, Hervé

Juanma Coria 185 Jan 01, 2023
learn how to use Gesture Control to change the volume of a computer

Volume-Control-using-gesture In this project we are going to learn how to use Gesture Control to change the volume of a computer. We first look into h

Diwas Pandey 49 Sep 22, 2022
graph learning code for ogb

The final code for OGB Installation Requirements: ogb=1.3.1 torch=1.7.0 torch-geometric=1.7.0 torch-scatter=2.0.6 torch-sparse=0.6.9 Baseline models T

PierreHao 20 Nov 10, 2022
Learn computer graphics by writing GPU shaders!

This repo contains a selection of projects designed to help you learn the basics of computer graphics. We'll be writing shaders to render interactive two-dimensional and three-dimensional scenes.

Eric Zhang 1.9k Jan 02, 2023
A small C++ implementation of LSTM networks, focused on OCR.

clstm CLSTM is an implementation of the LSTM recurrent neural network model in C++, using the Eigen library for numerical computations. Status and sco

Tom 794 Dec 30, 2022
Binarize document images

Binarization Binarization for document images Examples Introduction This tool performs document image binarization (i.e. transform colour/grayscale to

QURATOR-SPK 48 Jan 02, 2023
The open source extract transaction infomation by using OCR.

Transaction OCR Mã nguồn trích xuất thông tin transaction từ file scaned pdf, ở đây tôi lựa chọn tài liệu sao kê công khai của Thuy Tien. Mã nguồn có

Nguyen Xuan Hung 18 Jun 02, 2022
Deep LearningImage Captcha 2

滑动验证码深度学习识别 本项目使用深度学习 YOLOV3 模型来识别滑动验证码缺口,基于 https://github.com/eriklindernoren/PyTorch-YOLOv3 修改。 只需要几百张缺口标注图片即可训练出精度高的识别模型,识别效果样例: 克隆项目 运行命令: git cl

Python3WebSpider 117 Dec 28, 2022
TedEval: A Fair Evaluation Metric for Scene Text Detectors

TedEval: A Fair Evaluation Metric for Scene Text Detectors Official Python 3 implementation of TedEval | paper | slides Chae Young Lee, Youngmin Baek,

Clova AI Research 167 Nov 20, 2022