Total Text Dataset. It consists of 1555 images with more than 3 different text orientations (Horizontal, Multi-Oriented, and Curved), which makes it one of a kind.

Overview

Total-Text-Dataset (Official site)

Updated on April 29, 2020 (Detection leaderboard is updated - highlighted E2E methods. Thank you shine-lcy.)

Updated on March 19, 2020 (Query on the new groundtruth of test set)

Updated on Sept. 08, 2019 (New training groundtruth of Total-Text is now available)

Updated on Sept. 07, 2019 (Updated Guided Annotation toolbox for scene text image annotation)

Updated on Sept. 07, 2019 (Updated baseline to match our IJDAR paper)

Updated on August 01, 2019 (Extended version with new baseline + annotation tool is accepted at IJDAR)

Updated on May 30, 2019 (Important announcement on Total-Text vs. ArT dataset)

Updated on April 02, 2019 (Updated table ranking with default vs. our proposed DetEval)

Updated on March 31, 2019 (Faster version of DetEval.py with Python 3 support. Thank you princewang1994.)

Updated on March 14, 2019 (Updated table ranking with evaluation protocol info.)

Updated on November 26, 2018 (Table ranking is included for reference.)

Updated on August 24, 2018 (Newly added Guided Annotation toolbox folder.)

Updated on May 15, 2018 (Added groundtruth in '.txt' format.)

Updated on May 14, 2018 (Added feature - 'Do not care' candidates filtering is now available in the latest python scripts.)

Updated on April 03, 2018 (Added pixel level groundtruth)

Updated on November 04, 2017 (Added text level groundtruth)

Released on October 27, 2017

News

  • We received some questions regarding a new groundtruth for the test set of Total-Text. Here is an update. We are not releasing a new version of the test set groundtruth because

     1) there is no need to standardise the length of the groundtruth vertices for testing purposes, as this was proposed to facilitate training only, and
     2) a new version of the groundtruth would make the previous benchmarks irrelevant.

    Do contact us if you think there is a valid reason to require a new groundtruth for the test set, and we shall discuss it.

  • TOTAL-TEXT is a word-level English curved-text dataset. If you are interested in a text-line-based dataset with both English and Chinese instances, we highly recommend that you refer to SCUT-CTW1500. In addition, a Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT), which is extended from Total-Text and SCUT-CTW1500, was held at ICDAR2019 to stimulate more innovative ideas on the arbitrary-shaped text reading task. Congratulations to all winners and challengers. The technical report of ArT can be found at this https URL.

Important Announcement

Total-Text and SCUT-CTW1500 are now part of the training set of the largest curved text dataset - ArT (Arbitrary-Shaped Text dataset). To retain the validity of future benchmarking on the Total-Text dataset, the test-set images of Total-Text should be removed from the ArT dataset (the corresponding IDs are provided HERE) should one intend to leverage the extra training data from ArT. We count on the research community to perform this removal so that benchmarking remains fair.
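
The removal step can be sketched as below. This is a minimal illustration rather than an official script: the function name `filter_art_training_set`, the example paths, and the assumption that an image ID equals the file name without its extension are all hypothetical. The real IDs are those in the list referenced above.

```python
from pathlib import Path


def filter_art_training_set(art_image_paths, total_text_test_ids):
    """Return the ArT image paths whose ID is not in the exclusion list.

    Assumption (hypothetical): an image ID is the file name without extension.
    """
    excluded = {str(image_id).strip() for image_id in total_text_test_ids}
    return [p for p in art_image_paths if Path(p).stem not in excluded]


# Made-up example data; these are not real ArT IDs.
art_images = ["art/gt_1.jpg", "art/gt_2.jpg", "art/gt_3.jpg"]
ids_to_remove = ["gt_2"]

kept = filter_art_training_set(art_images, ids_to_remove)
print(kept)
```

Using a set for the excluded IDs keeps the lookup cheap even when the ArT training list is large.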

Table Ranking

  • The results from recent papers on Total-Text dataset are listed below where P=Precision, R=Recall & F=F-score.
  • If your result is missing or incorrect, please do not hesitate to contact us.
  • The baseline scores are based on our proposed [Poly-FRCNN-3] in this folder.
  • *Pascal VOC IoU metric; **Polygon Regression
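
As a quick sanity check on the P, R and F columns, the sketch below assumes that F is the standard harmonic mean of precision and recall. The README only says "F-score", so this is an assumption, and the helper name `f_score` is hypothetical. Some reported rows may not reproduce exactly from the rounded P and R values.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# CRAFTS row of the detection leaderboard: P=89.5, R=85.4
print(round(f_score(89.5, 85.4), 1))  # 87.4
```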

Detection Leaderboard

Method | Reported on paper (P R F) | DetEval (tp=0.4, tr=0.8) (Default) (P R F) | DetEval (tp=0.6, tr=0.7) (New Proposal) (P R F) | Published at
Our Baseline [paper] 78.0 68.0 73.0 - - - 78.0 68.0 73.0 IJDAR2020
CRAFTS [paper] 89.5 85.4 87.4 - - - - - - ECCV2020
#ASTS_Weakly-ResNet101 (E2E) [paper] - - 87.3 - - - - - - TIP2020
TextFuseNet [paper] 89.0 85.3 87.1 - - - - - - IJCAI2020
#Boundary (E2E) [paper] 88.9 85.0 87.0 - - - - - - AAAI2020
PolyPRNet [paper] 88.1 85.3 86.7 - - - - - - ACCV2020
#Qin et al. (E2E) [paper] 87.8 85.0 86.4 - - - - - - ICCV2019
100%Poly [paper] 88.2 83.3 85.6 - - - - - - arXiv:2012
ContourNet [paper] 86.9 83.9 85.4 - - - - - - CVPR2020
#Text Perceptron (E2E) [paper] 88.8 81.8 85.2 - - - - - - AAAI2020
PAN-640 [paper] 89.3 81.0 85.0 - - - - - - ICCV2019
DB-ResNet50 (800) [paper] 87.1 82.5 84.7 - - - - - - AAAI2020
TextCohesion [paper] 88.1 81.4 84.6 - - - - - - arXiv:1904
Feng et al. [paper] 87.3 81.1 84.1 - - - - - - IJCV2020
ReLaText [paper] 84.8 83.1 84.0 - - - - - - arXiv:2003
CRAFT [paper] 87.6 79.9 83.6 - - - - - - CVPR2019
LOMO MS [paper] 87.6 79.3 83.3 - - - - - - CVPR2019
SPCNet [paper] 83.0 82.8 82.9 - - - - - - AAAI2019
#ABCNet (E2E) [paper] 85.4 80.1 82.7 - - - - - - CVPR2020
ICG [paper] 82.1 80.9 81.5 - - - - - - PR2019
FTSN [paper] *84.7 *78.0 *81.3 - - - - - - ICPR2018
PSENet-1s [paper] 84.02 77.96 80.87 - - - - - - CVPR2019
¹TextField [paper] 81.2 79.9 80.6 76.1 75.1 75.6 83.0 82.0 82.5 TIP2019
#TextDragon (E2E) [paper] 85.6 75.7 80.3 - - - - - - ICCV2019
CSE [paper] 81.4 (**80.9) 79.7 (**80.3) 80.2 (**80.6) - - - - - - CVPR2019
MSR [paper] 85.2 73.0 78.6 82.7 68.3 74.9 81.4 72.5 76.7 arXiv:1901
ATTR [paper] 80.9 76.2 78.5 - - - - - - CVPR2019
TextSnake [paper] 82.7 74.5 78.4 - - - - - - ECCV2018
¹CTD [paper] 74.0 71.0 73.0 60.7 58.8 59.8 76.5 73.8 75.2 PR2019
#TextNet (E2E) [paper] 68.2 59.5 63.5 - - - - - - ACCV2018
#,²Mask TextSpotter (E2E) [paper] 69.0 55.0 61.3 68.9 62.5 65.5 82.5 75.2 78.6 ECCV2018
CENet [paper] 59.9 54.4 57.0 - - - - - - ACCV2018
#Textboxes (E2E) [paper] 62.1 45.5 52.5 - - - - - - AAAI2017
EAST [paper] 50.0 36.2 42.0 - - - - - - CVPR2017
SegLink [paper] 30.3 23.8 26.7 - - - - - - CVPR2017

Note:

# Framework that does end-to-end training (i.e. detection + recognition).

¹ For the results of TextField and CTD, improved versions of the methods from their original papers were used, which explains why the performance is better.

² For Mask-TextSpotter, the relatively poor performance reported in their paper was due to a bug in the input reading module (which was fixed recently). The authors were informed about this issue.
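
To illustrate how the two DetEval settings in the detection leaderboard differ, here is a minimal sketch of the one-to-one matching test. It reads tr as the threshold on area recall and tp as the threshold on area precision. It is a simplification under stated assumptions: it uses axis-aligned boxes `(x1, y1, x2, y2)` instead of the polygons annotated in Total-Text, it omits the one-to-many and many-to-one cases that the DetEval protocol also defines, and all function names are hypothetical. It is not the repository's DetEval.py.

```python
def area(box):
    """Area of an axis-aligned box (x1, y1, x2, y2); zero if degenerate."""
    x1, y1, x2, y2 = box
    return max(0.0, x2 - x1) * max(0.0, y2 - y1)


def intersection_area(a, b):
    """Area of the overlap between two axis-aligned boxes."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]),
               min(a[2], b[2]), min(a[3], b[3]))
    return area(overlap)


def one_to_one_match(gt, det, tr=0.8, tp=0.4):
    """True if the detection matches the groundtruth under thresholds tr, tp."""
    if area(gt) == 0 or area(det) == 0:
        return False
    inter = intersection_area(gt, det)
    area_recall = inter / area(gt)      # share of the groundtruth that is covered
    area_precision = inter / area(det)  # share of the detection that lies on text
    return area_recall >= tr and area_precision >= tp


gt = (0, 0, 10, 10)
det = (0, 0, 10, 20)  # covers the groundtruth fully, but half of it is background

print(one_to_one_match(gt, det, tr=0.8, tp=0.4))  # True: recall 1.0, precision 0.5
print(one_to_one_match(gt, det, tr=0.7, tp=0.6))  # False: precision 0.5 < 0.6
```

In this example the looser default precision threshold accepts a detection that is twice the size of the text region, while the proposed setting rejects it.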

End-to-end Recognition Leaderboard
(None refers to recognition without any lexicon; Full lexicon contains all words in test set.)

Method Backbone None (%) Full (%) FPS Published at
CRAFTS [paper] ResNet50-FPN 78.7 - - ECCV2020
MANGO [paper] ResNet50-FPN 72.9 83.6 4.3 AAAI2021
Text Perceptron [paper] ResNet50-FPN 69.7 78.3 - AAAI2020
ABCNet-MS [paper] ResNet50-FPN 69.5 78.4 6.9 CVPR2020
CharNet H-88 MS [paper] ResNet50-Hourglass57 69.2 - 1.2 ICCV2019
Qin et al. [paper] ResNet50-MSF 67.8 - - ICCV2019
ASTS_Weakly [paper] ResNet101-FPN 65.3 84.2 2.5 TIP2020
Boundary [paper] ResNet50-FPN 65.0 76.1 - AAAI2020
ABCNet [paper] ResNet50-FPN 64.2 75.7 17.9 CVPR2020
CAPNet [paper] ResNet50-FPN 62.7 - - ICASSP2020
Feng et al. [paper] VGG 55.8 79.2 - IJCV2020
TextNet [paper] ResNet50-SAM 54.0 - 2.7 ACCV2018
Mask TextSpotter [paper] ResNet50-FPN 52.9 71.8 4.8 ECCV2018
TextDragon [paper] VGG16 48.8 74.8 - ICCV2019
Textboxes [paper] ResNet50-FPN 36.3 48.9 1.4 AAAI2017

Description

To facilitate new research on text detection, we introduce the Total-Text dataset (IJDAR)(ICDAR-17 paper) (presentation slides), which is more comprehensive than the existing text datasets. Total-Text consists of 1555 images with more than 3 different text orientations (Horizontal, Multi-Oriented, and Curved), which makes it one of a kind.

Citation

If you find this dataset useful for your research, please cite

@article{CK2019,
  author    = {Chee Kheng Ch’ng and
               Chee Seng Chan and
               Chenglin Liu},
  title     = {Total-Text: Towards Orientation Robustness in Scene Text Detection},
  journal   = {International Journal on Document Analysis and Recognition (IJDAR)},
  volume    = {23},
  pages     = {31-52},
  year      = {2020},
  doi       = {10.1007/s10032-019-00334-z},
}

Feedback

Suggestions and opinions on this dataset (both positive and negative) are very welcome. Please contact the authors by email at chngcheekheng at gmail.com or cs.chan at um.edu.my.

License and Copyright

The project is open source under the BSD-3 license (see the LICENSE file).

For commercial use, please contact Dr. Chee Seng Chan at cs.chan at um.edu.my

©2017-2020 Center of Image and Signal Processing, Faculty of Computer Science and Information Technology, University of Malaya.
