CRAFT-Pyotorch:Character Region Awareness for Text Detection Reimplementation for Pytorch

Overview

CRAFT-Reimplementation

Note:If you have any problems, please comment. Or you can join us weChat group. The QR code will update in issues #49 .

Reimplementation:Character Region Awareness for Text Detection Reimplementation based on Pytorch

Character Region Awareness for Text Detection

Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee (Submitted on 3 Apr 2019)

The full paper is available at: https://arxiv.org/pdf/1904.01941.pdf

Install Requirements:

1、PyTroch>=0.4.1
2、torchvision>=0.2.1
3、opencv-python>=3.4.2
4、check requiremtns.txt
5、4 nvidia GPUs(we use 4 nvidia titanX)

pre-trained model:

NOTE: There are old pre-trained models, I will upload the new results pre-trained models' link.
Syndata:Syndata for baidu drive || Syndata for google drive
Syndata+IC15:Syndata+IC15 for baidu drive || Syndata+IC15 for google drive
Syndata+IC13+IC17:Syndata+IC13+IC17 for baidu drive|| Syndata+IC13+IC17 for google drive

Training

Note: When you train the IC15-Data or MLT-Data, please see the annotation in data_loader.py line 92 and line 108-112.

Train for Syndata

  • download the Syndata(I will give the link)
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainSyndata.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/synweights/synweights -> /your_path/real_weights)

  • Run python trainSyndata.py

Train for IC15 data based on Syndata pre-trained model

  • download the IC15 data, rename the image file and the gt file for ch4_training_images and ch4_training_localization_transcription_gt,respectively.
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainic15data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、/data/CRAFT-pytorch/real_weights -> /your_path/real_weights)

  • change the path in trainic15data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic15data.py

Train for IC13+17 data based on Syndata pre-trained model

  • download the MLT data, rename the image file and the gt file,respectively.
  • change the path in basernet/vgg16_bn.py file:

(/data/CRAFT-pytorch/vgg16_bn-6c64b313.pth -> /your_path/vgg16_bn-6c64b313.pth).You can download the model here.baidu||google

  • change the path in trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/SynthText -> /your_path/SynthText 2、savemodel path-> your savemodel path)

  • change the path in trainic-MLT_data.py file:

(1、/data/CRAFT-pytorch/1-7.pth -> /your_path/your_pre-trained_model_name 2、/data/CRAFT-pytorch/icdar1317 -> /your_ic15data_path/)

  • Run python trainic-MLT_data.py

If you want to train for weak supervised use our Syndate pre-trained model:

1、You should first download the pre_trained model trained in the Syndata baidu||google.
2、change the data path and pre-trained model path.
3、run python trainic15data.py

This code supprts for Syndata and icdar2015, and we will release the training code for IC13 and IC17 as soon as possible.

Methods dataset Recall precision H-mean
Syndata ICDAR13 71.93% 81.31% 76.33%
Syndata+IC15 ICDAR15 76.12% 84.55% 80.11%
Syndata+MLT(deteval) ICDAR13 86.81% 95.28% 90.85%
Syndata+MLT(deteval)(new gaussian map method) ICDAR13 90.67% 94.56% 92.57%
Syndata+IC15(new gaussian map method) ICDAR15 80.36% 84.25% 82.26%

We have released the latest code with new gaussian map and random crop algorithm.

Note:new gaussian map method can split the inference gaussian region score map
Sample:

Note:We have solved the problem about detecting big word. Now we are training the model. And any issues or advice are welcome.

Sample:

###weChat QR code

Contributing to the project

We will release training code as soon as possible, and we have not yet reached the results given in the author's paper. Any pull requests or issues are welcome. We also hope that you could give us some advice for the project.

Acknowledgement

Thanks for Youngmin Baek, Bado Lee, Dongyoon Han, Sangdoo Yun, Hwalsuk Lee excellent work and code for test. In this repo, we use the author repo's basenet and test code.

License

For commercial use, please contact us.

Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 02, 2023
YOLOv5 in DOTA with CSL_label.(Oriented Object Detection)(Rotation Detection)(Rotated BBox)

YOLOv5_DOTA_OBB YOLOv5 in DOTA_OBB dataset with CSL_label.(Oriented Object Detection) Datasets and pretrained checkpoint Datasets : DOTA Pretrained Ch

1.1k Dec 30, 2022
Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.

EasyOCR Ready-to-use OCR with 80+ languages supported including Chinese, Japanese, Korean and Thai. What's new 1 February 2021 - Version 1.2.3 Add set

Jaided AI 16.7k Jan 03, 2023
keras复现场景文本检测网络CPTN: 《Detecting Text in Natural Image with Connectionist Text Proposal Network》;欢迎试用,关注,并反馈问题...

keras-ctpn [TOC] 说明 预测 训练 例子 4.1 ICDAR2015 4.1.1 带侧边细化 4.1.2 不带带侧边细化 4.1.3 做数据增广-水平翻转 4.2 ICDAR2017 4.3 其它数据集 toDoList 总结 说明 本工程是keras实现的CPTN: Detecti

mick.yi 107 Jan 09, 2023
A python screen recorder for low-end computers, provides high quality video output.

RecorderX - v1.0 A screen recorder made in Python with the help of OpenCv, it has ability to record your screen in high quality. No matter what your P

Priyanshu Jindal 4 Nov 10, 2021
(CVPR 2021) ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

ST3D Code release for the paper ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection, CVPR 2021 Authors: Jihan Yang*, Shaoshu

CVMI Lab 224 Dec 28, 2022
This is a Computer vision package that makes its easy to run Image processing and AI functions. At the core it uses OpenCV and Mediapipe libraries.

CVZone This is a Computer vision package that makes its easy to run Image processing and AI functions. At the core it uses OpenCV and Mediapipe librar

CVZone 648 Dec 30, 2022
Face Recognizer using Opencv Python

Face Recognizer using Opencv Python The first step create your own dataset with file open-cv-create_dataset second step You can put the photo accordin

Han Izza 2 Nov 16, 2021
A machine learning software for extracting information from scholarly documents

GROBID GROBID documentation Visit the GROBID documentation for more detailed information. Summary GROBID (or Grobid, but not GroBid nor GroBiD) means

Patrice Lopez 1.9k Jan 08, 2023
A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV.

DcoumentScanner A document scanner application for laptops/desktops developed using python, Tkinter and OpenCV. Directly install the .exe file to inst

Harsh Vardhan Singh 1 Oct 29, 2021
Resizing Canny Countour In Python

Resizing_Canny_Countour Install Visual Studio Code , https://code.visualstudio.com/download Select Python and install with terminal( pip install openc

Walter Ng 1 Nov 07, 2021
Text page dewarping using a "cubic sheet" model

page_dewarp Page dewarping and thresholding using a "cubic sheet" model - see full writeup at https://mzucker.github.io/2016/08/15/page-dewarping.html

Matt Zucker 1.2k Dec 29, 2022
OCR engine for all the languages

Description kraken is a turn-key OCR system optimized for historical and non-Latin script material. kraken's main features are: Fully trainable layout

431 Jan 04, 2023
BD-ALL-DIGIT - This Is Bangladeshi All Sim Cloner Tools

BANGLADESHI ALL SIM CLONER TOOLS INSTALL TOOL ON TERMUX $ apt update $ apt upgra

MAHADI HASAN AFRIDI 2 Jan 19, 2022
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and

Peter M. Stahl 532 Dec 28, 2022
PAGE XML format collection for document image page content and more

PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri

PRImA Research Lab 46 Nov 14, 2022
Zoom , GoogleMeets에서 Vtuber 데뷔하기

EasyVtuber Facial landmark와 GAN을 이용한 Character Face Generation Google Meets, Zoom 등에서 자신만의 웹툰, 만화 캐릭터로 대화해보세요! 악세사리는 어느정도 추가해도 잘 작동해요! 안타깝게도 RTX 2070

Gunwoo Han 140 Dec 23, 2022
FOTS Pytorch Implementation

News!!! Recognition branch now is added into model. The whole project has beed optimized and refactored. ICDAR Dataset SynthText 800K Dataset detectio

Ning Lu 599 Dec 19, 2022
This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

EAST: An Efficient and Accurate Scene Text Detector Description: This version will be updated soon, please pay attention to this work. The motivation

Dejia Song 544 Dec 20, 2022
With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want.

Virtual Keyboard With the virtual keyboard, you can write on the real time images by combining the thumb and index fingers on the letter you want. At

Güldeniz Bektaş 5 Jan 23, 2022