Packaged, PyTorch-based, easy-to-use, cross-platform version of the CRAFT text detector

Last update: Dec 28, 2022

Overview

CRAFT: Character-Region Awareness For Text detection

PyTorch implementation of the CRAFT text detector, which detects text areas by exploring each character region and the affinity between characters. Text bounding boxes are obtained by finding minimum bounding rectangles on the binary map produced by thresholding the character region and affinity scores.
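The post-processing described above can be sketched in a few lines. This is a minimal, dependency-free illustration and not the package's actual implementation: `detect_boxes` is a hypothetical helper, the score maps are plain nested lists, and it returns axis-aligned boxes instead of rotated minimum-area rectangles. The threshold names mirror the `text_threshold`, `link_threshold` and `low_text` parameters shown under Advanced Usage; the way they are combined here is an assumption.

```python
from collections import deque


def detect_boxes(text_map, link_map, text_threshold=0.7, link_threshold=0.4, low_text=0.4):
    """Return axis-aligned boxes (x_min, y_min, x_max, y_max) from two score maps.

    Hypothetical helper for illustration; both maps are equally sized 2D lists.
    """
    height, width = len(text_map), len(text_map[0])
    # Binary map: a pixel is "on" if its region score or affinity score passes.
    binary = [
        [text_map[y][x] > low_text or link_map[y][x] > link_threshold for x in range(width)]
        for y in range(height)
    ]
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y0 in range(height):
        for x0 in range(width):
            if not binary[y0][x0] or seen[y0][x0]:
                continue
            # Flood-fill one 4-connected component of the binary map.
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            xs, ys, peak = [], [], 0.0
            while queue:
                x, y = queue.popleft()
                xs.append(x)
                ys.append(y)
                peak = max(peak, text_map[y][x])
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < width and 0 <= ny < height and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Keep the component only if some pixel is confidently text.
            if peak >= text_threshold:
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes


# Two characters at x=1 and x=3 joined by an affinity pixel at x=2,
# plus a weak isolated response at x=5 that is filtered out.
text_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.9, 0.1, 0.8, 0.0, 0.5, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
link_map = [
    [0.0] * 7,
    [0.0, 0.0, 0.6, 0.0, 0.0, 0.0, 0.0],
    [0.0] * 7,
]
print(detect_boxes(text_map, link_map))  # [(1, 1, 3, 1)]
```

The affinity map is what merges neighbouring characters into one word-level box; without it, the example would yield two separate single-pixel regions.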

Getting started

Installation

Install using conda for Linux, Mac and Windows (note that release 0.4.0 dropped conda support, so conda may only offer older versions):

conda install -c fcakyon craft-text-detector

Install using pip for Linux and Mac:

pip install craft-text-detector

Basic Usage

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()
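Once detection has run, the returned boxes usually need to be ordered before they are passed to a recognizer. The sketch below assumes that each entry of `prediction_result["boxes"]` is a sequence of (x, y) corner points; this README does not state the exact layout, so treat that as an assumption. `to_rect`, `reading_order` and `line_tolerance` are hypothetical names introduced only for this illustration.

```python
def to_rect(box):
    """Collapse a polygon given as (x, y) points into (x_min, y_min, x_max, y_max)."""
    xs = [float(point[0]) for point in box]
    ys = [float(point[1]) for point in box]
    return (min(xs), min(ys), max(xs), max(ys))


def reading_order(boxes, line_tolerance=10.0):
    """Sort boxes top-to-bottom, then left-to-right within each text line.

    Boxes whose top edge lies within `line_tolerance` pixels of the first box
    of the current line are treated as belonging to that line.
    """
    rects = sorted((to_rect(box) for box in boxes), key=lambda rect: rect[1])
    lines = []
    for rect in rects:
        if lines and abs(rect[1] - lines[-1][0][1]) <= line_tolerance:
            lines[-1].append(rect)
        else:
            lines.append([rect])
    return [rect for line in lines for rect in sorted(line, key=lambda rect: rect[0])]


# Stand-in data shaped like the assumed "boxes" entry (four corners per box).
sample_boxes = [
    [(120, 12), (200, 12), (200, 40), (120, 40)],
    [(10, 100), (90, 100), (90, 130), (10, 130)],
    [(10, 10), (100, 10), (100, 38), (10, 38)],
]
for rect in reading_order(sample_boxes):
    print(rect)
```

With real output, `sample_boxes` would be replaced by `prediction_result["boxes"]`; a fixed pixel tolerance is a simple choice that may need tuning for large or skewed text.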

Advanced Usage

# import craft functions
from craft_text_detector import (
    read_image,
    load_craftnet_model,
    load_refinenet_model,
    get_prediction,
    export_detected_regions,
    export_extra_results,
    empty_cuda_cache
)

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# read image
image = read_image(image_path)

# load models
refine_net = load_refinenet_model(cuda=True)
craft_net = load_craftnet_model(cuda=True)

# perform prediction
prediction_result = get_prediction(
    image=image,
    craft_net=craft_net,
    refine_net=refine_net,
    text_threshold=0.7,
    link_threshold=0.4,
    low_text=0.4,
    cuda=True,
    long_size=1280
)

# export detected text regions
exported_file_paths = export_detected_regions(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    output_dir=output_dir,
    rectify=True
)

# export heatmap, detection points, box visualization
export_extra_results(
    image_path=image_path,
    image=image,
    regions=prediction_result["boxes"],
    heatmaps=prediction_result["heatmaps"],
    output_dir=output_dir
)

# unload models from gpu
empty_cuda_cache()
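The `long_size` argument above controls how large the image is when it reaches the network. A rough sketch of the arithmetic, under two assumptions that this README does not confirm: the image is scaled so that its longer side equals `long_size` while keeping the aspect ratio, and each side is then padded up to a multiple of 32. `target_size` and `multiple` are hypothetical names, and integer arithmetic is used here for predictability.

```python
def target_size(height, width, long_size=1280, multiple=32):
    """Return ((resized_h, resized_w), (padded_h, padded_w)) for an input image.

    Hypothetical helper: scales the longer side to `long_size`, then rounds
    each side up to the next multiple of `multiple`.
    """
    longer = max(height, width)
    resized_h = height * long_size // longer
    resized_w = width * long_size // longer
    # -n % m is the amount needed to reach the next multiple of m (0 if aligned).
    padded_h = resized_h + (-resized_h % multiple)
    padded_w = resized_w + (-resized_w % multiple)
    return (resized_h, resized_w), (padded_h, padded_w)


print(target_size(1080, 1920))  # ((720, 1280), (736, 1280))
print(target_size(500, 300))    # ((1280, 768), (1280, 768))
```

A larger `long_size` tends to help with small text at the cost of memory and inference time, so it is worth tuning together with the thresholds.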


Comments

Add more options for detect_text method

Hi, sometimes I don't want to run detect_text on a file; I want to run it directly on an image in ndarray format, which saves I/O time. So I contributed this. Thanks for your work.

opened by ducviet00

Enable package to load model from local path

When using the PyPI package, it should be possible to load a model from a local path, because loading it from a remote location removes control over which model is currently used, and might also result in pull limits being reached.

enhancement

opened by TanjaBayer

Fix #8 - Fixing cuda issues in basic usage text detection

Fixing issue #8.

In this quick fix I referenced craft_net as a global variable. If this is not an acceptable workaround, then consider reorganizing the structure of the code.

Have a nice day :)

opened by gaborpelesz

accept customized weights path when loading models

The path for the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")

opened by fcakyon

Releases(0.4.3)

0.4.3(May 9, 2022)
What's Changed

Enable package to load model from local path by @TanjaBayer in https://github.com/fcakyon/craft-text-detector/pull/53

New Contributors

@TanjaBayer made their first contribution in https://github.com/fcakyon/craft-text-detector/pull/53

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.2...0.4.3

0.4.2(Jan 6, 2022)
What's Changed

fix opencv version by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/48

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.1...0.4.2

0.4.1(Dec 20, 2021)
What's Changed

fix crop export by @fcakyon in https://github.com/fcakyon/craft-text-detector/pull/45

Full Changelog: https://github.com/fcakyon/craft-text-detector/compare/0.4.0...0.4.1

0.4.0(Jul 30, 2021)
enhancement

fix boxes outside image boundaries (#37)

breaking changes

drop conda support, update python version (#38)

0.3.5(May 12, 2021)
Rebuild conda binaries.


0.3.4(Apr 7, 2021)

add support for PIL and numpy images in addition to filepath. https://github.com/fcakyon/craft-text-detector/pull/28

from PIL import Image
import numpy

# image can be any one of: a file path, a PIL image or a numpy array
image = 'figures/idcard.png'
image = Image.open("figures/idcard.png")
image = numpy.array(Image.open("figures/idcard.png"))

# apply craft text detection
prediction_result = craft.detect_text(image)


0.3.3(Mar 2, 2021)
Relax requirements for OpenCV (#25)


0.3.2(Mar 2, 2021)

path for the weight file can be specified by:

load_craftnet_model(weight_path="path/to/weight")

load_refinenet_model(weight_path="path/to/weight")
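One way to use `weight_path` in practice is to prefer a local weight file when one exists. The helper below is a hypothetical sketch: `first_existing` is not part of the package, and the assumption that passing no `weight_path` lets the loader fall back to its default weights is not confirmed by these notes.

```python
from pathlib import Path


def first_existing(candidates):
    """Return the first candidate that exists as a file, or None if none do."""
    for candidate in candidates:
        path = Path(candidate).expanduser()
        if path.is_file():
            return path
    return None


# Placeholder path, as in the release note above; replace with a real location.
weight_path = first_existing(["path/to/weight"])
print(weight_path)

# Assumed usage with the loader named in the release note:
# if weight_path is not None:
#     craft_net = load_craftnet_model(weight_path=str(weight_path))
```

Resolving the path up front makes it explicit which weights are used, which is the control the local-path feature was requested for.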


v0.3.1(May 14, 2020)
fix empty_cuda_cache


v0.3.0(May 14, 2020)

Updated basic usage for better device handling: a Craft instance must now be created before calling detect_text:

# import Craft class
from craft_text_detector import Craft

# set image path and export folder directory
image_path = 'figures/idcard.png'
output_dir = 'outputs/'

# create a craft instance
craft = Craft(output_dir=output_dir, crop_type="poly", cuda=False)

# apply craft text detection and export detected regions to output directory
prediction_result = craft.detect_text(image_path)

# unload models from ram/gpu
craft.unload_craftnet_model()
craft.unload_refinenet_model()

some internal naming and styling changes


v0.2.1(May 10, 2020)
fix cuda device bug

fix visualization export bug

v0.2.0a(Apr 22, 2020)

v0.2.0(Apr 22, 2020)
time profiling

better input size handling (with new long_size parameter)

bug fixes


Owner

Senior Machine Learning Engineer, METU & Bilkent alum.
