OCR engine for all the languages

Last update: Jan 04, 2023

Overview

Description

https://travis-ci.org/mittagessen/kraken.svg?branch=master

kraken is a turn-key OCR system optimized for historical and non-Latin script material.

kraken's main features are:

Fully trainable layout analysis and character recognition

Right-to-Left, BiDi, and Top-to-Bottom script support

ALTO, PageXML, abbyXML, and hOCR output

Word bounding boxes and character cuts

Multi-script recognition support

Public repository of model files

Lightweight model files

Variable recognition network architectures

Installation

When using a recent version of pip all dependencies will be installed from binary wheel packages, so installing build-essential or your distributions equivalent is often unnecessary. kraken only runs on Linux or Mac OS X. Windows is not supported.

Install the latest development version through conda:

$ wget https://raw.githubusercontent.com/mittagessen/kraken/master/environment.yml
$ conda env create -f environment.yml

or:

$ wget https://raw.githubusercontent.com/mittagessen/kraken/master/environment_cuda.yml
$ conda env create -f environment_cuda.yml

for CUDA acceleration with the appropriate hardware.

It is also possible to install the latest stable release from pypi:

$ pip install kraken

Finally you'll have to scrounge up a model to do the actual recognition of characters. To download the default model for printed English text and place it in the kraken directory for the current user:

$ kraken get 10.5281/zenodo.2577813

A list of libre models available in the central repository can be retrieved by running:

$ kraken list

Quickstart

Recognizing text on an image using the default parameters including the prerequisite steps of binarization and page segmentation:

$ kraken -i image.tif image.txt binarize segment ocr

To binarize a single image using the nlbin algorithm:

$ kraken -i image.tif bw.png binarize

To segment an image (binarized or not) with the new baseline segmenter:

$ kraken -i image.tif lines.json segment -bl

To segment and OCR an image using the default model(s):

$ kraken -i image.tif image.txt segment -bl ocr

All subcommands and options are documented. Use the help option to get more information.

Documentation

Have a look at the docs

Funding

kraken is developed at the École Pratique des Hautes Études, Université PSL.

Comments

Training for Devanagari

I am trying to build a Devanagari model using kraken. When I use default values for training it works but when I specify training and eval data separately, I get a codec error.

The following uses the same set of trainingdata.

This worked:

ketos train devatrain/*.png > devatrain.log

WARNING: Logging before flag parsing goes to stderr.
W0218 04:40:21.871934 127598883522176 __init__.py:74] TensorFlow version 1.15.0 detected. Last version known to be fully compatible is 1.14.0 .
Initializing model ✓

This gets the error:

ketos -v  train -t devatrain/*.png -e devatrain/*.png -o devatraintest > devatraintest.log

Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/envs/py36/bin/ketos", line 8, in <module>
    sys.exit(cli())
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 1135, in invoke
    sub_ctx = cmd.make_context(cmd_name, args, parent=ctx)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 641, in make_context
    self.parse_args(ctx, args)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 940, in parse_args
    value, args = param.handle_parse_result(ctx, opts, args)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 1477, in handle_parse_result
    self.callback, ctx, self, value)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/click/core.py", line 96, in invoke_param_callback
    return callback(ctx, param, value)
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/site-packages/kraken/ketos.py", line 63, in _validate_manifests
    for entry in manifest.readlines():
  File "/home/ubuntu/anaconda3/envs/py36/lib/python3.6/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte

opened by Shreeshrii 22

New training code is eating memory

I'm trying to train Japanese OCR model using 2637 images with PAGE xml files. ketos train -d cuda:0 -f page -F 1 -q early --augment -o jpn *.xml

With 3.0b5 ketos consumes about 2.1GB memory and it is reasonable. With master(patch https://github.com/mittagessen/kraken/pull/199 is needed to run) ketos is eating more than 30GB memory and hangs.

It seems all images are stored in memory. Is there any option not to do that?

opened by eighttails 21
Docs for the segmentation training format

Hey there, I'd like to train a segmenter before leaving next week, and I understood you will release it soon. Any place where I can find the segmenter training format ?

opened by PonteIneptique 17
Training kraken and RTL support?

@amitdo commented here on the specific RTL support in kraken. Since I am unsucessfully training OCR models for Hebrew with ocropy, I wonder if kraken could do the job. Can anyone introduce me to the details of kraken's RLT support? I could not find the related information in the documentation. Many thanks in advance!

opened by wrznr 17
Only one model in the repository?

I just installed kraken and kraken list gives only 10.5281/zenodo.2577813 (pytorch) - A generalized model for English printed text. Where are other models? I'm especially interested in Medieval Latin.

opened by jsbien 16

Error when i run kraken segment ocr

I ran the below command and receive an error as below.

This happens in both python3.6.8 and python3.7.2

 kraken -i bank.png bank.json segment  --remove_hlines --no-script-detect --scale 20 --pad 100 23 ocr --model en-default.pronn 
Loading RNN default	✓
Segmenting	✓
Traceback (most recent call last):
  File "/home/ram/code/lendsmart/py/kraken/venv/bin/kraken", line 10, in <module>
    sys.exit(cli())
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/core.py", line 1164, in invoke
    return _process_result(rv)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/core.py", line 1102, in _process_result
    **ctx.params)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/kraken.py", line 220, in process_pipeline
    task(base_image=base_image, input=input, output=output)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/kraken.py", line 157, in recognizer
    for pred in bar:
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/click/_termui_impl.py", line 285, in generator
    for rv in self.iter:
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/rpred.py", line 301, in rpred
    preds = network.predict(line)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/lib/models.py", line 81, in predict
    o = self.forward(line)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/lib/models.py", line 69, in forward
    o = self.nn.nn(line)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/torch/nn/modules/container.py", line 92, in forward
    input = module(input)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/kraken/lib/layers.py", line 350, in forward
    o, _ = self.layer(inputs)
  File "/home/ram/code/lendsmart/py/kraken/venv/lib/python3.6/site-packages/torch/nn/modules/module.py", line 489, in __call__
    result = self.forward(*input, **kwargs)
TypeError: forward() missing 1 required positional argument: 'hidden'

Here is my pip list in venv

screenshot from 2019-01-28 00-49-37

Let me know if i missed something.

Is my level of pips correct ?

Should i use a different level for torch ?

opened by kishoreneelamegam 16

Bad Credentials

Hello, I tried to install kraken using pip3 and everything went fine, but I cannot use it. As soon as I try to get default, I have the following error message

$ kraken get default
Retrieving model	⣾Traceback (most recent call last):
  File "/usr/local/bin/kraken", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 1092, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.6/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/kraken/kraken.py", line 346, in get
    partial(spin, 'Retrieving model'))
  File "/usr/local/lib/python3.6/site-packages/kraken/repo.py", line 46, in get_model
    raise KrakenRepoException(resp['message'])
kraken.lib.exceptions.KrakenRepoException: Bad credentials

I did remove kraken and tried with a pip2 install, but the result remains the same :

$ kraken get default
Retrieving model	⣾Traceback (most recent call last):
  File "/usr/local/bin/kraken", line 11, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 1092, in invoke
    rv.append(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python2.7/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/kraken/kraken.py", line 346, in get
    partial(spin, 'Retrieving model'))
  File "/usr/local/lib/python2.7/site-packages/kraken/repo.py", line 46, in get_model
    raise KrakenRepoException(resp['message'])
kraken.lib.exceptions.KrakenRepoException: Bad credentials

What did I miss ? (I use a macOS 10.12.6, python 2.7 or 3.6)

opened by loranger 16

`Failed processing image.png: ndim`

When I run the following command:

kraken -i OCR17plus/Data/Balzac1624_Lettres_btv1b86262420_corrected/png/Balzac1624_Lettres_btv1b86262420_corrected_0042.png results.txt segment -bl -i OCR17plus/Model/Segment/appenzeller.mlmodel ocr -m OCR17plus/Model/HTR/dentduchat.mlmodel

I have the following issue:

[37.7261] Failed processing OCR17plus/Data/Balzac1624_Lettres_btv1b86262420_corrected/png/Balzac1624_Lettres_btv1b86262420_corrected_0042.png: ndim

The data used is available here: https://github.com/Heresta/OCR17plus

Any idea what the problem could be?

opened by gabays 15

Ketos segtrain applying topline when asking for baseline and vice-versa ?
Hey there,

I can be wrong, but I have had weird behavior training the segmenter. I believe there is a mistake in the Click definition of the segmenter, but again, I can be wrong...

In the click, you got -bl/-tl:

https://github.com/mittagessen/kraken/blob/45f6bb0b1b1632de6077e6712f11d5461ddad63a/kraken/ketos.py#L158-L161

This value is of baseline (True or False) is passed down

https://github.com/mittagessen/kraken/blob/45f6bb0b1b1632de6077e6712f11d5461ddad63a/kraken/ketos.py#L260

to the topline kwarg and can be reused here:

https://github.com/mittagessen/kraken/blob/abe08ba4bec3778cf9e15c5cea7c0de503358a61/kraken/lib/segmentation.py#L440-L444

In this excerpt, it shows that topline=True will use topline (which makes sense).

If you create a file simply containing the same click declaration, such as :

import click @click.command() @click.option('-bl/-tl', '--baseline/--topline', show_default=True, default=False, help='Switch for the baseline location in the scripts. ' 'Set to topline if the data is annotated with a hanging baseline, as is ' 'common with Hebrew, Bengali, Devanagari, etc.') def bl(baseline): print(baseline) if __name__ == "__main__": bl()

you'll see that -bl gets makes baseline=True which will be directly used as topline=True. On the contrary, using -tl will set it to False. Basically, it seems to be the opposite of what is expected.
opened by PonteIneptique 15
Segmenter proposes creative segmentations

Hey, To follow up on https://github.com/mittagessen/kraken/issues/256, I felt like a new issue was in order, as the first was targeted at the inversion of -bl and -tl.

Before today, training and segmenting would provide a rather good segmentation but strike-through:

Since the commits of today, the segmentation is completely all over the place, using the same training set, eval set and command:

@gabays has seen the same issue

opened by PonteIneptique 14
Multi-process/thread Dataset building ?

Hi there :) I was wondering if it'd be possible to boost a little the speed of building training set / valid set ? I looked at it and it seems quite "sequential"

https://github.com/mittagessen/kraken/blob/d39c45564df81d84bea58ee0067a48585a5f63e2/kraken/lib/train.py#L624-L632

I could take a chance at it if you do not have the time, and if you give me pointer on how you'd like it to be done (proxy function / reusing --threads, etc. )

opened by PonteIneptique 14
install contradiction

mamba env create -f environment_cuda.yml --> leaves me with numpy 1.19.5 which in turn gives an error "RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd" my attempts to update numpy failed as pip says : ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. qudida 0.0.4 requires opencv-python-headless>=4.0.1, which is not installed. albumentations 1.3.0 requires opencv-python-headless>=4.1.1, which is not installed. kraken 4.2.1.dev84 requires numpy<=1.23.0, but you have numpy 1.24.1 which is incompatible. coremltools 4.1 requires numpy<1.20,>=1.14.5, but you have numpy 1.24.1 which is incompatible.

opened by dstoekl 0
Windows

Is there any possibility to run it in Windows? Can Ubuntu be installed from Windows Store and then be used?

Does this model "Kraken:arabPersPrBigMixed_best(1)" ring a bell? If yes, from where to download it? More details in this video in minute 05:21 in the Open ITI project

Thanks,

Medo Hamdani

opened by MedoHamdani 0
in training phase ketos train repeats warning for unicode points not in training data.

WARNING Non-encodable sequence ︎◻ ךו... encountered. Advancing one code point. codec.py:131 etc

this is repeated for every epoch

suggestion: suppress this as the difference of training and testing codec already has a separate yellow warning before launch of training.

opened by dstoekl 0

`ketos train` repeats validation in a loop if early stopping comes too early

The training was started with at least 200 epochs and 20 tries to get a better model:

ketos train -f page -t list.train -e list.eval -o Juristische_Konsilien_Tuebingen+256 -d cuda:0 --augment --workers 24 -r 0.0001 -B 1 --min-epochs 200 --lag 20 -w 0 -s '[256,64,0,1 Cr4,2,8,4,2 Cr4,2,32,1,1 Mp4,2,4,2 Cr3,3,64,1,1 Mp1,2,1,2 S1(1x0)1,3 Lbx256 Do0.5 Lbx256 Do0.5 Lbx256 Do0.5 Cr255,1,85,1,1]'

Early stopping would have stopped after stage 111, but training continues because at least 200 was requested. Instead of producing stage 112, 113, 114, ..., it stays at stage 112 and repeats the validation step again and again:

stage 109/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7366/7366 0:00:00 0:05:14 val_accuracy: 0.87676  early_stopping: 18/20 0.87974
stage 110/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7366/7366 0:00:00 0:05:18 val_accuracy: 0.87542  early_stopping: 19/20 0.87974
stage 111/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7366/7366 0:00:00 0:05:18 val_accuracy: 0.87760  early_stopping: 20/20 0.87974
stage 112/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0/7366 -:--:-- 0:00:00  early_stopping: 20/20 0.87974Trainer was signaled to stop but the required `min_epochs=200` or `min_steps=None` has not been met. Training will continue...
stage 112/∞ ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11802/7366 0:00:00 0:08:02 val_accuracy: 0.87345  early_stopping: 20/20 0.87974
Validation  ━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 223/826    0:00:40 0:00:16

opened by stweil 6

Kraken user-defined metadata keep the base model accuracy scores in the logs

Hey, I just discovered that when you fine tune a model, the new model keeps the logs from the base model, such that it has the dev score from the first model in model.user_defined_metadata["kraken_meta"])["accuracy"]

Is it intended ? :)

opened by PonteIneptique 0

Releases(4.1.2)

4.1.2(Jun 7, 2022)
Commits

3e10158: set border value in erosion in seamcarve (Benjamin Kiessling)

Source code(tar.gz)
Source code(zip)
kraken-4.1.2-0.tar-7a0c161677f48a303369418514853137.bz2(5.42 MB)
kraken-4.1.2-0.tar-ebf6d89d3b371f083bce070356b96ae8.bz2(5.42 MB)
kraken-4.1.2-0.tar.bz2(5.41 MB)
kraken-4.1.2-py3-none-any.whl(5.17 MB)
kraken-4.1.2.tar.gz(11.19 MB)
3.0.6(Nov 8, 2021)
This is mainly a bugfix release containing small improvements such as additional tests, typing, spelling corrections, additional contrib scripts, and fixes for rarely used functionality.

Bugfixes

Orthography and missing help messages in the CLI drivers

Documentation for batch input specifications

Fix a regression in early stopping when training on GPU

Fix a regression in polygonization in the presence of regions

Do not duplicate regions during serialization

Add dummy String beneath TextLine w/o text in ALTO to avoid standard-violating empty TextLines

The codec loading functionality of ketos train and KrakenTrainer actually loads a given codec now.

Fall back to simple scaling when centerline dewarping fails

Drop (duplicate) short option form -p for --pad in all ketos commands

Features

The forced alignment script contrib/forced_alignment_overlay.py now preserves the input file and only replaces the character cuts.

Add reading order tests

Explicit model sanity checks in blla.segment()

Add baseline offset options to repolygonization script

Make codec self-synchronizing

Add TextEquiv for Word and TextLine in PAGE XML output

Raise PIL image size limit to 20k*20k image dimensions

Source code(tar.gz)
Source code(zip)
kraken-3.0.6-py3-none-any.whl(5.18 MB)
kraken-3.0.6-py36_0.tar.bz2(5.41 MB)
kraken-3.0.6-py37_0.tar.bz2(5.41 MB)
kraken-3.0.6-py38_0.tar.bz2(5.41 MB)
kraken-3.0.6-py39_0.tar.bz2(5.41 MB)
kraken-3.0.6.tar.gz(10.69 MB)

Owner

GitHub Repository http://kraken.re

📷 This repository is focused on having various feature implementation of OpenCV in Python.

📷 This repository is focused on having various feature implementation of OpenCV in Python. The aim is to have a minimal implementation of all OpenCV features together, under one roof.

128 Dec 04, 2022

This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

Merantix-Labs: DAAIN This is the code for our paper DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows which can be found at

14 Oct 12, 2022

A simple document layout analysis using Python-OpenCV

Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen

109 Dec 12, 2022

Creating a virtual tv using opencv in python3.

Virtual-TV Creating a virtual tv using opencv in python3. In order to run the code follow the below given steps: Make sure the desired videos which ar

1 Jan 01, 2022

Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation

This is the official implementation of "Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation". For more details, please

309 Dec 06, 2022

Code for paper "Role-based network embedding via structural features reconstruction with degree-regularized constraint"

Role-based network embedding via structural features reconstruction with degree-regularized constraint Train python main.py --dataset brazil-flights

1 Jun 28, 2022

Handwriting Recognition System based on a deep Convolutional Recurrent Neural Network architecture

Handwriting Recognition System This repository is the Tensorflow implementation of the Handwriting Recognition System described in Handwriting Recogni

346 Jan 07, 2023

Pixie - A full-featured 2D graphics library for Python

Pixie - A full-featured 2D graphics library for Python Pixie is a 2D graphics library similar to Cairo and Skia. pip install pixie-python Features: Ty

65 Dec 30, 2022

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf # it's a scriptable c

7.9k Jan 03, 2023

OpenCVを用いたカメラキャリブレーションのサンプルです。2021/06/21時点でPython実装のある3種類(通常カメラ向け、魚眼レンズ向け(fisheyeモジュール)、全方位カメラ向け(omnidirモジュール))について用意しています。

OpenCV-CameraCalibration-Example FishEyeCameraCalibration.mp4 OpenCVを用いたカメラキャリブレーションのサンプルです 2021/06/21時点でPython実装のある以下3種類について用意しています。通常カメラ向け魚眼レンズ向け(

34 Nov 17, 2022

Camera Intrinsic Calibration and Hand-Eye Calibration in Pybullet

This repository is mainly for camera intrinsic calibration and hand-eye calibration. Synthetic experiments are conducted in PyBullet simulator. 1. Tes

7 Oct 03, 2022

PianoVisuals - Create background videos synced with piano music using opencv

Steps Record piano video Use Neural Network to do body segmentation (video matti

4 Jan 24, 2022

Opencv face recognition desktop application

Opencv-Face-Recognition Opencv face recognition desktop application Program developed by Gustavo Wydler Azuaga - 2021-11-19 Screenshots of the program

1 Nov 19, 2021

PAGE XML format collection for document image page content and more

PAGE-XML PAGE XML format collection for document image page content and more For an introduction, please see the following publication: http://www.pri

46 Nov 14, 2022

FOTS Pytorch Implementation

News!!! Recognition branch now is added into model. The whole project has beed optimized and refactored. ICDAR Dataset SynthText 800K Dataset detectio

599 Dec 19, 2022

Text Detection from images using OpenCV

EAST Detector for Text Detection OpenCV’s EAST(Efficient and Accurate Scene Text Detection ) text detector is a deep learning model, based on a novel

88 Oct 20, 2022

基于Paddle框架的PSENet复现

PSENet-Paddle 基于Paddle框架的PSENet复现本项目基于paddlepaddle框架复现PSENet，并参加百度第三届论文复现赛，将在2021年5月15日比赛完后提供AIStudio链接～敬请期待 AIStudio链接参考项目： whai362-PSENet 环境配置本项目

4 Apr 24, 2022

Optical character recognition for Japanese text, with the main focus being Japanese manga

Manga OCR Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Tran

327 Jan 01, 2023

Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 02, 2023

This repository contains the code for the paper "SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks"

SCANimate: Weakly Supervised Learning of Skinned Clothed Avatar Networks (CVPR 2021 Oral) This repository contains the official PyTorch implementation

235 Dec 18, 2022