Optical Character Recognition + Instance Segmentation for russian and english languages

Last update: Dec 19, 2022

Overview

Распознавание рукописного текста в школьных тетрадях

Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS.

Результаты Public

Задача

Вам нужно разработать алгоритм, который способен распознать рукописный текст в школьных тетрадях. В качестве входных данных вам будут предоставлены фотографии целых листов. Предсказание модели — список распознанных строк с координатами полигонов и получившимся текстом.

Как должно работать решение?

Последовательность двух моделей: сегментации и распознавания. Сначала сегментационная модель предсказывает полигоны маски каждого слова на фото. Затем эти слова вырезаются из изображения по контуру маски (получаются кропы на каждое слово) и подаются в модель распознавания. В итоге получается список распознанных слов с их координатами.

Модели

Instance Segmentation

модель X101-FPN из зоопарка моделей detectron2 + аугментации + высокое разрешение

Optical Character Recognition (OCR)

архитектура CRNN с бекбоном Resnet-34, предобученным на топ 1 модели соревнования Digital Peter

Beam Search

модель KenLM, обученная на данных сорвенования Feedback, Решу ОГЭ/ЕГЭ, а также CTCDecoder

Ресурсы & Submit

Christofari с NVIDIA Tesla V100 и образом jupyter-cuda10.1-tf2.3.0-pt1.6.0-gpu:0.0.82

Мы не гарантируем поддержку сабмита всё время, поэтому предоставляем 2 ссылки: Google Drive и Yandex

Цитирование

@misc{nto-ai-text-recognition,
  author =       {Arseniy Shahmatov and Gerasomiv Maxim},
  title =        {notebook-recognition},
  howpublished = {\url{https://github.com/Lednik7/nto-ai-text-recognition}},
  year =         {2022}
}

You might also like...

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

Mask R-CNN for Object Detection and Segmentation This is an implementation of Mask R-CNN on Python 3, Keras, and TensorFlow. The model generates bound

22.5k Jan 4, 2023

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Swin Transformer for Object Detection This repo contains the supported code and configuration files to reproduce object detection results of Swin Tran

1.4k Dec 30, 2022

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Faster R-CNN and Mask R-CNN in PyTorch 1.0 maskrcnn-benchmark has been deprecated. Please see detectron2, which includes implementations for all model

9k Jan 4, 2023

Object detection and instance segmentation toolkit based on PaddlePaddle.

9.3k Jan 2, 2023

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

The Official PyTorch Implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

3 Oct 15, 2021

The PyTorch implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision The PyTorch implementation of DiscoBox: Weakly Supe

1 Oct 23, 2021

Optical Character Recognition + Instance Segmentation for russian and english languages

Related tags

Overview

Распознавание рукописного текста в школьных тетрадях

Соревнование, проводимое в рамках олимпиады НТО, разработанное Сбером. Платформа ODS.

Результаты Public

Задача

Как должно работать решение?

Модели

Ресурсы & Submit

Цитирование

You might also like...

Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow

This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows" on Object Detection and Instance Segmentation.

Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

Object detection and instance segmentation toolkit based on PaddlePaddle.

DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision

The PyTorch implementation of DiscoBox: Weakly Supervised Instance Segmentation and Semantic Correspondence from Box Supervision.

Numbering permanent and deciduous teeth via deep instance segmentation in panoramic X-rays

Res2Net for Instance segmentation and Object detection using MaskRCNN

Keras implementation of PersonLab for Multi-Person Pose Estimation and Instance Segmentation.

Releases(v1.0.0)

v1.0.0(Mar 6, 2022)

Owner

Gerasimov Maxim

PyTorch implementation of Progressive Growing of GANs for Improved Quality, Stability, and Variation.

PyTorch implementation of saliency map-aided GAN for Auto-demosaic+denosing

Official code for article "Expression is enough: Improving traﬀic signal control with advanced traﬀic state representation"

Implementation for Shape from Polarization for Complex Scenes in the Wild

Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes (CVPR 2021 Oral)

Neural Scene Flow Prior (NeurIPS 2021 spotlight)

Per-Pixel Classification is Not All You Need for Semantic Segmentation

Tensor-Based Quantum Machine Learning

Reinforcement Learning Theory Book (rus)

project page for VinVL

Bunch of different tools which helps visualizing and annotating images for semantic/instance segmentation tasks

Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters,extended Kalman filters, unscented Kalman filters, particle filters, and more. All exercises include solutions.

The implemention of Video Depth Estimation by Fusing Flow-to-Depth Proposals

Bayesian Generative Adversarial Networks in Tensorflow

Dataset VSD4K includes 6 popular categories: game, sport, dance, vlog, interview and city.

Like a cowsay but without cows!

Symmetry and Uncertainty-Aware Object SLAM for 6DoF Object Pose Estimation

atmaCup #11 の Public 4th / Pricvate 5th Solution のリポジトリです。

The codes and models in 'Gaze Estimation using Transformer'.

Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).