Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Last update: Jul 13, 2022

Related tags

Computer Vision u2netscan

Overview

Responsive Doc. scanner using U^2-Net, Textcleaner and Tesseract

Toolset

U^2-Net is used for background removal
Textcleaner is used for image cleaning and line deskew (max 5 degrees)
Tesseract is used for text angle rotation
Deskew is used for line deskew (between 5 and 45 degrees)

Examples

Tested one document on smartphone camera with different angles

To build & deploy

Clone thee repo
Download the model: check app/saved_models/README.md
Build Docker image : docker build -t / : .
Test locally : Run Docker image and check if api is working by running http://localhost:10000
- CPU : docker run -it -v $PWD:/LOCAL/ -p 10000:80 / :
- GPU : docker run -it --gpus all -v $PWD:/LOCAL/ -p 10000:80 / :
Push docker image to Dockerhub (optional):
- Check: https://docs.docker.com/docker-hub/repos/ for account setup
- Create in Dockerhub Repo similar to the name of yout Image ID :
- Run docker push / :
Deploy to Cloud Run (optional):
- Create your google cloud account
- Push Docker Image to Google Container Registry
  - create new project called [PROJECT-ID]
  - Open Cloud shell in your Google account and run: docker pull / : docker tag [IMAGE] gcr.io/[PROJECT-ID]/[IMAGE] docker push gcr.io/[PROJECT-ID]/[IMAGE] more detail in this link
- Create CloudRun Service, and select Container that was created
  - Screenshot of the config - for demo purpose, it will be cost free
- Click Deploy, and test the Api Url that will display

Limits and Areas for improvements

Speed: It takes 7 to 10 seconds to process one image (serverless Cloud Run) With Gpu we can save 2 to 3 seconds (U^2-Net is 3 times faster)
Textcleaner is slow but works better on image cleaning, but needs some manual fine-tuning

References

U^2-Net https://github.com/xuebinqin/U-2-Net.git
Textcleaner http://www.fmwconcepts.com/imagemagick/textcleaner/
Tesseract https://github.com/tesseract-ocr/tesseract
Deskew https://github.com/sbrunner/deskew.git

Owner

AI

GitHub Repository https://amtam0.github.io/u2netscan/webapp/app_u2net.html

This repo contains several opencv projects done while learning opencv in python.

opencv-projects-python This repo contains both several opencv projects done while learning opencv by python and opencv learning resources [Basic conce

2 Nov 03, 2022

BoxToolBox is a simple python application built around the openCV library

BoxToolBox is a simple python application built around the openCV library. It is not a full featured application to guide you through the w

1 Nov 12, 2021

Text recognition (optical character recognition) with deep learning methods.

What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | paper | training and evaluation data | failure cases and cle

3.2k Jan 04, 2023

This is a tensorflow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.My blog:

PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network Introduction This is a tensorflow re-implementation of PSENet: Shape Robu

498 Dec 30, 2022

Motion detector, Full body detection, Upper body detection, Cat face detection, Smile detection, Face detection (haar cascade), Silverware detection, Face detection (lbp), and Sending email notifications

Security camera running OpenCV for object and motion detection. The camera will send email with image of any objects it detects. It also runs a server that provides web interface with live stream vid

10 Jun 30, 2021

Contextual speed detection for python

Speed Prediction using Optical Flow and 2D CNN About the challenge: Comma.AI Speed Challenge This challenge was developed by Comma.AI to predict the s

2 Dec 16, 2021

Discord QR Scam Code Generator + Token grab mobile device.

A Python script that automatically generates a Nitro scam QR code and grabs the Discord token when scanned.

9 Nov 22, 2022

Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Vedaldi, Andrew Zisserman, CVPR 2016.

SynthText Code for generating synthetic text images as described in "Synthetic Data for Text Localisation in Natural Images", Ankush Gupta, Andrea Ved

1.8k Dec 28, 2022

Ocular is a state-of-the-art historical OCR system.

Ocular Ocular is a state-of-the-art historical OCR system. Its primary features are: Unsupervised learning of unknown fonts: requires only document im

228 Dec 30, 2022

Recognizing the text contents from a scanned visiting card

Recognizing the text contents from a scanned visiting card. The application which is used to recognize the text from scanned images,printeddocuments,r

1 Jan 28, 2022

The code for CVPR2022 paper "Likert Scoring with Grade Decoupling for Long-term Action Assessment".

Likert Scoring with Grade Decoupling for Long-term Action Assessment This is the code for CVPR2022 paper "Likert Scoring with Grade Decoupling for Lon

10 Oct 21, 2022

Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd.

Head Detector Code for the head detector (HeadHunter) proposed in our CVPR 2021 paper Tracking Pedestrian Heads in Dense Crowd. The head_detection mod

76 Dec 06, 2022

list all open dataset about ocr.

ocr-open-dataset list all open dataset about ocr. printed dataset year Born-Digital Images (Web and Email) 2011-2015 COCO-Text 2017 Text Extraction fr

95 Nov 24, 2022

This project is basically to draw lines with your hand, using python, opencv, mediapipe.

Paint Opencv 📷 This project is basically to draw lines with your hand, using python, opencv, mediapipe. Screenshoots 📱 Tools ⚙️ Python Opencv Mediap

3 Nov 17, 2021

A simple OCR API server, seriously easy to be deployed by Docker, on Heroku as well

ocrserver Simple OCR server, as a small working sample for gosseract. Try now here https://ocr-example.herokuapp.com/, and deploy your own now. Deploy

541 Dec 28, 2022

Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, rastreia padrões de gestos em vez de um mouse físico.

mouserController Script para controlar o movimento do mouse usando Python e openCV com câmera em tempo real que detecta pontos de referência da mão, r

6 Jun 28, 2022

A machine learning software for extracting information from scholarly documents

GROBID GROBID documentation Visit the GROBID documentation for more detailed information. Summary GROBID (or Grobid, but not GroBid nor GroBiD) means

1.9k Jan 08, 2023

Provides OCR (Optical Character Recognition) services through web applications

OCR4all As suggested by the name one of the main goals of OCR4all is to allow basically any given user to independently perform OCR on a wide variety

174 Dec 31, 2022

EQFace: An implementation of EQFace: A Simple Explicit Quality Network for Face Recognition

EQFace: A Simple Explicit Quality Network for Face Recognition The first face recognition network that generates explicit face quality online.

141 Dec 31, 2022

Creating a virtual tv using opencv in python3.

Virtual-TV Creating a virtual tv using opencv in python3. In order to run the code follow the below given steps: Make sure the desired videos which ar

1 Jan 01, 2022