An interactive document scanner built in Python using OpenCV

Last update: Feb 12, 2022

Document Scanner

The scanner takes a poorly scanned image, finds the corners of the document, applies the perspective transformation to get a top-down view of the document, sharpens the image, and applies an adaptive color threshold to clean up the image.

On my test dataset of 280 images, the program correctly detected the corners of the document 92.8% of the time.

This project uses the transform and imutils modules from pyimagesearch. The UI code for the interactive mode is adapted from poly_editor.py.

In interactive mode, you can manually click and drag the corners of the document before the perspective transform is applied.

The scanner can also process an entire directory of images automatically and save the results in an output directory.

Usage

python scan.py (--images <IMG_DIR> | --image <IMG_PATH>) [-i]

The -i flag enables interactive mode, where you will be prompted to click and drag the corners of the document. For example, to scan a single image with interactive mode enabled:

python scan.py --image sample_images/desk.JPG -i

Alternatively, to scan all images in a directory without any input:

python scan.py --images sample_images
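
For readers curious how a command line like this can be expressed, here is a minimal argparse sketch. It is an assumption about how such an interface could be written, not a copy of scan.py: the helper names (build_parser, collect_image_paths), the `interactive` attribute name, and the set of accepted file extensions are all hypothetical.

```python
import argparse
from pathlib import Path

# Assumed set of extensions for illustration; the real script may accept others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def build_parser():
    """Build a parser where exactly one of --images / --image is required."""
    parser = argparse.ArgumentParser(prog="scan.py")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--images", help="directory of images to scan")
    group.add_argument("--image", help="path to a single image to scan")
    parser.add_argument("-i", dest="interactive", action="store_true",
                        help="adjust the document corners by hand before scanning")
    return parser


def collect_image_paths(args):
    """Return the list of image paths selected by the parsed arguments."""
    if args.image:
        return [Path(args.image)]
    directory = Path(args.images)
    return sorted(p for p in directory.iterdir()
                  if p.suffix.lower() in IMAGE_EXTENSIONS)
```

With this sketch, `build_parser().parse_args(["--image", "sample_images/desk.JPG", "-i"])` yields a namespace with `interactive` set to True, and passing both --image and --images makes argparse exit with a usage error.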

Owner

Kushal Shingote
