Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

[email protected]">

Last update: Jan 01, 2023

Related tags

Computer Vision DewarpNet

Overview

DewarpNet

This repository contains the codes for DewarpNet training.

Recent Updates

[May, 2020] Added evaluation images and an important note about Matlab SSIM.
[Dec, 2020] Added OCR evaluation details.

Training

Prepare Data: train.txt & val.txt. Contents should be like:

1/824_8-cp_Page_0503-7Ns0001
1/824_1-cp_Page_0504-2Cw0001

Train Shape Network: python trainwc.py --arch unetnc --data_path ./data/DewarpNet/doc3d/ --batch_size 50 --tboard
Train Texture Mapping Network: python trainbm.py --arch dnetccnl --img_rows 128 --img_cols 128 --img_norm --n_epoch 250 --batch_size 50 --l_rate 0.0001 --tboard --data_path ./DewarpNet/doc3d

Inference:

Run: python infer.py --wc_model_path ./eval/models/unetnc_doc3d.pkl --bm_model_path ./eval/models/dnetccnl_doc3d.pkl --show

Evaluation (Image Metrics):

We use the same evaluation code as DocUNet. To reproduce the quantitative results reported in the paper use the images available here.
[Important note about Matlab version] We noticed that Matlab 2020a uses a different SSIM implementation which gives a better MS-SSIM score (0.5623). Whereas we have used Matlab 2018b. Please compare the scores according to your Matlab version.

Evaluation (OCR Metrics):

The 25 images used for OCR evaluation is /eval/ocr_eval/ocr_files.txt
The corresponding ground-truth text is given in /eval/ocr_eval/tess_gt.json
For the OCR errors reported in the paper we had used cv2.blur as pre-processing which gives higher error in all the cases. For convenience, we provide the updated numbers (without using blur) in the following table:

Method	ED	CER	ED (no blur)	CER (no blur)
DocUNet	1975.86	0.4656(0.263)	1671.80	0.403 (0.256)
DocUNet on Doc3D	1684.34	0.3955 (0.272)	1296.00	0.294 (0.235)
DewarpNet	1288.60	0.3136 (0.248)	1007.28	0.249 (0.236)
DewarpNet (ref)	1114.40	0.2692 (0.234)	812.48	0.204 (0.228)

We had used the Tesseract (v4.1.0) default configuration for evaluation with PyTesseract (v0.2.6).

Models:

Pre-trained models are available here. These models are captured prior to end-to-end training, thus won't give you the end-to-end results reported in Table 2 of the paper. Use the images provided above to get the exact numbers as Table 2.

Dataset:

The doc3D dataset can be downloaded using the scripts here.

More Stuff:

Citation:

If you use the dataset or this code, please consider citing our work-

@inproceedings{SagnikKeICCV2019, 
Author = {Sagnik Das*, Ke Ma*, Zhixin Shu, Dimitris Samaras, Roy Shilkrot}, 
Booktitle = {Proceedings of International Conference on Computer Vision}, 
Title = {DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks}, 
Year = {2019}}

Acknowledgements:

These codes are heavily structured on pytorch-semseg.

Code for the paper "DewarpNet: Single-Image Document Unwarping With Stacked 3D and 2D Regression Networks" (ICCV '19)

Related tags

Overview

DewarpNet

Recent Updates

Training

Inference:

Evaluation (Image Metrics):

Evaluation (OCR Metrics):

Models:

Dataset:

More Stuff:

Citation:

Acknowledgements:

Owner

[email protected]

Python package for handwriting and sketching in Jupyter cells

A simple component to display annotated text in Streamlit apps.

Primary QPDF source code and documentation

Handwritten Character Recognition using CNN

Automatically download multiple papers by keywords in CVPR

Augmenting Anchors by the Detector Itself

Can We Find Neurons that Cause Unrealistic Images in Deep Generative Networks?

Kornia is a open source differentiable computer vision library for PyTorch.

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

Ackermann Line Follower Robot Simulation.

Automatically resolve RidderMaster based on TensorFlow & OpenCV

Fully-automated scripts for collecting AI-related papers

Official code for "Bridging Video-text Retrieval with Multiple Choice Questions", CVPR 2022 (Oral).

Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"

🔎 Like Chardet. 🚀 Package for encoding & language detection. Charset detection.

A python screen recorder for low-end computers, provides high quality video output.

This is the official PyTorch implementation of the paper "TransFG: A Transformer Architecture for Fine-grained Recognition" (Ju He, Jie-Neng Chen, Shuai Liu, Adam Kortylewski, Cheng Yang, Yutong Bai, Changhu Wang, Alan Yuille).

Textboxes implementation with Tensorflow (python)

[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers