Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

Overview

EAST_ICPR: EAST for ICPR MTWI 2018 CHALLENGE

Introduction

This is a repository forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE.
Origin Repository: argman/EAST - A tensorflow implementation of EAST text detector
Origin Author: argman

Author: Haozheng Li
Email: [email protected] or [email protected]

Contents

  1. Transform
  2. Models
  3. Demo
  4. Train
  5. Test
  6. Results

Transform

Some data in the dataset is abnormal, just like ICPR MTWI 2018[BaiduYun]. Abnormal means that the ground true labels are anticlockwise, or the images are not in 3 channels. Then errors like 'poly in wrong direction' will occur while using argman/EAST.

So I wrote a matlab program to check and transform the dataset. The program named <transform.m> is in the folder 'data_transform/' and its parameters are descripted as bellow:

icpr_img_folder = 'image_9000\';                   %origin images
icpr_txt_folder = 'txt_9000\';                     %origin ground true labels
icdar_img_folder = 'ICPR2018\';                    %transformed images
icdar_gt_folder = 'ICPR2018\';                     %transformed ground true labels
icdar_img_abnormal_folder = 'ICPR2018_abnormal\';  %images not in 3 channels, which give errors in argman/EAST
icdar_gt_abnormal_folder = 'ICPR2018_abnormal\';   %transformed ground true labels

%images and ground true labels files must be renamed as <img_1>, <img_2>, ..., <img_xxx> while using argman/EAST
first_index =  0;                                  %first index of the dataset
transform_list_name = 'transform_list.txt';        %file name of the rename list

Note: For abnormal images not in 3 channels, please transform them to normal ones through other tools like Format Factory. Then add the right data to the <icdar_img_folder> and <icdar_gt_folder>, so finally you get a whole normal dataset which has been checked and transformed.

Models

  1. Resnet_V1_50 Models trained on ICPR MTWI 2018 (train): [100k steps] [500k steps] [1035k steps]
  2. Resnet_V1_101 Models trained on ICPR MTWI 2018 (train) + ICDAR 2017 MLT (train + val) + RCTW-17 (train): [100k steps]
  3. Resnet_V1_101 Models pre-trained on Models-2, then trained on just ICPR MTWI 2018 (train): [987k steps]
  4. (In argman/EAST) Resnet_V1_50 Models trained on ICDAR 2013 (train) + ICDAR 2015 (train): [50k steps]
  5. (In argman/EAST) Resnet_V1_50 Models provided by tensorflow slim: [slim_resnet_v1_50]

Demo

Download the pre-trained models and run:

python run_demo_server.py --checkpoint-path models/east_icpr2018_resnet_v1_50_rbox_1035k/

Then Open http://localhost:8769 for the web demo server, or get the results in 'static/results/'.
Note: See argman/EAST#demo for more details.

Train

Prepare the training set and run:

python multigpu_train.py --gpu_list=0 --input_size=512 --batch_size_per_gpu=8 \
--checkpoint_path=models/east_icpr2018_resnet_v1_50_rbox/ \
--text_scale=512 --training_data_path=data/ICPR2018/ --geometry=RBOX \
--learning_rate=0.0001 --num_readers=18 --max_steps=50000

Note 1: Images and ground true labels files must be renamed as <img_1>, <img_2>, ..., <img_xxx> while using argman/EAST. Please see the examples in the folder 'training_samples/'.
Note 2: If --restore=True, training will restore from checkpoint and ignore the --pretrained_model_path. If --restore=False, training will delete checkpoint and initialize with the --pretrained_model_path (if exists).
Note 3: See argman/EAST#train for more details.

Test

Names of the images in ICPR MTWI 2018 are abnormal. Like <LB1gXi2JVXXXXXUXFXXXXXXXXXX.jpg> but not <img_10001.jpg>. Then errors will occur while using argman/EAST#test.

So I wrote two matlab programs to rename and inversely rename the dataset. Before evaluating, run the program named <rename.m> to make names of the images normal. This program is in the folder 'data_transform/' and its parameters are descripted as bellow:

icpr_img_folder = 'image_10000\';                      %origin images
icdar_img_folder = 'ICPR2018_test\';                   %transformed images
icdar_img_abnormal_folder = 'ICPR2018_test_abnormal\'; %images not in 3 channels, which give errors in argman/EAST

icpr_count =  10000;                                   %first index of the dataset
rename_list_name = 'rename_list.txt';                  %file name of the rename list

Note: Just like <transform.m>, please transform abnormal images through other tools like Format Factory.

After you have prepared the test set, run:

python eval.py --test_data_path=data/ICPR2018/ --gpu_list=0 \
--checkpoint_path=models/east_icpr2018_resnet_v1_50_rbox_1035k/ --output_dir=results/1035k/

Then get the results in 'results/'.

Finally inversely rename the result labels files from <img_10001.txt> to <LB1gXi2JVXXXXXUXFXXXXXXXXXX.txt> according to the rename list generated by <rename.m>. Run the program named <rename_inverse.m> which is in the folder 'data_transform/' and its parameters are descripted as bellow:

rename_list_name = 'rename_list.txt';  %file name of the rename list
icpr_img_folder = 'image_10000\';      %origin images
icpr_txt_folder = 'results\';          %result labels files generated by argman/EAST
icdar_gt_folder = 'txt_10000\';        %inversely renamed result labels files

Then zip the results in 'txt_10000/' and submit it to the ICPR MTWI 2018 CHALLENGE.

Results

Finally our model <east_icpr2018_resnet_v1_50_rbox_1035k> rank 31 in the ICPR MTWI 2018 CHALLENGE:
image

Here are some results on ICPR MTWI 2018:
image
image
image
image
image

Have fun!! :)

Owner
Haozheng Li
Computer Vision, Reinforcement Learning
Haozheng Li
Creating a virtual tv using opencv in python3.

Virtual-TV Creating a virtual tv using opencv in python3. In order to run the code follow the below given steps: Make sure the desired videos which ar

Vamsi 1 Jan 01, 2022
make a better chinese character recognition OCR than tesseract

deep ocr See README_en.md for English installation documentation. 只在ubuntu下面测试通过,需要virtualenv安装,安装路径可自行调整: git clone https://github.com/JinpengLI/deep

Jinpeng 1.5k Dec 28, 2022
Steve Tu 71 Dec 30, 2022
Resizing Canny Countour In Python

Resizing_Canny_Countour Install Visual Studio Code , https://code.visualstudio.com/download Select Python and install with terminal( pip install openc

Walter Ng 1 Nov 07, 2021
Automatically fishes for you while you are afk :)

Dank-memer-afk-script A simple and quick way to make easy money in Dank Memer! How to use Open a discord channel which has the Dank Memer bot enabled.

Pranav Doshi 9 Nov 11, 2022
Satoshi is a discord bot template in python using discord.py that allow you to track some live crypto prices with your own discord bot.

Satoshi ~ DiscordCryptoBot Satoshi is a simple python discord bot using discord.py that allow you to track your favorites cryptos prices with your own

Théo 2 Sep 15, 2022
Face Anonymizer - FaceAnonApp v1.0

Face Anonymizer - FaceAnonApp v1.0 Blur faces from image and video files in /data/files folder. Contents Repo of the source files for the FaceAnonApp.

6 Apr 18, 2022
Validate and transform various OCR file formats (hOCR, ALTO, PAGE, FineReader)

ocr-fileformat Validate and transform between OCR file formats (hOCR, ALTO, PAGE, FineReader) Installation Docker System-wide Usage CLI GUI API Transf

Universitätsbibliothek Mannheim 152 Dec 20, 2022
kaldi-asr/kaldi is the official location of the Kaldi project.

Kaldi Speech Recognition Toolkit To build the toolkit: see ./INSTALL. These instructions are valid for UNIX systems including various flavors of Linux

Kaldi 12.3k Jan 05, 2023
OpenCV-Erlang/Elixir bindings

evision [WIP] : OS : arch Build Status Ubuntu 20.04 arm64 Ubuntu 20.04 armv7 Ubuntu 20.04 s390x Ubuntu 20.04 ppc64le Ubuntu 20.04 x86_64 macOS 11 Big

Cocoa 194 Jan 05, 2023
A bot that extract text from images using the Tesseract OCR.

Text from image (OCR) @ocr_text_bot A simple bot to extract text from images. Usage What do I need? A AWS key configured locally, see here. NodeJS. I

Weverton Marques 4 Aug 06, 2021
A pkg stiching around view images(4-6cameras) to generate bird's eye view.

AVP-BEV-OPEN Please check our new work AVP_SLAM_SIM A pkg stiching around view images(4-6cameras) to generate bird's eye view! View Demo · Report Bug

Xinliang Zhong 37 Dec 01, 2022
A collection of resources (including the papers and datasets) of OCR (Optical Character Recognition).

OCR Resources This repository contains a collection of resources (including the papers and datasets) of OCR (Optical Character Recognition). Contents

Zuming Huang 363 Jan 03, 2023
Rotational region detection based on Faster-RCNN.

R2CNN_Faster_RCNN_Tensorflow Abstract This is a tensorflow re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detecti

UCAS-Det 581 Nov 22, 2022
A real-time dolly zoom camera effect

Dolly-Zoom I've always been amazed by the gradual perspective change of dolly zoom, and I have some experience in python and OpenCV, so I decided to c

Dylan Kai Lau 52 Dec 08, 2022
A novel region proposal network for more general object detection ( including scene text detection ).

DeRPN: Taking a further step toward more general object detection DeRPN is a novel region proposal network which concentrates on improving the adaptiv

Deep Learning and Vision Computing Lab, SCUT 151 Dec 12, 2022
Isearch (OSINT) 🔎 Face recognition reverse image search on Instagram profile feed photos.

isearch is an OSINT tool on Instagram. Offers a face recognition reverse image search on Instagram profile feed photos.

Malek salem 20 Oct 25, 2022
CUTIE (TensorFlow implementation of Convolutional Universal Text Information Extractor)

CUTIE TensorFlow implementation of the paper "CUTIE: Learning to Understand Documents with Convolutional Universal Text Information Extractor." Xiaohu

Zhao,Xiaohui 147 Dec 20, 2022
APS 6º Semestre - UNIP (2021)

UNIP - Universidade Paulista Ciência da Computação (CC) DESENVOLVIMENTO DE UM SISTEMA COMPUTACIONAL PARA ANÁLISE E CLASSIFICAÇÃO DE FORMAS Link do git

Eduardo Talarico 5 Mar 09, 2022
Detect the mathematical formula from the given picture and the same formula is extracted and converted into the latex code

Mathematical formulae extractor The goal of this project is to create a learning based system that takes an image of a math formula and returns corres

6 May 22, 2022