利用Paddle框架复现CRAFT

Overview

CRAFT-Paddle

利用Paddle框架复现CRAFT

CRAFT

本项目基于paddlepaddle框架复现CRAFT,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待

参考项目:

CRAFT: Character-Region Awareness For Text detection

项目配置

pip install -r requirements.txt

你应该具有以下目录

/home/aistudio/CRAFT(工程目录)
/home/aistudio/Data(数据集文件)

数据集文件已挂载,自行解压即可

训练

The code for training is not included in this repository, and we cannot release the full training code for IP reason.

作者并未提供训练代码

权重转换

这里用到了X2Paddle神器,转换代码如下,具体使用文档参见X2Paddle

from craft import CRAFT
import torch
from collections import OrderedDict
import imgproc
import numpy as np
import cv2

def copyStateDict(state_dict):
    if list(state_dict.keys())[0].startswith("module"):
        start_idx = 1
    else:
        start_idx = 0
    new_state_dict = OrderedDict()
    for k, v in state_dict.items():
        name = ".".join(k.split(".")[start_idx:])
        new_state_dict[name] = v
    return new_state_dict

# 构建输入
input_data = np.random.rand(1, 3, 736, 1280).astype("float32")
net = CRAFT()
net.load_state_dict(copyStateDict(torch.load('craft_mlt_25k.pth')))
net = net.cuda()
net.eval()

# 进行转换
from x2paddle.convert import pytorch2paddle
pytorch2paddle(net, 
          save_dir="paddlemodel", 
          jit_type="trace", 
          input_examples=[torch.tensor(input_data).cuda()])

完成后你会出现如下文件目录

/home/aistudio/CRAFT/paddlemodel
└───inference_model
└──────model.pdiparams
└──────model.pdiparams.info
└──────model.pdmodel
└───model.pdparams
└───x2paddle_code.py

使用同样的方式转换refinenet

测试

模型下载

提取码:4yy1

AIStudio链接

cd /home/aistudio/CRAFT
python test.py

Model name Used datasets Languages Purpose Model Link
General SynthText, IC13, IC17 Eng + MLT For general purpose craft_mlt_25k
IC15 SynthText, IC15 Eng For IC15 only craft_ic15_20k
LinkRefiner CTW1500 - Used with the General Model craft_refiner_CTW1500

下图是实际测试效果

评估

可以采用以下代码进行评估

cd /home/aistudio/CRAFT
python eval.py
cd /home/aistudio/CRAFT/outputs/submit_ic15/
zip ../submit_ic15.zip *
cd /home/aistudio/CRAFT/eval
`./eval_ic15.sh` or `bash eval_ic15.sh`
Method Dataset Backbone refiner Precision (%) Recall (%) F-measure (%) Model
basenet ICDAR2015 VGG16_BN N 82.2 77.9 80.0 craft_ic15_20k
basenet ICDAR2015 VGG16_BN N 85.1 79.4 82.2 craft_mlt_25k
basenet ICDAR2015 VGG16_BN Y 61.9 45.1 52.2 craft_ic15_20k
basenet ICDAR2015 VGG16_BN Y 63.1 43.3 51.4 craft_mlt_25k

评估total_text数据集可参见我的PSNET项目eval文件价下的评估代码

关于作者

姓名 郭权浩
学校 电子科技大学研2020级
研究方向 计算机视觉
主页 Deep Hao的主页
如有错误,请及时留言纠正,非常蟹蟹!
后续会有更多论文复现系列推出,欢迎大家有问题留言交流学习,共同进步成长!
Owner
QuanHao Guo
master at UESTC
QuanHao Guo
Detect text blocks and OCR poorly scanned PDFs in bulk. Python module available via pip.

doc2text doc2text extracts higher quality text by fixing common scan errors Developing text corpora can be a massive pain in the butt. Much of the tex

Joe Sutherland 1.3k Jan 04, 2023
Slice a single image into multiple pieces and create a dataset from them

OpenCV Image to Dataset Converter Slice a single image of Persian digits into mu

Meysam Parvizi 14 Dec 29, 2022
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

Andres 13 Dec 17, 2022
PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV)

About PyQT5 app that colorize black & white pictures using CNN(use pre-trained model which was made with OpenCV) Colorizor Приложение для проекта Yand

1 Apr 04, 2022
A semi-automatic open-source tool for Layout Analysis and Region EXtraction on early printed books.

LAREX LAREX is a semi-automatic open-source tool for layout analysis on early printed books. It uses a rule based connected components approach which

162 Jan 05, 2023
POT : Python Optimal Transport

This open source Python library provide several solvers for optimization problems related to Optimal Transport for signal, image processing and machine learning.

Python Optimal Transport 1.7k Jan 04, 2023
A curated list of resources for text detection/recognition (optical character recognition ) with deep learning methods.

awesome-deep-text-detection-recognition A curated list of awesome deep learning based papers on text detection and recognition. Text Detection Papers

2.4k Jan 08, 2023
TextBoxes: A Fast Text Detector with a Single Deep Neural Network https://github.com/MhLiao/TextBoxes 基于SSD改进的文本检测算法,textBoxes_note记录了之前整理的笔记。

TextBoxes: A Fast Text Detector with a Single Deep Neural Network Introduction This paper presents an end-to-end trainable fast scene text detector, n

zhangjing1 24 Apr 28, 2022
PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector

Description This is a PyTorch Re-Implementation of EAST: An Efficient and Accurate Scene Text Detector. Only RBOX part is implemented. Using dice loss

365 Dec 20, 2022
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.

Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless. This is the official Roboflow python package that interfaces with the Roboflow API.

Roboflow 52 Dec 23, 2022
Framework for the Complete Gaze Tracking Pipeline

Framework for the Complete Gaze Tracking Pipeline The figure below shows a general representation of the camera-to-screen gaze tracking pipeline [1].

Pascal 20 Jan 06, 2023
A Python script to capture images from multiple webcams at once and save them into your local machine

Capturing multiple images at once from Webcam Using OpenCV Capture multiple image by accessing the webcam of your system and save it to your machine.

Fazal ur Rehman 2 Apr 16, 2022
~1000 book pages + OpenCV + python = page regions identified as paragraphs, lines, images, captions, etc.

cosc428-structor I had an open-ended Computer Vision assignment to complete, and an out-of-copyright book that I wanted to turn into an ebook. Convent

Chad Oliver 45 Dec 06, 2022
Official code for :rocket: Unsupervised Change Detection of Extreme Events Using ML On-Board :rocket:

RaVAEn The RaVÆn system We introduce the RaVÆn system, a lightweight, unsupervised approach for change detection in satellite data based on Variationa

SpaceML 35 Jan 05, 2023
Read-only mirror of https://gitlab.gnome.org/GNOME/ocrfeeder

================================= OCRFeeder - A Complete OCR Suite ================================= OCRFeeder is a complete Optical Character Recogn

GNOME Github Mirror 81 Dec 23, 2022
Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE

EAST_ICPR: EAST for ICPR MTWI 2018 CHALLENGE Introduction This is a repository forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE. Origin Reposi

Haozheng Li 157 Aug 23, 2022
This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

EAST: An Efficient and Accurate Scene Text Detector Description: This version will be updated soon, please pay attention to this work. The motivation

Dejia Song 544 Dec 20, 2022
A simple document layout analysis using Python-OpenCV

Run the application: python main.py *Note: For first time running the application, create a folder named "output". The application is a simple documen

Roinand Aguila 109 Dec 12, 2022
Scale-aware Automatic Augmentation for Object Detection (CVPR 2021)

SA-AutoAug Scale-aware Automatic Augmentation for Object Detection Yukang Chen, Yanwei Li, Tao Kong, Lu Qi, Ruihang Chu, Lei Li, Jiaya Jia [Paper] [Bi

Jia Research Lab 182 Dec 29, 2022
SemTorch

SemTorch This repository contains different deep learning architectures definitions that can be applied to image segmentation. All the architectures a

David Lacalle Castillo 154 Dec 07, 2022