Official implementations of PSENet, PAN and PAN++.

Overview

News

  • (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23.
  • (2021/04/08) PSENet and PAN are included in MMOCR.

Introduction

This repository contains the official implementations of PSENet, PAN, PAN++, and FAST [coming soon].

Text Detection
Text Spotting

Installation

First, clone the repository locally:

git clone https://github.com/whai362/pan_pp.pytorch.git

Then, install PyTorch 1.1.0+, torchvision 0.3.0+, and other requirements:

conda install pytorch torchvision -c pytorch
pip install -r requirement.txt

Finally, compile codes of post-processing:

# build pse and pa algorithms
sh ./compile.sh

Dataset

Please refer to dataset/README.md for dataset preparation.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_ic15.py

Testing

Evaluate the performance

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
cd eval/
./eval_{DATASET}.sh

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
cd eval/
./eval_ic15.sh

Evaluate the speed

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --report_speed

Citation

Please cite the related works in your publications if it helps your research:

PSENet

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection with Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

PAN

@inproceedings{wang2019efficient,
  title={Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network},
  author={Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8440--8449},
  year={2019}
}

PAN++

@article{wang2021pan++,
  title={PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Liu, Xuebo and Liang, Ding and Zhibo, Yang and Lu, Tong and Shen, Chunhua},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

FAST

@misc{chen2021fast,
  title={FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation}, 
  author={Zhe Chen and Wenhai Wang and Enze Xie and ZhiBo Yang and Tong Lu and Ping Luo},
  year={2021},
  eprint={2111.02394},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

License

This project is developed and maintained by IMAGINE [email protected] Key Laboratory for Novel Software Technology, Nanjing University.

IMAGINE Lab

This project is released under the Apache 2.0 license.

Comments
  • Evaluation of the performance result

    Evaluation of the performance result

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by dikubab 9
  • _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    _pickle.PicklingError: Can't pickle : import of module 'cPolygon' failed

    more complete log as belows: Epoch: [1 | 600] /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead. warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.") /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details. "See the documentation of nn.Upsample for details.".format(mode)) (1/374) LR: 0.001000 | Batch: 2.668s | Total: 0min | ETA: 17min | Loss: 1.619 | Loss(text/kernel/emb/rec): 0.680/0.193/0.746/0.000 | IoU(text/kernel): 0.324/0.335 | Acc rec: 0.000 Traceback (most recent call last): File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/queues.py", line 236, in _feed obj = _ForkingPickler.dumps(obj) File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    the code runs normally when using the CTW1500 datasets. but encounter errors when using my own datasets.

    it seems fine in the first run (1/374), what is wrong ? I have no idea.

    opened by Zhang-O 5
  • 关于训练的问题

    关于训练的问题

    您好!我现在在自己的数据上进行训练,训练过程是这样的 image Epoch: [212 | 600] (1/198) LR: 0.000677 | Batch: 3.934s | Total: 0min | ETA: 13min | Loss: 0.752 | Loss(text/kernel/emb/rec): 0.493/0.199/0.059/0.000 | IoU(text/kernel): 0.055/0.553 | Acc rec: 0.000 (21/198) LR: 0.000677 | Batch: 1.089s | Total: 0min | ETA: 3min | Loss: 0.731 | Loss(text/kernel/emb/rec): 0.478/0.199/0.054/0.000 | IoU(text/kernel): 0.048/0.482 | Acc rec: 0.000 (41/198) LR: 0.000677 | Batch: 1.022s | Total: 1min | ETA: 3min | Loss: 0.732 | Loss(text/kernel/emb/rec): 0.478/0.198/0.056/0.000 | IoU(text/kernel): 0.049/0.476 | Acc rec: 0.000 这个Acc rec一直是0,我终止训练后,在测试数据上进行测试时,output输出的是空的,请问是怎么回事呢,感谢啦!

    opened by mayidu 3
  • 关于后处理的疑问

    关于后处理的疑问

    1. 后处理的代码中当kernel中两个连通域的面积比大于max_rate时,将这两个连通域的flag赋值为1,在扩充时,必须同时满足当前扩充的点所属的连通域的flag值为1且与kernal的similar vector距离大于3时才不扩充该点。请问设flag这步操作的作用是什么,直接判断与Kernel的similar vector的距离可以吗?
    2. 论文中扩充的点与kernel相似向量的欧式距离thresh值为6,代码中为3,请问实际应用中这个值跟什么有关系,是数据集的某些特点吗?
    opened by jewelc92 3
  • Regarding pa.pyx

    Regarding pa.pyx

    Hi,

    I try to run your code and figure out that in your last line in pa.pyx

    return _pa(kernels[:-1], emb, label, cc, kernel_num, label_num, min_area)

    Looks like this should be

    return _pa(kernels, emb, label, cc, kernel_num, label_num, min_area)

    So that we can scan over all kernels (you skip the last kernel) and there is no crash in this function. Am I correct?

    Thanks.

    opened by liuch37 3
  • AttributeError: 'Namespace' object has no attribute 'resume'

    AttributeError: 'Namespace' object has no attribute 'resume'

    PAN++ic15,An error appears when trying to test the model:

    reading type: pil. Traceback (most recent call last): File "test.py", line 155, in main(args) File "test.py", line 138, in main print("No checkpoint found at '{}'".format(args.resume)) AttributeError: 'Namespace' object has no attribute 'resume'

    opened by lrjj 2
  • 训练Total Text时遇到的问题

    训练Total Text时遇到的问题

    运行 python train.py config/pan/pan_r18_tt.py 后,出现如下情况: p1 Traceback (most recent call last): File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/queues.py", line 234, in _feed obj = _ForkingPickler.dumps(obj) File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps cls(buf, protocol).dump(obj) _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed 似乎是迭代过程中出现的问题且只出现在训练TT数据集的时候 请问出现这种情况该怎样解决呢?谢谢您

    opened by mashumli 2
  • 执行test.py提示TypeError: 'module' object is not callable

    执行test.py提示TypeError: 'module' object is not callable

    将模型路径和config文件路径配置好了之后,执行python test.py,提示如下: Traceback (most recent call last): File "test.py", line 117, in main(args) File "test.py", line 107, in main test(test_loader, model, cfg) File "test.py", line 56, in test outputs = model(**data) File "/home/ethony/anaconda3/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in call result = self.forward(*input, **kwargs) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/pan.py", line 104, in forward det_res = self.det_head.get_results(det_out, img_metas, cfg) File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/head/pa_head.py", line 65, in get_results label = pa(kernels, emb) TypeError: 'module' object is not callable 看提示应该是model/post_processing下的pa没有正确导入,导入为模块了,这应该怎么解决呢

    opened by ethanlighter 2
  • problems in train.py

    problems in train.py

    Hi. When I run 'python train.py config/pan/pan_r18_ic15.py' , the errors are as followings: Do you know how to solve the problem? Thank you very much. Traceback (most recent call last): File "train.py", line 234, in main(args) File "train.py", line 216, in main train(train_loader, model, optimizer, epoch, start_iter, cfg) File "train.py", line 41, in train for iter, data in enumerate(train_loader): File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in next data = self._next_data() File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data return self._process_data(data) File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data data.reraise() File "D:\Anaconda3\lib\site-packages\torch_utils.py", line 428, in reraise raise self.exc_type(msg) TypeError: function takes exactly 5 arguments (1 given)

    opened by YUDASHUAI916 2
  • not sure about run compile.sh

    not sure about run compile.sh

    (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$ sh ./compile.sh Compiling pa.pyx because it depends on /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/init.pxd. [1/1] Cythonizing pa.pyx /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.pyx tree = Parsing.p_module(s, pxd, full_module_name) running build_ext building 'pa' extension creating build creating build/temp.linux-x86_64-3.7 gcc -pthread -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include -I/data/tools/anaconda3/envs/zyl_torch16/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3 cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++ In file included from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12, from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4, from pa.cpp:647: /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp] #warning "Using deprecated NumPy API, disable it with "
    ^~~~~~~ g++ -pthread -shared -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -L/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,-rpath=/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/pa.o -o /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so (zyl_torch16) [email protected]:/data/zhangyl/pan_pp.pytorch-master$

    this is the compile history, I am not sure whether is successully build or not.

    opened by Zhang-O 2
  • morphology operations from kornia

    morphology operations from kornia

    Hi,

    Your FAST paper is really amazing! While you already have an implementation of erosion/dilation, let me offer using our set of morphology, implemented in pyre pytorch: https://kornia.readthedocs.io/en/latest/morphology.html

    https://kornia-tutorials.readthedocs.io/en/master/morphology_101.html

    Best, Dmytro.

    opened by ducha-aiki 1
  • The sample 199 not present in GT

    The sample 199 not present in GT

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by zeng-cy 1
  • How  to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    How to predict a new image using the training weight?it doesn't work below.

    python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar cd eval/ ./eval_ic15.sh

    please inform me with [email protected] or wechat SanQian-2012,thanks you so much.

    Originally posted by @Devin521314 in https://github.com/whai362/pan_pp.pytorch/issues/91#issuecomment-1233810612

    opened by Devin521314 0
  • Why rec encoder use EOS? not SOS

    Why rec encoder use EOS? not SOS

    hi: I find there is no 'SOS' in code, I understand SOS should be embedding at the beginning. Please tell me ,thanks! ---------------code----------------------------------------------- class Encoder(nn.Module): def init(self, hidden_dim, voc, char2id, id2char): super(Encoder, self).init() self.hidden_dim = hidden_dim self.vocab_size = len(voc) self.START_TOKEN = char2id['EOS'] self.emb = nn.Embedding(self.vocab_size, self.hidden_dim) self.att = MultiHeadAttentionLayer(self.hidden_dim, 8)

    def forward(self, x):
        batch_size, feature_dim, H, W = x.size()
        x_flatten = x.view(batch_size, feature_dim, H * W).permute(0, 2, 1)
        st = x.new_full((batch_size,), self.START_TOKEN, dtype=torch.long)
        emb_st = self.emb(st)
        holistic_feature, _ = self.att(emb_st, x_flatten, x_flatten)
        return 
    
    opened by Patickk 0
Releases(v1)
A Python library created to assist programmers with complex mathematical functions

libmaths libmaths was created not only as a learning experience for me, but as a way to make mathematical models in seconds for Python users using mat

Simple 73 Oct 02, 2022
The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate. Website • Key Features • How To Use • Docs •

Pytorch Lightning 21.1k Jan 08, 2023
The FIRST GANs-based omics-to-omics translation framework

OmiTrans Please also have a look at our multi-omics multi-task DL freamwork 👀 : OmiEmbed The FIRST GANs-based omics-to-omics translation framework Xi

Xiaoyu Zhang 6 Dec 14, 2022
👨‍💻 run nanosaur in simulation with Gazebo/Ingnition

🦕 👨‍💻 nanosaur_gazebo nanosaur The smallest NVIDIA Jetson dinosaur robot, open-source, fully 3D printable, based on ROS2 & Isaac ROS. Designed & ma

nanosaur 9 Jul 19, 2022
code and data for paper "GIANT: Scalable Creation of a Web-scale Ontology"

GIANT Code and data for paper "GIANT: Scalable Creation of a Web-scale Ontology" https://arxiv.org/pdf/2004.02118.pdf Please cite our paper if this pr

Excalibur 39 Dec 29, 2022
Compare GAN code.

Compare GAN This repository offers TensorFlow implementations for many components related to Generative Adversarial Networks: losses (such non-saturat

Google 1.8k Jan 05, 2023
SNE-RoadSeg in PyTorch, ECCV 2020

SNE-RoadSeg Introduction This is the official PyTorch implementation of SNE-RoadSeg: Incorporating Surface Normal Information into Semantic Segmentati

242 Dec 20, 2022
PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT.

MoCo v3 for Self-supervised ResNet and ViT Introduction This is a PyTorch implementation of MoCo v3 for self-supervised ResNet and ViT. The original M

Facebook Research 887 Jan 08, 2023
Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline

Built a deep neural network (DNN) that functions as an end-to-end machine translation pipeline. The pipeline accepts english text as input and returns the French translation.

Afropunk Technologist 1 Jan 24, 2022
An implementation for `Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction`

Text2Event An implementation for Text2Event: Controllable Sequence-to-Structure Generation for End-to-end Event Extraction Please contact Yaojie Lu (@

Roger 153 Jan 07, 2023
BOVText: A Large-Scale, Multidimensional Multilingual Dataset for Video Text Spotting

BOVText: A Large-Scale, Bilingual Open World Dataset for Video Text Spotting Updated on December 10, 2021 (Release all dataset(2021 videos)) Updated o

weijiawu 47 Dec 26, 2022
A scikit-learn-compatible module for estimating prediction intervals.

|Anaconda|_ MAPIE - Model Agnostic Prediction Interval Estimator MAPIE allows you to easily estimate prediction intervals using your favourite sklearn

SimAI 584 Dec 27, 2022
Simple embedding based text classifier inspired by fastText, implemented in tensorflow

FastText in Tensorflow This project is based on the ideas in Facebook's FastText but implemented in Tensorflow. However, it is not an exact replica of

Alan Patterson 306 Dec 02, 2022
[ICCV 2021] Focal Frequency Loss for Image Reconstruction and Synthesis

Focal Frequency Loss - Official PyTorch Implementation This repository provides the official PyTorch implementation for the following paper: Focal Fre

Liming Jiang 460 Jan 04, 2023
Mae segmentation - Reproduction of semantic segmentation using masked autoencoder (mae)

ADE20k Semantic segmentation with MAE Getting started Install the mmsegmentation

97 Dec 17, 2022
PyTorch implementation of U-TAE and PaPs for satellite image time series panoptic segmentation.

Panoptic Segmentation of Satellite Image Time Series with Convolutional Temporal Attention Networks (ICCV 2021) This repository is the official implem

71 Jan 04, 2023
A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation

A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation This repository contains the source code of the paper A Differentiable

Bernardo Aceituno 2 May 05, 2022
This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression problems

Doctoral dissertation of Zheng Zhao This thesis is mainly concerned with state-space methods for a class of deep Gaussian process (DGP) regression pro

Zheng Zhao 21 Nov 14, 2022
OpenMMLab Image Classification Toolbox and Benchmark

Introduction English | 简体中文 MMClassification is an open source image classification toolbox based on PyTorch. It is a part of the OpenMMLab project. D

OpenMMLab 1.8k Jan 03, 2023
Object Detection Projekt in GKI WS2021/22

tfObjectDetection Object Detection Projekt with tensorflow in GKI WS2021/22 Docker Container: docker run -it --name --gpus all -v path/to/project:p

Tim Eggers 1 Jul 18, 2022