Official implementations of PSENet, PAN and PAN++.

Overview

News

  • (2021/11/03) Paddle implementation of PAN, see Paddle-PANet. Thanks @simplify23.
  • (2021/04/08) PSENet and PAN are included in MMOCR.

Introduction

This repository contains the official implementations of PSENet, PAN, PAN++, and FAST [coming soon].

Text Detection
Text Spotting

Installation

First, clone the repository locally:

git clone https://github.com/whai362/pan_pp.pytorch.git

Then, install PyTorch 1.1.0+, torchvision 0.3.0+, and other requirements:

conda install pytorch torchvision -c pytorch
pip install -r requirement.txt
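
To confirm that the installed packages meet the minimum versions stated above, a small check can be run after installation. This is a minimal sketch: the helper names `version_tuple` and `meets_minimum` are hypothetical and not part of this repository, and local build suffixes such as `+cu101` are simply ignored.

```python
import importlib
import re


def version_tuple(version):
    """Turn a version string such as '1.6.0+cu101' into (1, 6, 0)."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    parts = [int(n) for n in numbers[:3]]
    parts += [0] * (3 - len(parts))  # pad '1.1' to (1, 1, 0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


if __name__ == "__main__":
    # Minimum versions taken from the installation instructions.
    for name, minimum in (("torch", "1.1.0"), ("torchvision", "0.3.0")):
        try:
            module = importlib.import_module(name)
        except ImportError:
            print(f"{name}: not installed")
            continue
        status = "ok" if meets_minimum(module.__version__, minimum) else "too old"
        print(f"{name} {module.__version__}: {status} (need {minimum}+)")
```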

Finally, compile the post-processing code:

# build pse and pa algorithms
sh ./compile.sh
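
If it is unclear whether the build succeeded, one way to check is to look for the compiled extension files. The sketch below assumes the build places them under `models/post_processing` (for example `models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so` on Linux); the helper name `find_compiled_extensions` is hypothetical, and the exact file names depend on the platform and Python version.

```python
from pathlib import Path

# Suffixes commonly used for compiled Python extension modules.
EXTENSION_SUFFIXES = (".so", ".pyd")


def find_compiled_extensions(root="models/post_processing"):
    """Return the compiled extension files found under `root`, sorted by path."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    return sorted(
        path
        for path in root_path.rglob("*")
        if path.is_file() and path.suffix in EXTENSION_SUFFIXES
    )


if __name__ == "__main__":
    found = find_compiled_extensions()
    if found:
        for path in found:
            print("built:", path)
    else:
        print("no compiled extension found; try re-running sh ./compile.sh")
```

Run it from the repository root. An empty result suggests the compile step did not produce an extension module.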

Dataset

Please refer to dataset/README.md for dataset preparation.

Training

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py ${CONFIG_FILE}

For example:

CUDA_VISIBLE_DEVICES=0,1,2,3 python train.py config/pan/pan_r18_ic15.py
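
The same launch can be scripted from Python, which is convenient when queuing several configs. This is a sketch only: `launch_training` is a hypothetical helper, not part of the repository, and it assumes the script is run from the repository root where `train.py` lives.

```python
import os
import subprocess
import sys


def launch_training(config_file, gpu_ids, dry_run=True):
    """Build (and optionally run) the training command for the given GPUs."""
    visible = ",".join(str(i) for i in gpu_ids)
    # Copy the current environment and restrict the visible GPUs.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=visible)
    argv = [sys.executable, "train.py", config_file]
    if not dry_run:
        subprocess.run(argv, env=env, check=True)
    return visible, argv


visible, argv = launch_training("config/pan/pan_r18_ic15.py", [0, 1, 2, 3])
print(f"CUDA_VISIBLE_DEVICES={visible}", "python", *argv[1:])
```

With `dry_run=True` (the default here) nothing is executed; pass `dry_run=False` to start training.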

Testing

Evaluate the performance

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE}
cd eval/
./eval_${DATASET}.sh

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
cd eval/
./eval_ic15.sh
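
In the example, the checkpoint directory carries the same name as the config file. Assuming that naming pattern holds for other configs as well (an inference from the example, not something stated here), the checkpoint path can be derived from the config path. `default_checkpoint_path` is a hypothetical helper name.

```python
from pathlib import PurePosixPath


def default_checkpoint_path(config_file, checkpoint_root="checkpoints"):
    """Map config/pan/pan_r18_ic15.py to checkpoints/pan_r18_ic15/checkpoint.pth.tar."""
    config_name = PurePosixPath(config_file).stem  # file name without ".py"
    return str(PurePosixPath(checkpoint_root) / config_name / "checkpoint.pth.tar")


print(default_checkpoint_path("config/pan/pan_r18_ic15.py"))
```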

Evaluate the speed

python test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --report_speed

For example:

python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar --report_speed
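
The commands above imply a command line with two positional arguments and one optional flag. The sketch below is a hypothetical stand-in for that interface, not the actual parser in `test.py`; the attribute names `config` and `checkpoint` are assumptions.

```python
import argparse


def build_parser():
    """Sketch of a parser that accepts the test commands shown above."""
    parser = argparse.ArgumentParser(description="sketch of the test.py command line")
    parser.add_argument("config", help="path to a config file")
    parser.add_argument("checkpoint", nargs="?", default=None,
                        help="path to a checkpoint file")
    parser.add_argument("--report_speed", action="store_true",
                        help="report inference speed instead of only writing results")
    return parser


args = build_parser().parse_args([
    "config/pan/pan_r18_ic15.py",
    "checkpoints/pan_r18_ic15/checkpoint.pth.tar",
    "--report_speed",
])
print(args.config, args.checkpoint, args.report_speed)
```

Only arguments declared on the parser become attributes of the parsed namespace, so code that reads an undeclared attribute fails with an `AttributeError`.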

Citation

Please cite the related works in your publications if they help your research:

PSENet

@inproceedings{wang2019shape,
  title={Shape Robust Text Detection with Progressive Scale Expansion Network},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Hou, Wenbo and Lu, Tong and Yu, Gang and Shao, Shuai},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={9336--9345},
  year={2019}
}

PAN

@inproceedings{wang2019efficient,
  title={Efficient and Accurate Arbitrary-Shaped Text Detection with Pixel Aggregation Network},
  author={Wang, Wenhai and Xie, Enze and Song, Xiaoge and Zang, Yuhang and Wang, Wenjia and Lu, Tong and Yu, Gang and Shen, Chunhua},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  pages={8440--8449},
  year={2019}
}

PAN++

@article{wang2021pan++,
  title={PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text},
  author={Wang, Wenhai and Xie, Enze and Li, Xiang and Liu, Xuebo and Liang, Ding and Yang, Zhibo and Lu, Tong and Shen, Chunhua},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
  year={2021},
  publisher={IEEE}
}

FAST

@misc{chen2021fast,
  title={FAST: Searching for a Faster Arbitrarily-Shaped Text Detector with Minimalist Kernel Representation}, 
  author={Zhe Chen and Wenhai Wang and Enze Xie and ZhiBo Yang and Tong Lu and Ping Luo},
  year={2021},
  eprint={2111.02394},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

License

This project is developed and maintained by IMAGINE Lab @ National Key Laboratory for Novel Software Technology, Nanjing University.

This project is released under the Apache 2.0 license.

Comments
  • Evaluation of the performance result

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by dikubab 9
  • _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    A more complete log is below:

    Epoch: [1 | 600]
    /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:2941: UserWarning: nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.
      warnings.warn("nn.functional.upsample is deprecated. Use nn.functional.interpolate instead.")
    /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/torch/nn/functional.py:3121: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
      "See the documentation of nn.Upsample for details.".format(mode))
    (1/374) LR: 0.001000 | Batch: 2.668s | Total: 0min | ETA: 17min | Loss: 1.619 | Loss(text/kernel/emb/rec): 0.680/0.193/0.746/0.000 | IoU(text/kernel): 0.324/0.335 | Acc rec: 0.000
    Traceback (most recent call last):
      File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
        obj = _ForkingPickler.dumps(obj)
      File "/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
        cls(buf, protocol).dump(obj)
    _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    The code runs normally with the CTW1500 dataset but fails with my own dataset.

    The first iteration (1/374) seems fine, so I have no idea what is wrong.

    opened by Zhang-O 5
  • A question about training

    Hello! I am training on my own data, and the training log looks like this:

    Epoch: [212 | 600]
    (1/198) LR: 0.000677 | Batch: 3.934s | Total: 0min | ETA: 13min | Loss: 0.752 | Loss(text/kernel/emb/rec): 0.493/0.199/0.059/0.000 | IoU(text/kernel): 0.055/0.553 | Acc rec: 0.000
    (21/198) LR: 0.000677 | Batch: 1.089s | Total: 0min | ETA: 3min | Loss: 0.731 | Loss(text/kernel/emb/rec): 0.478/0.199/0.054/0.000 | IoU(text/kernel): 0.048/0.482 | Acc rec: 0.000
    (41/198) LR: 0.000677 | Batch: 1.022s | Total: 1min | ETA: 3min | Loss: 0.732 | Loss(text/kernel/emb/rec): 0.478/0.198/0.056/0.000 | IoU(text/kernel): 0.049/0.476 | Acc rec: 0.000

    Acc rec stays at 0. After I stopped training and ran the test on my test data, the output was empty. What could be the cause? Thank you!

    opened by mayidu 3
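  • Questions about the post-processing

    1. In the post-processing code, when the area ratio of two connected components in the kernel exceeds max_rate, the flag of both components is set to 1. During expansion, a point is left unexpanded only when the flag of the component it belongs to is 1 and its distance to the kernel's similarity vector is greater than 3. What is the purpose of setting this flag? Would it work to check only the distance to the kernel's similarity vector?
    2. In the paper, the Euclidean-distance threshold between an expanded point and the kernel's similarity vector is 6, but in the code it is 3. In practice, what does this value depend on? Is it tied to certain characteristics of the dataset?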
    opened by jewelc92 3
  • Regarding pa.pyx

    Hi,

    I tried running your code and noticed that the last line of pa.pyx is

    return _pa(kernels[:-1], emb, label, cc, kernel_num, label_num, min_area)

    Looks like this should be

    return _pa(kernels, emb, label, cc, kernel_num, label_num, min_area)

    So that we can scan over all kernels (you skip the last kernel) and there is no crash in this function. Am I correct?

    Thanks.

    opened by liuch37 3
  • AttributeError: 'Namespace' object has no attribute 'resume'

    With PAN++ on IC15, an error appears when trying to test the model:

    reading type: pil.
    Traceback (most recent call last):
      File "test.py", line 155, in <module>
        main(args)
      File "test.py", line 138, in main
        print("No checkpoint found at '{}'".format(args.resume))
    AttributeError: 'Namespace' object has no attribute 'resume'

    opened by lrjj 2
  • Problem encountered when training on Total Text

    After running python train.py config/pan/pan_r18_tt.py, the following occurs:

    Traceback (most recent call last):
      File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/queues.py", line 234, in _feed
        obj = _ForkingPickler.dumps(obj)
      File "/home/dell2/anaconda3/envs/pannet/lib/python3.6/multiprocessing/reduction.py", line 51, in dumps
        cls(buf, protocol).dump(obj)
    _pickle.PicklingError: Can't pickle <class 'cPolygon.Error'>: import of module 'cPolygon' failed

    The problem seems to occur during iteration, and only when training on the TT dataset. How can it be solved? Thank you.

    opened by mashumli 2
  • Running test.py raises TypeError: 'module' object is not callable

    After setting the model path and the config file path, running python test.py gives the following:

    Traceback (most recent call last):
      File "test.py", line 117, in <module>
        main(args)
      File "test.py", line 107, in main
        test(test_loader, model, cfg)
      File "test.py", line 56, in test
        outputs = model(**data)
      File "/home/ethony/anaconda3/envs/ocr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 547, in __call__
        result = self.forward(*input, **kwargs)
      File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/pan.py", line 104, in forward
        det_res = self.det_head.get_results(det_out, img_metas, cfg)
      File "/media/ethony/C14D581BDA18EBFA/lyg_datas_and_code/OCR_work/pan_pp.pytorch-master/models/head/pa_head.py", line 65, in get_results
        label = pa(kernels, emb)
    TypeError: 'module' object is not callable

    Judging from the message, pa under model/post_processing was not imported correctly and was imported as a module instead. How can this be fixed?

    opened by ethanlighter 2
  • problems in train.py

    Hi. When I run 'python train.py config/pan/pan_r18_ic15.py', I get the following error. Do you know how to solve the problem? Thank you very much.

    Traceback (most recent call last):
      File "train.py", line 234, in <module>
        main(args)
      File "train.py", line 216, in main
        train(train_loader, model, optimizer, epoch, start_iter, cfg)
      File "train.py", line 41, in train
        for iter, data in enumerate(train_loader):
      File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 435, in __next__
        data = self._next_data()
      File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1085, in _next_data
        return self._process_data(data)
      File "D:\Anaconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1111, in _process_data
        data.reraise()
      File "D:\Anaconda3\lib\site-packages\torch\_utils.py", line 428, in reraise
        raise self.exc_type(msg)
    TypeError: function takes exactly 5 arguments (1 given)

    opened by YUDASHUAI916 2
  • not sure about run compile.sh

    (zyl_torch16) /data/zhangyl/pan_pp.pytorch-master$ sh ./compile.sh
    Compiling pa.pyx because it depends on /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/__init__.pxd.
    [1/1] Cythonizing pa.pyx
    /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.pyx
      tree = Parsing.p_module(s, pxd, full_module_name)
    running build_ext
    building 'pa' extension
    creating build
    creating build/temp.linux-x86_64-3.7
    gcc -pthread -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include -I/data/tools/anaconda3/envs/zyl_torch16/include/python3.7m -c pa.cpp -o build/temp.linux-x86_64-3.7/pa.o -O3
    cc1plus: warning: command line option ‘-Wstrict-prototypes’ is valid for C/ObjC but not for C++
    In file included from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:0,
                     from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/ndarrayobject.h:12,
                     from /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/arrayobject.h:4,
                     from pa.cpp:647:
    /data/tools/anaconda3/envs/zyl_torch16/lib/python3.7/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: #warning "Using deprecated NumPy API, disable it with " "#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-Wcpp]
     #warning "Using deprecated NumPy API, disable it with "
      ^~~~~~~
    g++ -pthread -shared -B /data/tools/anaconda3/envs/zyl_torch16/compiler_compat -L/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,-rpath=/data/tools/anaconda3/envs/zyl_torch16/lib -Wl,--no-as-needed -Wl,--sysroot=/ build/temp.linux-x86_64-3.7/pa.o -o /data/zhangyl/pan_pp.pytorch-master/models/post_processing/pa/pa.cpython-37m-x86_64-linux-gnu.so
    (zyl_torch16) /data/zhangyl/pan_pp.pytorch-master$

    This is the compile log. I am not sure whether the build succeeded.

    opened by Zhang-O 2
  • morphology operations from kornia

    Hi,

    Your FAST paper is really amazing! While you already have an implementation of erosion/dilation, let me suggest our set of morphology operations, implemented in pure PyTorch: https://kornia.readthedocs.io/en/latest/morphology.html

    https://kornia-tutorials.readthedocs.io/en/master/morphology_101.html

    Best, Dmytro.

    opened by ducha-aiki 1
  • The sample 199 not present in GT

    Hello Author, First of all, I would like to appreciate your work and effort. I have tried your repo. The evaluation code gives me an error of the "The sample 199 not present in GT," but the label text is there. When I tried to see the result via visualizing it on the images, it seems good. Let me know if there is any solution from your side.

    opened by zeng-cy 1
  • How to predict a new image using the training weight? It doesn't work below.

    How can I run prediction on a new image using the trained weights? The commands below do not work:

    python test.py config/pan/pan_r18_ic15.py checkpoints/pan_r18_ic15/checkpoint.pth.tar
    cd eval/
    ./eval_ic15.sh

    Please let me know by email or WeChat (SanQian-2012). Thank you so much.

    Originally posted by @Devin521314 in https://github.com/whai362/pan_pp.pytorch/issues/91#issuecomment-1233810612

    opened by Devin521314 0
  • Why rec encoder use EOS? not SOS

    Hi: I find there is no 'SOS' in the code. As I understand it, an SOS token should be embedded at the beginning. Please explain, thanks!

    Code:

    class Encoder(nn.Module):
        def __init__(self, hidden_dim, voc, char2id, id2char):
            super(Encoder, self).__init__()
            self.hidden_dim = hidden_dim
            self.vocab_size = len(voc)
            # The start token reuses the 'EOS' entry of the vocabulary.
            self.START_TOKEN = char2id['EOS']
            self.emb = nn.Embedding(self.vocab_size, self.hidden_dim)
            self.att = MultiHeadAttentionLayer(self.hidden_dim, 8)

        def forward(self, x):
            batch_size, feature_dim, H, W = x.size()
            x_flatten = x.view(batch_size, feature_dim, H * W).permute(0, 2, 1)
            st = x.new_full((batch_size,), self.START_TOKEN, dtype=torch.long)
            emb_st = self.emb(st)
            holistic_feature, _ = self.att(emb_st, x_flatten, x_flatten)
            return
    
    opened by Patickk 0
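
One of the questions above describes the expansion rule in the post-processing step: a pixel is left out only when its connected component is flagged and its distance to the kernel's similarity vector exceeds a threshold of 3. The sketch below restates that description as two predicates so the effect of the flag is easy to see. It is not the repository's Cython code, and the function names are hypothetical.

```python
def is_rejected(component_flag, distance, threshold=3.0):
    """Rule as described: reject only flagged components that are also far away."""
    return component_flag == 1 and distance > threshold


def is_rejected_distance_only(distance, threshold=3.0):
    """Alternative raised in the question: ignore the flag and test distance alone."""
    return distance > threshold


# (flag, distance) pairs: unflagged and far, flagged and far, flagged and near.
cases = [(0, 5.0), (1, 5.0), (1, 2.0)]
for flag, dist in cases:
    print(flag, dist, is_rejected(flag, dist), is_rejected_distance_only(dist))
```

The two rules differ only for unflagged components: with the flag, a distant pixel whose component is not flagged is still merged, whereas the distance-only rule would reject it.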