PyTorch implementation of CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition

Last update: Jul 20, 2022

Overview

This is an unofficial implementation of CDistNet.

All modules described in the paper are implemented except for TPS in the visual branch. For an implementation of TPS, refer to ASTER.

Requirements

Python 3.6.8
lmdb==0.98
torch==1.5.1
torchvision==0.6.1
Pillow==6.1.0
opencv-python==4.2.0.32
numpy==1.17.1

Data preparation

A tool is provided to convert a raw dataset into an LMDB dataset. For details, refer to tools/create_lmdb_dataset.py.

You can also download a ready-made LMDB dataset from OCR_Dataset.
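As a rough illustration of what such a conversion involves, the sketch below builds the key/value pairs that text-recognition LMDB datasets commonly use. The key layout (`image-%09d`, `label-%09d`, `num-samples`) and both function names are assumptions for illustration; check tools/create_lmdb_dataset.py for the layout this repository actually expects.

```python
def build_lmdb_cache(samples):
    """Map (image_bytes, label) pairs to LMDB-style key/value byte pairs.

    The key names below are an assumed convention, not confirmed by this repo.
    """
    cache = {}
    count = 0
    for image_bytes, label in samples:
        if not image_bytes or not label:
            continue  # skip samples with an empty image or an empty label
        count += 1
        cache[b"image-%09d" % count] = image_bytes
        cache[b"label-%09d" % count] = label.encode("utf-8")
    cache[b"num-samples"] = str(count).encode("utf-8")
    return cache


def write_cache(lmdb_path, cache, map_size=1 << 30):
    """Write the pairs to disk; needs the `lmdb` package from Requirements."""
    import lmdb  # imported lazily so build_lmdb_cache works without it

    env = lmdb.open(lmdb_path, map_size=map_size)
    with env.begin(write=True) as txn:
        for key, value in cache.items():
            txn.put(key, value)
    env.close()


if __name__ == "__main__":
    # Placeholder bytes stand in for real encoded image files.
    demo = [(b"fake-image-1", "hello"), (b"", "skipped"), (b"fake-image-2", "world")]
    for key in sorted(build_lmdb_cache(demo)):
        print(key)
```

Keeping the key construction separate from the write step makes it easy to inspect the pairs before committing them to disk.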

Train

First, modify the following sections in configs/cdistnet.yml.

TrainReader: sets the path to the training LMDB dataset.
EvalReader: sets the path to the evaluation LMDB dataset.
Global: sets general arguments such as image_shape and dict_file.
VisualModule: sets the arguments of the visual branch described in the original paper.
PositionalEmbedding: sets the arguments of the positional branch.
SemanticEmbedding: sets the arguments of the semantic branch.
MDCDP: sets the arguments of the MDCDP module.

python train.py -c configs/cdistnet.yml
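Before launching a long training run, it can help to confirm that the config contains every section named above. The following is a minimal sketch that assumes the YAML file has already been loaded into a plain dictionary; `missing_sections` is a hypothetical helper, not part of this repository, and the example values are placeholders.

```python
# Top-level sections named in the Train instructions.
REQUIRED_SECTIONS = (
    "TrainReader",
    "EvalReader",
    "Global",
    "VisualModule",
    "PositionalEmbedding",
    "SemanticEmbedding",
    "MDCDP",
)


def missing_sections(config):
    """Return the required section names absent from a loaded config dict."""
    return [name for name in REQUIRED_SECTIONS if name not in config]


if __name__ == "__main__":
    # Placeholder config; the keys inside each section are illustrative only.
    config = {
        "TrainReader": {},
        "EvalReader": {},
        "Global": {"dict_file": "path/to/dict.txt"},
    }
    print("Missing sections:", missing_sections(config))
```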

Demo

Modify the following arguments in configs/cdistnet.yml.

pretrain_weights: set to the path of the model file.
infer_img: set to the path of the input image.
is_train: set to False.

python predict.py -c configs/cdistnet.yml
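The same three changes can be applied to a loaded config dictionary instead of editing the file by hand. This is only a sketch: it assumes the three keys live under the `Global` section, which the instructions above do not confirm, and `make_inference_config` and the example paths are hypothetical.

```python
import copy


def make_inference_config(config, weights_path, image_path):
    """Return a copy of `config` switched to inference mode.

    Assumes pretrain_weights, infer_img and is_train sit under `Global`;
    adjust the section name if the real config places them elsewhere.
    """
    cfg = copy.deepcopy(config)  # leave the caller's config untouched
    section = cfg.setdefault("Global", {})
    section["pretrain_weights"] = weights_path
    section["infer_img"] = image_path
    section["is_train"] = False
    return cfg


if __name__ == "__main__":
    base = {"Global": {"is_train": True}}
    # Both paths are placeholders.
    demo = make_inference_config(base, "path/to/model.pth", "path/to/image.png")
    print(demo["Global"])
```

Working on a deep copy keeps the training configuration reusable after switching to the demo settings.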

TODO

Pretrained models
Test code
Comparison with the original paper on benchmarks (CUTE, IC13, IC15, IIIT5K, SVT, SVTP)
