PyTorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Last update: Sep 15, 2022

Value Retrieval with Arbitrary Queries for Form-like Documents

Introduction

PyTorch Implementation of Value Retrieval with Arbitrary Queries for Form-like Documents.

Environment

CUDA="11.0"
CUDNN="8"
UBUNTU="18.04"
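
To compare a local setup against the versions listed above, a minimal sketch follows. It assumes PyTorch is already installed and reads `torch.version.cuda` and `torch.backends.cudnn.version()`. The helper names `same_major_minor` and `report_environment` are illustrative and not part of this repository.

```python
def same_major_minor(found, expected):
    """Return True if two version strings agree on major.minor (e.g. '11.0')."""
    if found is None:
        return False
    return found.split(".")[:2] == expected.split(".")[:2]


def report_environment(expected_cuda="11.0"):
    """Print the CUDA and cuDNN versions PyTorch was built with, if importable."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; cannot check CUDA/cuDNN versions.")
        return None
    cuda = torch.version.cuda               # None for CPU-only builds
    cudnn = torch.backends.cudnn.version()  # integer version code, or None
    print(f"CUDA (PyTorch build): {cuda}, expected {expected_cuda}")
    print(f"cuDNN version code: {cudnn}")
    return same_major_minor(cuda, expected_cuda)


if __name__ == "__main__":
    report_environment()
```

A mismatch does not necessarily mean the code will fail; the versions above are the ones the authors list.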

Install

bash install.sh
git clone https://github.com/NVIDIA/apex && cd apex
pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
pip install .
# under our project root folder
pip install .

Data Preparation

Our model is pre-trained on the IIT-CDIP dataset, fine-tuned on the FUNSD training set, and evaluated on the FUNSD and INV-CDIP test sets.

Download our processed OCR results of IIT-CDIP with hocr_list_addr.txt and put them under PRETRAIN_DATA_FOLDER/.
Download our processed FUNSD and INV-CDIP datasets and put them under DATA_DIR/.
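
Before running anything, it can help to confirm that the datasets landed where the later commands look for them. The sketch below assumes the subfolders are named `FUNSD` and `INV-CDIP`, as suggested by the `$DATA_DIR/FUNSD/` and `$DATA_DIR/DATASET_NAME` arguments used elsewhere in this README. The function name `missing_datasets` is hypothetical.

```python
from pathlib import Path

# Subfolder names assumed from the commands in this README, not verified
# against the downloaded archives.
EXPECTED_DATASETS = ("FUNSD", "INV-CDIP")


def missing_datasets(data_dir, names=EXPECTED_DATASETS):
    """Return the dataset subfolders that are not present under data_dir."""
    root = Path(data_dir)
    return [name for name in names if not (root / name).is_dir()]
```

For example, `missing_datasets("/path/to/DATA_DIR")` returns an empty list when both folders exist.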

Reproduce Our Results

Download our model fine-tuned on FUNSD here.
Run inference as follows:

# $MODEL_PATH is where you saved the fine-tuned model.
# DATASET_NAME is FUNSD or INV-CDIP.
bash reproduce_results.sh $MODEL_PATH $DATA_DIR/DATASET_NAME
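
To evaluate on both datasets in one go, the command above can be wrapped in a small driver. This is a sketch, assuming `reproduce_results.sh` is in the current working directory and that the dataset folders are named `FUNSD` and `INV-CDIP`. The functions `build_commands` and `run_all` are illustrative and not part of the repository.

```python
import subprocess


def build_commands(model_path, data_dir, datasets=("FUNSD", "INV-CDIP")):
    """Build one `bash reproduce_results.sh MODEL DATA` command per dataset."""
    return [
        ["bash", "reproduce_results.sh", model_path, f"{data_dir}/{name}"]
        for name in datasets
    ]


def run_all(model_path, data_dir):
    """Run the evaluation script on each dataset, stopping at the first failure."""
    for cmd in build_commands(model_path, data_dir):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on non-zero exit
```

Calling `run_all("/path/to/model", "/path/to/DATA_DIR")` runs the two evaluations sequentially.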

You should get the following results.

Datasets	Precision	Recall	F1
FUNSD	60.4	60.9	60.7
INV-CDIP	50.5	47.6	49.0
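
As a sanity check when comparing your own numbers with the table, the F1 column can be recomputed from precision and recall. The sketch below assumes the standard definition of F1 as the harmonic mean of precision and recall; the README does not state how the metric is computed. Because the table values are rounded, a recomputed F1 may differ from the reported one by up to about 0.1.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "FUNSD": (60.4, 60.9, 60.7),
    "INV-CDIP": (50.5, 47.6, 49.0),
}

for name, (p, r, f1) in reported.items():
    print(f"{name}: recomputed F1 = {f1_score(p, r):.2f}, reported F1 = {f1}")
```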

Pre-training

You can skip the following steps by downloading our pre-trained SimpleDLM model here.
Otherwise, download layoutlm-base-uncased and run pre-training as follows:

# $NUM_GPUS is the number of GPUs to use for pre-training. To reproduce the paper's results, we recommend using 8 GPUs.
# $MODEL_PATH is where you saved the LayoutLM model.
# $PRETRAIN_DATA_FOLDER is the folder containing the IIT-CDIP hOCR files.

python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS pretraining.py \
--model_name_or_path $MODEL_PATH  --data_dir $PRETRAIN_DATA_FOLDER \
--output_dir $OUTPUT_DIR

Fine-tuning

Run fine-tuning as follows:

# $MODEL_PATH is where you saved the pre-trained SimpleDLM model.

CUDA_VISIBLE_DEVICES=0 python run_query_value_retrieval.py --model_type simpledlm --model_name_or_path $MODEL_PATH \
--data_dir $DATA_DIR/FUNSD/ --output_dir $OUTPUT_DIR --do_train --evaluate_during_training

Citation

If you find this codebase useful, please cite our paper:

@article{gao2021value,
  title={Value Retrieval with Arbitrary Queries for Form-like Documents},
  author={Gao, Mingfei and Xue, Le and Ramaiah, Chetan and Xing, Chen and Xu, Ran and Xiong, Caiming},
  journal={arXiv preprint arXiv:2112.07820},
  year={2021}
}

Contact

Please send an email to [email protected] or [email protected] if you have questions.

Owner

Salesforce
