LETR: Line Segment Detection Using Transformers without Edges

Related tags

Deep LearningLETR
Overview

LETR: Line Segment Detection Using Transformers without Edges

Introduction

This repository contains the official code and pretrained models for Line Segment Detection Using Transformers without Edges. Yifan Xu*, Weijian Xu*, David Cheung, and Zhuowen Tu. CVPR2021 (Oral)

In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that is post-processing and heuristics-guided intermediate processing (edge/junction/region detection) free. Our method, named LinE segment TRansformers (LETR), takes advantages of having integrated tokenized queries, a self-attention mechanism, and encoding-decoding strategy within Transformers by skipping standard heuristic designs for the edge element detection and perceptual grouping processes. We equip Transformers with a multi-scale encoder/decoder strategy to perform fine-grained line segment detection under a direct endpoint distance loss. This loss term is particularly suitable for detecting geometric structures such as line segments that are not conveniently represented by the standard bounding box representations. The Transformers learn to gradually refine line segments through layers of self-attention.

Model Pipeline

Changelog

05/07/2021: Code for LETR Basic Usage Demo are released.

04/30/2021: Code and pre-trained checkpoint for LETR are released.

Results and Checkpoints

Name sAP10 sAP15 sF10 sF15 URL
Wireframe 65.6 68.0 66.1 67.4 LETR-R101
YorkUrban 29.6 32.0 40.5 42.1 LETR-R50

Reproducing Results

Step1: Code Preparation

git clone https://github.com/mlpc-ucsd/LETR.git

Step2: Environment Installation

mkdir -p data
mkdir -p evaluation/data
mkdir -p exp


conda create -n letr python anaconda
conda activate letr
conda install -c pytorch pytorch torchvision
conda install cython scipy
pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
pip install docopt

Step3: Data Preparation

To reproduce our results, you need to process two datasets, ShanghaiTech and YorkUrban. Files located at ./helper/wireframe.py and ./helper/york.py are both modified based on the code from L-CNN, which process the raw data from download.

  • ShanghaiTech Train Data
    • To Download (modified based on from L-CNN)
      cd data
      bash ../helper/gdrive-download.sh 1BRkqyi5CKPQF6IYzj_dQxZFQl0OwbzOf wireframe_raw.tar.xz
      tar xf wireframe_raw.tar.xz
      rm wireframe_raw.tar.xz
      python ../helper/wireframe.py ./wireframe_raw ./wireframe_processed
      
  • YorkUrban Train Data
    • To Download
      cd data
      wget https://www.dropbox.com/sh/qgsh2audfi8aajd/AAAQrKM0wLe_LepwlC1rzFMxa/YorkUrbanDB.zip
      unzip YorkUrbanDB.zip 
      python ../helper/york.py ./YorkUrbanDB ./york_processed
      
  • Processed Evaluation Data
    bash ./helper/gdrive-download.sh 1T4_6Nb5r4yAXre3lf-zpmp3RbmyP1t9q ./evaluation/data/wireframe.tar.xz
    bash ./helper/gdrive-download.sh 1ijOXv0Xw1IaNDtp1uBJt5Xb3mMj99Iw2 ./evaluation/data/york.tar.xz
    tar -vxf ./evaluation/data/wireframe.tar.xz -C ./evaluation/data/.
    tar -vxf ./evaluation/data/york.tar.xz -C ./evaluation/data/.
    rm ./evaluation/data/wireframe.tar.xz
    rm ./evaluation/data/york.tar.xz

Step4: Train Script Examples

  1. Train a coarse-model (a.k.a. stage1 model).

    # Usage: bash script/*/*.sh [exp name]
    bash script/train/a0_train_stage1_res50.sh  res50_stage1 # LETR-R50  
    bash script/train/a1_train_stage1_res101.sh res101_stage1 # LETR-R101 
  2. Train a fine-model (a.k.a. stage2 model).

    # Usage: bash script/*/*.sh [exp name]
    bash script/train/a2_train_stage2_res50.sh  res50_stage2  # LETR-R50
    bash script/train/a3_train_stage2_res101.sh res101_stage2 # LETR-R101 
  3. Fine-tune the fine-model with focal loss (a.k.a. stage2_focal model).

    # Usage: bash script/*/*.sh [exp name]
    bash script/train/a4_train_stage2_focal_res50.sh   res50_stage2_focal # LETR-R50
    bash script/train/a5_train_stage2_focal_res101.sh  res101_stage2_focal # LETR-R101 

Step5: Evaluation

  1. Evaluate models.
    # Evaluate sAP^10, sAP^15, sF^10, sF^15 (both Wireframe and YorkUrban datasets).
    bash script/evaluation/eval_stage1.sh [exp name]
    bash script/evaluation/eval_stage2.sh [exp name]
    bash script/evaluation/eval_stage2_focal.sh [exp name]

Citation

If you use this code for your research, please cite our paper:

@InProceedings{Xu_2021_CVPR,
    author    = {Xu, Yifan and Xu, Weijian and Cheung, David and Tu, Zhuowen},
    title     = {Line Segment Detection Using Transformers Without Edges},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {4257-4266}
}

Acknowledgments

This code is based on the implementations of DETR: End-to-End Object Detection with Transformers.

Owner
mlpc-ucsd
mlpc-ucsd
Angle data is a simple data type.

angledat Angle data is a simple data type. Installing + using Put angledat.py in the main dir of your project. Import it and use. Comments Comments st

1 Jan 05, 2022
Semantic Image Synthesis with SPADE

Semantic Image Synthesis with SPADE New implementation available at imaginaire repository We have a reimplementation of the SPADE method that is more

NVIDIA Research Projects 7.3k Jan 07, 2023
Pytorch-diffusion - A basic PyTorch implementation of 'Denoising Diffusion Probabilistic Models'

PyTorch implementation of 'Denoising Diffusion Probabilistic Models' This reposi

Arthur Juliani 76 Jan 07, 2023
Code for "Reconstructing 3D Human Pose by Watching Humans in the Mirror", CVPR 2021 oral

Reconstructing 3D Human Pose by Watching Humans in the Mirror Qi Fang*, Qing Shuai*, Junting Dong, Hujun Bao, Xiaowei Zhou CVPR 2021 Oral The videos a

ZJU3DV 178 Dec 13, 2022
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
Project of 'TBEFN: A Two-branch Exposure-fusion Network for Low-light Image Enhancement '

TBEFN: A Two-branch Exposure-fusion Network for Low-light Image Enhancement Codes for TMM20 paper "TBEFN: A Two-branch Exposure-fusion Network for Low

KUN LU 31 Nov 06, 2022
harmonic-percussive-residual separation algorithm wrapped as a VST3 plugin (iPlug2)

Harmonic-percussive-residual separation plug-in This work is a study on the plausibility of a sines-transients-noise decomposition inspired algorithm

Derp Learning 9 Sep 01, 2022
Learning infinite-resolution image processing with GAN and RL from unpaired image datasets, using a differentiable photo editing model.

Exposure: A White-Box Photo Post-Processing Framework ACM Transactions on Graphics (presented at SIGGRAPH 2018) Yuanming Hu1,2, Hao He1,2, Chenxi Xu1,

Yuanming Hu 719 Dec 29, 2022
A Python library for common tasks on 3D point clouds

Point Cloud Utils (pcu) - A Python library for common tasks on 3D point clouds Point Cloud Utils (pcu) is a utility library providing the following fu

Francis Williams 622 Dec 27, 2022
SASM - simple crossplatform IDE for NASM, MASM, GAS and FASM assembly languages

SASM (SimpleASM) - простая кроссплатформенная среда разработки для языков ассемблера NASM, MASM, GAS, FASM с подсветкой синтаксиса и отладчиком. В SA

Dmitriy Manushin 5.6k Jan 06, 2023
PyTorch implementation of MICCAI 2018 paper "Liver Lesion Detection from Weakly-labeled Multi-phase CT Volumes with a Grouped Single Shot MultiBox Detector"

Grouped SSD (GSSD) for liver lesion detection from multi-phase CT Note: the MICCAI 2018 paper only covers the multi-phase lesion detection part of thi

Sang-gil Lee 36 Oct 12, 2022
Learning cell communication from spatial graphs of cells

ncem Features Repository for the manuscript Fischer, D. S., Schaar, A. C. and Theis, F. Learning cell communication from spatial graphs of cells. 2021

Theis Lab 77 Dec 30, 2022
Implementation of TimeSformer, a pure attention-based solution for video classification

TimeSformer - Pytorch Implementation of TimeSformer, a pure and simple attention-based solution for reaching SOTA on video classification.

Phil Wang 602 Jan 03, 2023
OpenDelta - An Open-Source Framework for Paramter Efficient Tuning.

OpenDelta is a toolkit for parameter efficient methods (we dub it as delta tuning), by which users could flexibly assign (or add) a small amount parameters to update while keeping the most paramters

THUNLP 386 Dec 26, 2022
Diagnostic tests for linguistic capacities in language models

LM diagnostics This repository contains the diagnostic datasets and experimental code for What BERT is not: Lessons from a new suite of psycholinguist

61 Jan 02, 2023
Official repository of "BasicVSR++: Improving Video Super-Resolution with Enhanced Propagation and Alignment"

BasicVSR_PlusPlus (CVPR 2022) [Paper] [Project Page] [Code] This is the official repository for BasicVSR++. Please feel free to raise issue related to

Kelvin C.K. Chan 227 Jan 01, 2023
IMBENS: class-imbalanced ensemble learning in Python.

IMBENS: class-imbalanced ensemble learning in Python. Links: [Documentation] [Gallery] [PyPI] [Changelog] [Source] [Download] [知乎/Zhihu] [中文README] [a

Zhining Liu 176 Jan 04, 2023
Official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

Vision Transformer with Progressive Sampling This is the official implementation of the paper Vision Transformer with Progressive Sampling, ICCV 2021.

yuexy 123 Jan 01, 2023
Code for our CVPR 2022 Paper "GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection"

GEN-VLKT Code for our CVPR 2022 paper "GEN-VLKT: Simplify Association and Enhance Interaction Understanding for HOI Detection". Contributed by Yue Lia

Yue Liao 47 Dec 04, 2022
Implementation of neural class expression synthesizers

NCES Implementation of neural class expression synthesizers (NCES) Installation Clone this repository: https://github.com/ConceptLengthLearner/NCES.gi

NeuralConceptSynthesis 0 Jan 06, 2022