Patch2Pix: Epipolar-Guided Pixel-Level Correspondences [CVPR2021]

Overview

Patch2Pix for Accurate Image Correspondence Estimation

This repository contains the Pytorch implementation of our paper accepted at CVPR2021: Patch2Pix: Epipolar-Guided Pixel-Level Correspondences. [Paper] [Video].

Overview To use our code, first download the repository:

git clone [email protected]:GrumpyZhou/patch2pix.git

Setup Running Environment

The code has been tested on Ubuntu (16.04&18.04) with Python 3.7 + Pytorch 1.7.0 + CUDA 10.2.
We recommend to use Anaconda to manage packages and reproduce the paper results. Run the following lines to automatically setup a ready environment for our code.

conda env create -f environment.yml
conda activte patch2pix

Download Pretrained Models

In order to run our examples, one needs to first download our pretrained Patch2Pix model. To further train a Patch2Pix model, one needs to download the pretrained NCNet. We provide the download links in pretrained/download.sh. To download both, one can run

cd pretrained
bash download.sh

Evaluation

❗️ NOTICE ❗️ : In this repository, we only provide examples to estimate correspondences using our Patch2Pix implemenetation.

To reproduce our evalutions on HPatches, Aachen and InLoc benchmarks, we refer you to our toolbox for image matching: image-matching-toolbox. There, you can also find implementation to reproduce the results of other state-of-the-art methods that we compared to in our paper.

Matching Examples

In our notebook examples/visualize_matches.ipynb , we give examples how to obtain matches given a pair of images using both Patch2Pix (our pretrained) and NCNet (our adapted). The example image pairs are borrowed from D2Net, one can easily replace it with your own examples.

Training

Notice the followings are necessary only if you want to train a model yourself.

Data preparation

We use MegaDepth dataset for training. To keep more data for training, we didn't split a validation set from MegaDepth. Instead we use the validation splits of PhotoTourism. The following steps describe how to prepare the same training and validation data that we used.

Preapre Training Data

  1. We preprocess MegaDepth dataset following the preprocessing steps proposed by D2Net. For details, please checkout their "Downloading and preprocessing the MegaDepth dataset" section in their github documentation.

  2. Then place the processed MegaDepth dataset under data/ folder and name it as MegaDepth_undistort (or create a symbolic link for it).

  3. One can directly download our pre-computred training pairs using our download script.

cd data_pairs
bash download.sh

In case one wants to generate pairs with different settings, we provide notebooks to generate pairs from scratch. Once you finish step 1 and 2, the training pairs can be generated using our notebook data_pairs/prep_megadepth_training_pairs.ipynb.

Preapre Validation Data

  1. Use our script to dowload and extract the subset of train and val sequences from the PhotoTourism dataset.
cd data
bash prepare_immatch_val_data.sh
  1. Precompute image pairwise overlappings for fast loading of validation pairs.
# Under the root folder: patch2pix/
python -m data_pairs.precompute_immatch_val_ovs \
		--data_root data/immatch_benchmark/val_dense

Training Examples

To train our best model:

python -m train_patch2pix --gpu 0 \
    --epochs 25 --batch 4 \
    --save_step 1 --plot_counts 20 --data_root 'data' \
    --change_stride --panc 8 --ptmax 400 \
    --pretrain 'pretrained/ncn_ivd_5ep.pth' \
    -lr 0.0005 -lrd 'multistep' 0.2 5 \
    --cls_dthres 50 5 --epi_dthres 50 5  \
    -o 'output/patch2pix' 

The above command will save the log file and checkpoints to the output folder specified by -o. Our best model was trained on a 48GB GPU. To train on a smaller GPU, e.g, with 12 GB, one can either set --batch 1 or --ptmax 250 which defines the maximum number of match proposals to be refined for each image pair. However, those changes might also decrease the training performance according to our experience. Notice, during the testing, our network only requires 12GB GPU.

Usage of Visdom Server Our training script is coded to monitor the training process using Visdom. To enable the monitoring, one needs to:

  1. Run a visdom sever on your localhost, for example:
# Feel free to change the port
python -m visdom.server -port 9333 \
-env_path ~/.visdom/patch2pix
  1. Append options -vh 'localhost' -vp 9333 to the commands of the training example above.

BibTeX

If you use our method or code in your project, please cite our paper:

@inproceedings{ZhouCVPRpatch2pix,
        author       = "Zhou, Qunjie and Sattler, Torsten and Leal-Taixe, Laura",
        title        = "Patch2Pix: Epipolar-Guided Pixel-Level Correspondences",
        booktitle    = "CVPR",
        year         = 2021,
}
Owner
Qunjie Zhou
PhD Candidate at the Dynamic Vision and Learning Group.
Qunjie Zhou
A Python library for differentiable optimal control on accelerators.

A Python library for differentiable optimal control on accelerators.

Google 80 Dec 21, 2022
Code for the Image similarity challenge.

ISC 2021 This repository contains code for the Image Similarity Challenge 2021. Getting started The docs subdirectory has step-by-step instructions on

Facebook Research 173 Dec 12, 2022
UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model

UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model Official repository for the ICCV 2021 paper: UltraPose: Syn

MomoAILab 92 Dec 21, 2022
PIXIE: Collaborative Regression of Expressive Bodies

PIXIE: Collaborative Regression of Expressive Bodies [Project Page] This is the official Pytorch implementation of PIXIE. PIXIE reconstructs an expres

Yao Feng 331 Jan 04, 2023
Several simple examples for popular neural network toolkits calling custom CUDA operators.

Neural Network CUDA Example Several simple examples for neural network toolkits (PyTorch, TensorFlow, etc.) calling custom CUDA operators. We provide

WeiYang 798 Jan 01, 2023
[CVPR 2021] A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts

Visual-Reasoning-eXplanation [CVPR 2021 A Peek Into the Reasoning of Neural Networks: Interpreting with Structural Visual Concepts] Project Page | Vid

Andy_Ge 54 Dec 21, 2022
Demo project for real time anomaly detection using kafka and python

kafkaml-anomaly-detection Project for real time anomaly detection using kafka and python It's assumed that zookeeper and kafka are running in the loca

Rodrigo Arenas 36 Dec 12, 2022
Online-compatible Unsupervised Non-resonant Anomaly Detection Repository

Online-compatible Unsupervised Non-resonant Anomaly Detection Repository Repository containing all scripts used in the studies of Online-compatible Un

0 Nov 09, 2021
A curated list of awesome neural radiance fields papers

Awesome Neural Radiance Fields A curated list of awesome neural radiance fields papers, inspired by awesome-computer-vision. How to submit a pull requ

Yen-Chen Lin 3.9k Dec 27, 2022
Resources for our AAAI 2022 paper: "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification".

LOREN Resources for our AAAI 2022 paper (pre-print): "LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification". DEMO System Check out o

Jiangjie Chen 37 Dec 27, 2022
A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM's

sign-language-detection A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM. The project is built for a vocabular

Hashim 4 Feb 06, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer This repository contains code to compute depth from a

Intelligent Systems Lab Org 2.3k Jan 01, 2023
The original implementation of TNDM used in the NeurIPS 2021 paper (no longer being updated)

TNDM - Targeted Neural Dynamical Modeling Note: This code is no longer being updated. The official re-implementation can be found at: https://github.c

1 Jul 21, 2022
NeuroFind - A solution to the to the Task given by the Oberseminar of Messtechnik Institute of TU Dresden in 2021

NeuroFind A solution to the to the Task given by the Oberseminar of Messtechnik

1 Jan 20, 2022
Official implementation for the paper "Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection"

Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection PyTorch code release of the paper "Attentive Prototypes for Sour

Deepti Hegde 23 Oct 17, 2022
Official repository of the paper "GPR1200: A Benchmark for General-PurposeContent-Based Image Retrieval"

GPR1200 Dataset GPR1200: A Benchmark for General-Purpose Content-Based Image Retrieval (ArXiv) Konstantin Schall, Kai Uwe Barthel, Nico Hezel, Klaus J

Visual Computing Group 16 Nov 21, 2022
The Body Part Regression (BPR) model translates the anatomy in a radiologic volume into a machine-interpretable form.

Copyright © German Cancer Research Center (DKFZ), Division of Medical Image Computing (MIC). Please make sure that your usage of this code is in compl

MIC-DKFZ 40 Dec 18, 2022
This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation

This is a GUI interface which can process forest fire detection, smoke detection and fire segmentation. Yolov5 is used to detect fire and smoke and unet is used to segment fire.

7 Jan 08, 2023
YolactEdge: Real-time Instance Segmentation on the Edge

YolactEdge, the first competitive instance segmentation approach that runs on small edge devices at real-time speeds. Specifically, YolactEdge runs at up to 30.8 FPS on a Jetson AGX Xavier (and 172.7

Haotian Liu 1.1k Jan 06, 2023
A PyTorch Implementation of Gated Graph Sequence Neural Networks (GGNN)

A PyTorch Implementation of GGNN This is a PyTorch implementation of the Gated Graph Sequence Neural Networks (GGNN) as described in the paper Gated G

Ching-Yao Chuang 427 Dec 13, 2022