Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

Last update: Jan 04, 2023

Overview

IterMVS

official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

Introduction

IterMVS is a novel learning-based MVS method combining highest efficiency and competitive reconstruction quality. We propose a novel GRU-based estimator that encodes pixel-wise probability distributions of depth in its hidden state. Ingesting multi-scale matching information, our model refines these distributions over multiple iterations and infers depth and confidence. Extensive experiments on DTU, Tanks & Temples and ETH3D show highest efficiency in both memory and run-time, and a better generalization ability than many state-of-the-art learning-based methods.

If you find this project useful for your research, please cite:

@misc{wang2021itermvs,
      title={IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo}, 
      author={Fangjinhua Wang and Silvano Galliani and Christoph Vogel and Marc Pollefeys},
      year={2021},
      eprint={2112.05126},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Installation

Requirements

python 3.6
CUDA 10.1

pip install -r requirements.txt

Reproducing Results

Download pre-processed datasets (provided by PatchmatchNet): DTU's evaluation set, Tanks & Temples and ETH3D benchmark. Each dataset is organized as follows:

root_directory
├──scan1 (scene_name1)
├──scan2 (scene_name2) 
      ├── images                 
      │   ├── 00000000.jpg       
      │   ├── 00000001.jpg       
      │   └── ...                
      ├── cams_1                   
      │   ├── 00000000_cam.txt   
      │   ├── 00000001_cam.txt   
      │   └── ...                
      └── pair.txt

Camera file cam.txt stores the camera parameters, which includes extrinsic, intrinsic, minimum depth and maximum depth:

extrinsic
E00 E01 E02 E03
E10 E11 E12 E13
E20 E21 E22 E23
E30 E31 E32 E33

intrinsic
K00 K01 K02
K10 K11 K12
K20 K21 K22

DEPTH_MIN DEPTH_MAX

pair.txt stores the view selection result. For each reference image, 10 best source views are stored in the file:

TOTAL_IMAGE_NUM
IMAGE_ID0                       # index of reference image 0 
10 ID0 SCORE0 ID1 SCORE1 ...    # 10 best source images for reference image 0 
IMAGE_ID1                       # index of reference image 1
10 ID0 SCORE0 ID1 SCORE1 ...    # 10 best source images for reference image 1 
...

Evaluation on DTU:

For DTU's evaluation set, first download our processed camera parameters from here. Unzip it and replace all the old camera files in the folders cams_1 with new files for all the scans.
In eval_dtu.sh, set DTU_TESTING as the root directory of corresponding dataset, set --outdir as the directory to store the reconstructed point clouds.
CKPT_FILE is the path of checkpoint file (default as our pretrained model which is trained on DTU, the path is checkpoints/dtu/model_000015.ckpt).
Test on GPU by running bash eval_dtu.sh. The code includes depth map estimation and depth fusion. The outputs are the point clouds in ply format.
For quantitative evaluation, download SampleSet and Points from DTU's website. Unzip them and place Points folder in SampleSet/MVS Data/. The structure looks like:

SampleSet
├──MVS Data
      └──Points

In evaluations/dtu/BaseEvalMain_web.m, set dataPath as the path to SampleSet/MVS Data/, plyPath as directory that stores the reconstructed point clouds and resultsPath as directory to store the evaluation results. Then run evaluations/dtu/BaseEvalMain_web.m in matlab.

The results look like:

Acc. (mm)	Comp. (mm)	Overall (mm)
0.373	0.354	0.363

Evaluation on Tansk & Temples:

In eval_tanks.sh, set TANK_TESTING as the root directory of the dataset and --outdir as the directory to store the reconstructed point clouds.
CKPT_FILE is the path of checkpoint file (default as our pretrained model which is trained on DTU, the path is checkpoints/dtu/model_000015.ckpt). We also provide our pretrained model trained on BlendedMVS (checkpoints/blendedmvs/model_000015.ckpt)
Test on GPU by running bash eval_tanks.sh. The code includes depth map estimation and depth fusion. The outputs are the point clouds in ply format.
For our detailed quantitative results on Tanks & Temples, please check the leaderboards (Tanks & Temples: trained on DTU, Tanks & Temples: trained on BlendedMVS).

Evaluation on ETH3D:

In eval_eth.sh, set ETH3D_TESTING as the root directory of the dataset and --outdir as the directory to store the reconstructed point clouds.
CKPT_FILE is the path of checkpoint file (default as our pretrained model which is trained on DTU, the path is checkpoints/dtu/model_000015.ckpt). We also provide our pretrained model trained on BlendedMVS (checkpoints/blendedmvs/model_000015.ckpt)
Test on GPU by running bash eval_eth.sh. The code includes depth map estimation and depth fusion. The outputs are the point clouds in ply format.
For our detailed quantitative results on ETH3D, please check the leaderboards (ETH3D: trained on DTU, ETH3D: trained on BlendedMVS).

Evaluation on custom dataset:

We support preparing the custom dataset from COLMAP's results. The script colmap_input.py (modified based on the script from MVSNet) converts COLMAP's sparse reconstruction results into the same format as the datasets that we provide.
Test on GPU by running bash eval_custom.sh.

Training

DTU

Download pre-processed DTU's training set (provided by PatchmatchNet). The dataset is already organized as follows:

root_directory
├──Cameras_1
├──Rectified
└──Depths_raw

Download our processed camera parameters from here. Unzip all the camera folders into root_directory/Cameras_1.
In train_dtu.sh, set MVS_TRAINING as the root directory of dataset; set --logdir as the directory to store the checkpoints.
Train the model by running bash train_dtu.sh.

BlendedMVS

Download the dataset.
In train_blend.sh, set MVS_TRAINING as the root directory of dataset; set --logdir as the directory to store the checkpoints.
Train the model by running bash train_blend.sh.

Acknowledgements

Thanks to Yao Yao for opening source of his excellent work MVSNet. Thanks to Xiaoyang Guo for opening source of his PyTorch implementation of MVSNet MVSNet-pytorch.

Official source code of paper 'IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo'

Related tags

Overview

IterMVS

Introduction

Installation

Requirements

Reproducing Results

Evaluation on DTU:

Evaluation on Tansk & Temples:

Evaluation on ETH3D:

Evaluation on custom dataset:

Training

DTU

BlendedMVS

Acknowledgements

Owner

Fangjinhua Wang

Official implementation of "Articulation Aware Canonical Surface Mapping"

Opinionated code formatter, just like Python's black code formatter but for Beancount

Group Fisher Pruning for Practical Network Compression(ICML2021)

Spatiotemporal resampling methods for mlr3

3D Generative Adversarial Network

This repository contains code to train and render Mixture of Volumetric Primitives (MVP) models

Dataset and codebase for NeurIPS 2021 paper: Exploring Forensic Dental Identification with Deep Learning

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

Emotional conditioned music generation using transformer-based model.

Temporal Knowledge Graph Reasoning Triggered by Memories

PyTorch implementation of CVPR 2020 paper (Reference-Based Sketch Image Colorization using Augmented-Self Reference and Dense Semantic Correspondence) and pre-trained model on ImageNet dataset

The codes and related files to reproduce the results for Image Similarity Challenge Track 1.

This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning"

This repository contains the code for designing risk bounded motion plans for car-like robot using Carla Simulator.

This library contains a Tensorflow implementation of the paper Stability Analysis of Unfolded WMMSE for Power Allocation

Python package for multiple object tracking research with focus on laboratory animals tracking.

Code for "Finding Regions of Heterogeneity in Decision-Making via Expected Conditional Covariance" at NeurIPS 2021

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers.

[ECCV 2020] Reimplementation of 3DDFAv2, including face mesh, head pose, landmarks, and more.

Dynamic Realtime Animation Control