A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Last update: Nov 17, 2022

Related tags

Deep Learning idn-solver

Overview

idn-solver

Paper | Project Page

This repository contains the code release of our ICCV 2021 paper:

A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Wang Zhao*, Shaohui Liu*, Yi Wei, Hengkai Guo, Yong-Jin Liu

Installation

We recommend to use conda to setup a specified environment. Run

conda env create -f environment.yml

Test on a sequence

First download the pretrained model from here and put it under ./pretrain/ folder.

Prepare the sequence data with color images, camera poses (4x4 cam2world transformation) and intrinsics. The sequence data structure should be like:

sequence_name
  | color
      | 00000.jpg
  | pose
      | 00000.txt
  | K.txt

Run the following command to get the outputs:

python infer_folder.py --seq_dir /path/to/the/sequence/data --output_dir /path/to/save/outputs --config ./configs/test_folder.yaml

Tune the "reference gap" parameter to make sure there are sufficient overlaps and camera translations within an image pair. For ScanNet-like sequence, we recommend to use reference_gap of 20.

Test on ScanNet

Prepare ScanNet test split data

Download the ScanNet test split data from the official site and pre-process the data using:

python ./data/preprocess.py --data_dir /path/to/scannet/test/split/ --output_dir /path/to/save/pre-processed/scannet/test/data

This includes 1. resize the color images to 480x640 resolution 2. sample the data with interval of 20

Run evaluation

python eval_scannet.py --data_dir /path/to/processed/scannet/test/split/ --config ./configs/test_scannet.yaml

Train

Prepare ScanNet training data

We use the pre-processed ScanNet data from NAS, you could download the data using this link. The data structure is like:

scannet
  | scannet_nas
    | train
      | scene0000_00
          | color
            | 0000.jpg
          | pose
            | 0000.txt
          | depth
            | 0000.npy
          | intrinsic
          | normal
            | 0000_normal.npy
    | val
  | scans_test_sample (preprocessed ScanNet test split)

Run training

Modify the "dataset_path" variable with yours in the config yaml.

The network is trained with a two-stage strategy. The whole training process takes ~6 days with 4 Nvidia V100 GPUs.

python train.py ./configs/scannet_stage1.yaml
python train.py ./configs/scannet_stage2.yaml

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Zhao_2021_ICCV,
    author    = {Zhao, Wang and Liu, Shaohui and Wei, Yi and Guo, Hengkai and Liu, Yong-Jin},
    title     = {A Confidence-Based Iterative Solver of Depths and Surface Normals for Deep Multi-View Stereo},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {6168-6177}
}

Acknowledgement

This project heavily relies codes from NAS and we thank the authors for releasing their code.

We also thank Xiaoxiao Long for kindly helping with ScanNet evaluations.

A Confidence-based Iterative Solver of Depths and Surface Normals for Deep Multi-view Stereo

Related tags

Overview

idn-solver

Installation

Test on a sequence

Test on ScanNet

Prepare ScanNet test split data

Run evaluation

Train

Prepare ScanNet training data

Run training

Citation

Acknowledgement

Owner

zhaowang

Official Tensorflow implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

An efficient toolkit for Face Stylization based on the paper "AgileGAN: Stylizing Portraits by Inversion-Consistent Transfer Learning"

Replication attempt for the Protein Folding Model

Gradient-free global optimization algorithm for multidimensional functions based on the low rank tensor train format

Unofficial Implementation of RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series (AAAI 2019)

FLAVR is a fast, flow-free frame interpolation method capable of single shot multi-frame prediction

Freecodecamp Scientific Computing with Python Certification; Solution for Challenge 2: Time Calculator

領域を指定し、キーを入力することで画像を保存するツールです。クラス分類用のデータセット作成を想定しています。

Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge

Official PyTorch code for Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling (HCFlow, ICCV2021)

Python Library for Signal/Image Data Analysis with Transport Methods

Signals-backend - A suite of card games written in Python

Franka Emika Panda manipulator kinematics&dynamics simulation

Facilitating Database Tuning with Hyper-ParameterOptimization: A Comprehensive Experimental Evaluation

Learning RGB-D Feature Embeddings for Unseen Object Instance Segmentation

Abstractive opinion summarization system (SelSum) and the largest dataset of Amazon product summaries (AmaSum). EMNLP 2021 conference paper.

This repo includes the supplementary of our paper "CEMENT: Incomplete Multi-View Weak-Label Learning with Long-Tailed Labels"

Repository for Multimodal AutoML Benchmark

It's like Shape Editor in Maya but works with skeletons (transforms).

The code uses SegFormer for Semantic Segmentation on Drone Dataset.