PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Overview

PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility

Jae Yong Lee, Joseph DeGol, Chuhang Zou, Derek Hoiem

Installation

To install necessary python package for our work:

conda install pytorch torchvision numpy matplotlib pandas tqdm tensorboard cudatoolkit=11.1 -c pytorch -c conda-forge
pip install opencv-python tabulate moviepy openpyxl pyntcloud open3d==0.9 pytorch-lightning==1.4.9

To setup dataset for training for our work, please download:

To setup dataset for testing, please use:

  • ETH3D High-Res (PatchMatchNet pre-processed sets)
    • NOTE: We use our own script to pre-process. We are currently preparing code for the script. We will post update once it is available.
  • Tanks and Temples (MVSNet pre-processed sets)

Training

To train out method:

python bin/train.py --experiment_name=EXPERIMENT_NAME \
                    --log_path=TENSORBOARD_LOG_PATH \
                    --checkpoint_path=CHECKPOINT_PATH \
                    --dataset_path=ROOT_PATH_TO_DATA \
                    --dataset={BlendedMVS,DTU} \
                    --resume=True # if want to resume training with the same experiment_name

Testing

To test our method, we need two scripts. First script to generate geometetry, and the second script to fuse the geometry. Geometry generation code:

python bin/generate.py --experiment_name=EXPERIMENT_USED_FOR_TRAINING \
                       --checkpoint_path=CHECKPOINT_PATH \
                       --epoch_id=EPOCH_ID \
                       --num_views=NUMBER_OF_VIEWS \
                       --dataset_path=ROOT_PATH_TO_DATA \
                       --output_path=PATH_TO_OUTPUT_GEOMETRY \
                       --width=(optional)WIDTH \
                       --height=(optional)HEIGHT \
                       --dataset={ETH3DHR, TanksAndTemples} \
                       --device=DEVICE

This will generate depths / normals / images into the folder specified by --output_path. To be more precise:

OUTPUT_PATH/
    EXPERIMENT_NAME/
        CHECKPOINT_FILE_NAME/
            SCENE_NAME/
                000000_camera.pth <-- contains intrinsics / extrinsics
                000000_depth_map.pth
                000000_normal_map.pth
                000000_meta.pth <-- contains src_image ids
                ...

Once the geometries are generated, we can use the fusion code to fuse them into point cloud: GPU Fusion code:

python bin/fuse_output.py --output_path=OUTPUT_PATH_USED_IN_GENERATE.py
                          --experiment_name=EXPERIMENT_NAME \
                          --epoch_id=EPOCH_ID \
                          --dataset=DATASET \
                          # fusion related args
                          --proj_th=PROJECTION_DISTANCE_THRESHOLD \
                          --dist_th=DISTANCE_THRESHOLD \
                          --angle_th=ANGLE_THRESHOLD \
                          --num_consistent=NUM_CONSITENT_IMAGES \
                          --target_width=(Optional) target image width for fusion \
                          --target_height=(Optional) target image height for fusion \
                          --device=DEVICE \

The target width / height are useful for fusing depth / normal after upsampling.

We also provide ETH3D testing script:

python bin/evaluate_eth3d.py --eth3d_binary_path=PATH_TO_BINARY_EXE \
                             --eth3d_gt_path=PATH_TO_GT_MLP_FOLDER \
                             --output_path=PATH_TO_FOLDER_WITH_POINTCLOUDS \
                             --experiment_name=NAME_OF_EXPERIMENT \
                             --epoch_id=EPOCH_OF_CHECKPOINT_TO_LOAD (default last.ckpt)

Resources

Citation

If you want to use our work in your project, please cite:

@InProceedings{lee2021patchmatchrl,
    author    = {Lee, Jae Yong and DeGol, Joseph and Zou, Chuhang and Hoiem, Derek},
    title     = {PatchMatch-RL: Deep MVS with Pixelwise Depth, Normal, and Visibility},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision},
    month     = {October},
    year      = {2021}
}
Train a state-of-the-art yolov3 object detector from scratch!

TrainYourOwnYOLO: Building a Custom Object Detector from Scratch This repo let's you train a custom image detector using the state-of-the-art YOLOv3 c

AntonMu 616 Jan 08, 2023
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

Xcessiv Xcessiv is a tool to help you create the biggest, craziest, and most excessive stacked ensembles you can think of. Stacked ensembles are simpl

Reiichiro Nakano 1.3k Nov 17, 2022
Implementation of Shape Generation and Completion Through Point-Voxel Diffusion

Shape Generation and Completion Through Point-Voxel Diffusion Project | Paper Implementation of Shape Generation and Completion Through Point-Voxel Di

Linqi Zhou 103 Dec 29, 2022
Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Repository for scripts and notebooks from the book: Programming PyTorch for Deep Learning

Ian Pointer 368 Dec 17, 2022
On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation

On Nonlinear Latent Transformations for GAN-based Image Editing - PyTorch implementation On Nonlinear Latent Transformations for GAN-based Image Editi

Valentin Khrulkov 22 Oct 24, 2022
OBBDetection is a oriented object detection library, which is based on MMdetection.

OBBDetection news: We are now updating OBBDetection to new vision based on MMdetection v2.10, which has more advanced models and more efficient featur

jbwang1997 401 Jan 02, 2023
An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym

gym-idsgame An Abstract Cyber Security Simulation and Markov Game for OpenAI Gym gym-idsgame is a reinforcement learning environment for simulating at

Kim Hammar 29 Dec 03, 2022
PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Study-CSRNet-pytorch This is the PyTorch version repo for CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

0 Mar 01, 2022
Code for Efficient Visual Pretraining with Contrastive Detection

Code for DetCon This repository contains code for the ICCV 2021 paper "Efficient Visual Pretraining with Contrastive Detection" by Olivier J. Hénaff,

DeepMind 56 Nov 13, 2022
Code for "Hierarchical Skills for Efficient Exploration" HSD-3 Algorithm and Baselines

Hierarchical Skills for Efficient Exploration This is the source code release for the paper Hierarchical Skills for Efficient Exploration. It contains

Facebook Research 38 Dec 06, 2022
Open source hardware and software platform to build a small scale self driving car.

Donkeycar is minimalist and modular self driving library for Python. It is developed for hobbyists and students with a focus on allowing fast experimentation and easy community contributions.

Autorope 2.4k Jan 04, 2023
Cleaned up code for DSTC 10: SIMMC 2.0 track: subtask 2: multimodal coreference resolution

UNITER-Based Situated Coreference Resolution with Rich Multimodal Input: arXiv MMCoref_cleaned Code for the MMCoref task of the SIMMC 2.0 dataset. Pre

Yichen (William) Huang 2 Dec 05, 2022
DeepFashion2 is a comprehensive fashion dataset.

DeepFashion2 Dataset DeepFashion2 is a comprehensive fashion dataset. It contains 491K diverse images of 13 popular clothing categories from both comm

switchnorm 1.8k Jan 07, 2023
Code for the paper "Reinforced Active Learning for Image Segmentation"

Reinforced Active Learning for Image Segmentation (RALIS) Code for the paper Reinforced Active Learning for Image Segmentation Dependencies python 3.6

Arantxa Casanova 79 Dec 19, 2022
Source code of D-HAN: Dynamic News Recommendation with Hierarchical Attention Network

D-HAN The source code of D-HAN This is the source code of D-HAN: Dynamic News Recommendation with Hierarchical Attention Network. However, only the co

30 Sep 22, 2022
Cross-Modal Contrastive Learning for Text-to-Image Generation

Cross-Modal Contrastive Learning for Text-to-Image Generation This repository hosts the open source JAX implementation of XMC-GAN. Setup instructions

Google Research 94 Nov 12, 2022
Txt2Xml tool will help you convert from txt COCO format to VOC xml format in Object Detection Problem.

TXT 2 XML All codes assume running from root directory. Please update the sys path at the beginning of the codes before running. Over View Txt2Xml too

Nguyễn Trường Lâu 4 Nov 24, 2022
DiffQ performs differentiable quantization using pseudo quantization noise. It can automatically tune the number of bits used per weight or group of weights, in order to achieve a given trade-off between model size and accuracy.

Differentiable Model Compression via Pseudo Quantization Noise DiffQ performs differentiable quantization using pseudo quantization noise. It can auto

Facebook Research 145 Dec 30, 2022
Yolo algorithm for detection + centroid tracker to track vehicles

Vehicle Tracking using Centroid tracker Algorithm used : Yolo algorithm for detection + centroid tracker to track vehicles Backend : opencv and python

6 Dec 21, 2022
This is a deep learning-based method to segment deep brain structures and a brain mask from T1 weighted MRI.

DBSegment This tool generates 30 deep brain structures segmentation, as well as a brain mask from T1-Weighted MRI. The whole procedure should take ~1

Luxembourg Neuroimaging (Platform OpNeuroImg) 2 Oct 25, 2022