PyTorch implementation of our ICCV2021 paper: StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation

StructDepth

Boying Li*, Yuan Huang*, Zeyu Liu, Danping Zou, Wenxian Yu

(* Equal Contribution)

Please consider citing our paper in your publications if the project helps your research.

@inproceedings{structdepth,
  title={StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation},
  author={Li, Boying and Huang, Yuan and Liu, Zeyu and Zou, Danping and Yu, Wenxian},
  booktitle={Proceedings of the IEEE International Conference on Computer Vision},
  year={2021}
}

Getting Started

Installation

The Python and PyTorch versions we use:

python=3.6

pytorch=1.7.1=py3.6_cuda10.1.243_cudnn7.6.3_0

Step 1: Create a virtual environment

conda create -n struct_depth python=3.6
conda activate struct_depth
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch

Step 2: Download the modified scikit-image package, in which the input parameters of the Felzenszwalb algorithm have been changed to accommodate our method, then build and install it.

unzip scikit-image-0.17.2.zip
cd scikit-image-0.17.2
python setup.py build_ext -i
pip install -e .

Step 3: Install the other packages

pip install -r requirements.txt
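
After the three steps, it can be useful to confirm that the environment matches the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `base_version`, `find_mismatches`, and `installed_versions` are hypothetical, and the check only compares the `__version__` strings of `torch` and `torchvision` against the versions given in Step 1.

```python
import importlib
import re

# Versions stated in the installation notes above.
EXPECTED = {"torch": "1.7.1", "torchvision": "0.8.2"}


def base_version(version):
    """Strip local build tags such as '+cu101' and keep the numeric part."""
    match = re.match(r"\d+(\.\d+)*", version)
    return match.group(0) if match else version


def find_mismatches(installed, expected=EXPECTED):
    """Return {package: (installed, expected)} for every package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        found = installed.get(name)
        if found is None or base_version(found) != wanted:
            mismatches[name] = (found, wanted)
    return mismatches


def installed_versions(names):
    """Import each package if possible and read its __version__ attribute."""
    versions = {}
    for name in names:
        try:
            versions[name] = importlib.import_module(name).__version__
        except ImportError:
            versions[name] = None  # package is not installed
    return versions


if __name__ == "__main__":
    problems = find_mismatches(installed_versions(EXPECTED))
    for name, (found, wanted) in problems.items():
        print(f"{name}: found {found}, expected {wanted}")
    if not problems:
        print("Environment matches the versions listed above.")
```

Running the script inside the `struct_depth` environment prints one line per mismatching package, or a confirmation when both versions agree.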

Download pretrained model

Please download the pretrained models and unzip them to MODEL_PATH.

Inference on a single image

python inference_single_image.py --image_path=/path/to/image --load_weights_folder=MODEL_PATH

Evaluation

Download test dataset

Please download the test dataset.

We recommend unpacking all test and training data into the same directory and then setting DATA_PATH accordingly when running a training or evaluation script.

Evaluate NYUv2/InteriorNet/ScanNet depth or surface normals

Modify the evaluation script in eval.sh to evaluate depth and surface normals on NYUv2, InteriorNet, or ScanNet separately.

python evaluation/nyuv2_eval_norm.py \
  --data_path DATA_PATH \
  --load_weights_folder MODEL_PATH
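
For readers unfamiliar with depth evaluation, the sketch below shows three metrics that are commonly reported for monocular depth estimation: absolute relative error, RMSE, and the accuracy under the threshold δ < 1.25. It is an illustration under stated assumptions, not the repository's evaluation code: the function name `depth_metrics` is hypothetical, the inputs are flat sequences of depths in the same unit, and steps such as median scaling or depth clipping that an evaluation script may apply are left out.

```python
import math


def depth_metrics(pred, gt):
    """Compute abs rel, RMSE and delta < 1.25 accuracy over valid pixels.

    pred and gt are flat sequences of depths in the same unit. Pixels whose
    ground-truth depth is not positive are treated as invalid and ignored.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth pixels")
    n = len(pairs)
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # max(p/g, g/p) < 1.25 is equivalent to 1/1.25 < p/g < 1.25 for p > 0;
    # this form avoids dividing by a zero prediction.
    delta1 = sum(1 for p, g in pairs if 1 / 1.25 < p / g < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


if __name__ == "__main__":
    # Toy values; the last pixel has no ground truth and is skipped.
    print(depth_metrics([1.0, 2.0, 4.0, 3.0], [1.0, 2.0, 2.0, 0.0]))
```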

Training

Download NYU V2 dataset

The raw NYU dataset is about 400 GB and contains 590 videos. You can download the raw dataset from there.

Extract main directions

python extract_vps_nyu.py --data_path DATA_PATH --output_dir VPS_PATH --failed_list TMP_LIST --thresh 60

If you need to train with random flipping, run the main direction extraction script in advance on both the original and the flipped images (add --flip for the latter), and record the samples for which extraction fails. These samples can be skipped during training; see the code in datasets/nyu_datases.py.
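
One way to skip failed samples is to filter them out of the training split before training. The sketch below is a hypothetical illustration, not the logic in the repository's dataset code: it assumes the split file lists one sample per line and that the failed list identifies samples with exactly the same strings, neither of which is confirmed here. The helper names `remove_failed` and `write_filtered_split` are made up for this example.

```python
def remove_failed(samples, failed):
    """Return the samples that do not appear in the failed list."""
    failed_set = {line.strip() for line in failed if line.strip()}
    stripped = (sample.strip() for sample in samples)
    return [sample for sample in stripped if sample and sample not in failed_set]


def write_filtered_split(split_path, failed_path, output_path):
    """Read a split file and a failed list, then write the filtered split.

    Returns the number of samples before and after filtering.
    """
    with open(split_path) as f:
        samples = f.read().splitlines()
    with open(failed_path) as f:
        failed = f.read().splitlines()
    kept = remove_failed(samples, failed)
    with open(output_path, "w") as f:
        f.write("\n".join(kept) + "\n")
    return len(samples), len(kept)


if __name__ == "__main__":
    # Toy sample names, for illustration only.
    split = ["scene_a/0", "scene_a/10", "scene_b/0"]
    print(remove_failed(split, ["scene_a/10"]))
```

When training with random flipping, the failed lists from the original and the flipped runs could be concatenated before filtering, so that a sample is kept only if extraction succeeded in both cases.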

Training

Modify the training script train.sh to set the paths or to use different training settings.

python train.py \
  --data_path DATA_PATH \
  --val_path DATA_PATH \
  --train_split ./splits/nyu_train_0_10_20_30_40_21483-exceptfailed-21465.txt \
  --vps_path VPS_PATH \
  --log_dir LOG_PATH \
  --model_name 1 \
  --batch_size 32 \
  --num_epochs 50 \
  --start_epoch 0 \
  --using_disp2seg \
  --using_normloss \
  --load_weights_folder PRETRAIN_MODEL_PATH \
  --lambda_planar_reg 0.1 \
  --lambda_norm_reg 0.05 \
  --planar_thresh 200
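
If you prefer to keep the settings in one place rather than editing train.sh, the command above can also be assembled from Python. This is a minimal sketch with a hypothetical helper, `build_train_command`; it assumes that `--using_disp2seg` and `--using_normloss` are boolean switches that take no value, as the command above suggests. Only a few of the options are shown, and the placeholder paths are kept as in the command above.

```python
import shlex


def build_train_command(options, flags=()):
    """Turn option values and boolean flags into an argument list."""
    argv = ["python", "train.py"]
    for name, value in options.items():
        argv += [f"--{name}", str(value)]
    argv += [f"--{flag}" for flag in flags]
    return argv


# Placeholders such as DATA_PATH must be replaced with real paths.
settings = {
    "data_path": "DATA_PATH",
    "vps_path": "VPS_PATH",
    "log_dir": "LOG_PATH",
    "batch_size": 32,
    "num_epochs": 50,
    "lambda_planar_reg": 0.1,
    "lambda_norm_reg": 0.05,
    "planar_thresh": 200,
}

command = build_train_command(settings, flags=["using_disp2seg", "using_normloss"])

if __name__ == "__main__":
    # Print the command for inspection instead of running it.
    print(" ".join(shlex.quote(part) for part in command))
```

To launch training from the same script, the list can be passed to `subprocess.run(command, check=True)`.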

Acknowledgement

We borrowed a lot of code from scikit-image, monodepth2, P2Net, and LEGO. Thanks for their excellent work!

Owner
SJTU-ViSYS
Vision and Intelligent System Group