RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching

Overview

RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching

This repository contains the source code for our paper:

RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching
Lahav Lipson, Zachary Teed and Jia Deng

@article{lipson2021raft,
  title={{RAFT-Stereo: Multilevel Recurrent Field Transforms for Stereo Matching}},
  author={Lipson, Lahav and Teed, Zachary and Deng, Jia},
  journal={arXiv preprint arXiv:2109.07547},
  year={2021}
}

Requirements

The code has been tested with PyTorch 1.7 and Cuda 10.2.

conda env create -f environment.yaml
conda activate raftstereo

Required Data

To evaluate/train RAFT-stereo, you will need to download the required datasets.

To download the ETH3D and Middlebury test datasets for the demos, run

chmod ug+x download_datasets.sh && ./download_datasets.sh

By default stereo_datasets.py will search for the datasets in these locations. You can create symbolic links to wherever the datasets were downloaded in the datasets folder

├── datasets
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Monkaa
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── Driving
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── disparity
    ├── KITTI
        ├── testing
        ├── training
        ├── devkit
    ├── Middlebury
        ├── MiddEval3
    ├── ETH3D
        ├── lakeside_1l
        ├── ...
        ├── tunnel_3s

Demos

Pretrained models can be downloaded by running

chmod ug+x download_models.sh && ./download_models.sh

or downloaded from google drive

You can demo a trained model on pairs of images. To predict stereo for Middlebury, run

python demo.py --restore_ckpt models/raftstereo-sceneflow.pth

Or for ETH3D:

python demo.py --restore_ckpt models/raftstereo-eth3d.pth -l=datasets/ETH3D/*/im0.png -r=datasets/ETH3D/*/im1.png

Using our fastest model:

python demo.py --restore_ckpt models/raftstereo-realtime.pth  --shared_backbone --n_downsample 3 --n_gru_layers 2 --slow_fast_gru 

To save the disparity values as .npy files, run any of the demos with the --save_numpy flag.

Converting Disparity to Depth

If the camera focal length and camera baseline are known, disparity predictions can be converted to depth values using

Note that the units of the focal length are pixels not millimeters.

Evaluation

To evaluate a trained model on a validation set (e.g. Middlebury), run

python evaluate_stereo.py --restore_ckpt models/raftstereo-middlebury.pth --dataset middlebury_H

Training

Our model is trained on two RTX-6000 GPUs using the following command. Training logs will be written to runs/ which can be visualized using tensorboard.

python train_stereo.py --batch_size 8 --train_iters 22 --valid_iters 32 --spatial_scale -0.2 0.4 --saturation_range 0 1.4 --n_downsample 2 --num_steps 200000 --mixed_precision

To train using significantly less memory, change --n_downsample 2 to --n_downsample 3. This will slightly reduce accuracy.

(Optional) Faster Implementation

We provide a faster CUDA implementation of the correlation volume which works with mixed precision feature maps.

cd sampler && python setup.py install && cd ..

Running demo.py, train_stereo.py or evaluate.py with --corr_implementation reg_cuda together with --mixed_precision will speed up the model without impacting performance.

To significantly decrease memory consumption on high resolution images, use --corr_implementation alt. This implementation is slower than the default, however.

Owner
Princeton Vision & Learning Lab
Princeton Vision & Learning Lab
CrossNorm and SelfNorm for Generalization under Distribution Shifts (ICCV 2021)

CrossNorm (CN) and SelfNorm (SN) (Accepted at ICCV 2021) This is the official PyTorch implementation of our CNSN paper, in which we propose CrossNorm

100 Dec 28, 2022
Code implementation of Data Efficient Stagewise Knowledge Distillation paper.

Data Efficient Stagewise Knowledge Distillation Table of Contents Data Efficient Stagewise Knowledge Distillation Table of Contents Requirements Image

IvLabs 112 Dec 02, 2022
Implementation for Panoptic-PolarNet (CVPR 2021)

Panoptic-PolarNet This is the official implementation of Panoptic-PolarNet. [ArXiv paper] Introduction Panoptic-PolarNet is a fast and robust LiDAR po

Zixiang Zhou 126 Jan 01, 2023
The dataset of tweets pulling from Twitters with keyword: Hydroxychloroquine, location: US, Time: 2020

HCQ_Tweet_Dataset: FREE to Download. Keywords: HCQ, hydroxychloroquine, tweet, twitter, COVID-19 This dataset is associated with the paper "Understand

2 Mar 16, 2022
CNN visualization tool in TensorFlow

tf_cnnvis A blog post describing the library: https://medium.com/@falaktheoptimist/want-to-look-inside-your-cnn-we-have-just-the-right-tool-for-you-ad

InFoCusp 778 Jan 02, 2023
Apply a perspective transformation to a raster image inside Inkscape (no need to use an external software such as GIMP or Krita).

Raster Perspective Apply a perspective transformation to bitmap image using the selected path as envelope, without the need to use an external softwar

s.ouchene 19 Dec 22, 2022
This repository is for DSA and CP scripts for reference.

dsa-script-collections This Repo is the collection of DSA and CP scripts for reference. Contents Python Bubble Sort Insertion Sort Merge Sort Quick So

Aditya Kumar Pandey 9 Nov 22, 2022
Fight Recognition from Still Images in the Wild @ WACVW2022, Real-world Surveillance Workshop

Fight Detection from Still Images in the Wild Detecting fights from still images is an important task required to limit the distribution of social med

Şeymanur Aktı 10 Nov 09, 2022
hySLAM is a hybrid SLAM/SfM system designed for mapping

HySLAM Overview hySLAM is a hybrid SLAM/SfM system designed for mapping. The system is based on ORB-SLAM2 with some modifications and refactoring. Raú

Brian Hopkinson 15 Oct 10, 2022
curl-impersonate: A special compilation of curl that makes it impersonate Chrome & Firefox

curl-impersonate A special compilation of curl that makes it impersonate real browsers. It can impersonate the four major browsers: Chrome, Edge, Safa

lwthiker 1.9k Jan 03, 2023
This is the code for the paper "Motion-Focused Contrastive Learning of Video Representations" (ICCV'21).

Motion-Focused Contrastive Learning of Video Representations Introduction This is the code for the paper "Motion-Focused Contrastive Learning of Video

11 Sep 23, 2022
Sionna: An Open-Source Library for Next-Generation Physical Layer Research

Sionna: An Open-Source Library for Next-Generation Physical Layer Research Sionna™ is an open-source Python library for link-level simulations of digi

NVIDIA Research Projects 313 Dec 22, 2022
Synthesizing Long-Term 3D Human Motion and Interaction in 3D in CVPR2021

Long-term-Motion-in-3D-Scenes This is an implementation of the CVPR'21 paper "Synthesizing Long-Term 3D Human Motion and Interaction in 3D". Please ch

Jiashun Wang 76 Dec 13, 2022
A Python type explainer!

typesplainer A Python typehint explainer! Available as a cli, as a website, as a vscode extension, as a vim extension Usage First, install the package

Typesplainer 79 Dec 01, 2022
simple demo codes for Learning to Teach with Dynamic Loss Functions

Learning to Teach with Dynamic Loss Functions This repo contains the simple demo for the NeurIPS-18 paper: Learning to Teach with Dynamic Loss Functio

Lijun Wu 15 Dec 30, 2021
Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection"

Official code for paper "ISNet: Costless and Implicit Image Segmentation for Deep Classifiers, with Application in COVID-19 Detection". LRPDenseNet.py

Pedro Ricardo Ariel Salvador Bassi 2 Sep 21, 2022
Deep Learning Models for Causal Inference

Extensive tutorials for learning how to build deep learning models for causal inference using selection on observables in Tensorflow 2.

Bernard J Koch 151 Dec 31, 2022
Its a Plant Leaf Disease Detection System based on Machine Learning.

My_Project_Code Its a Plant Leaf Disease Detection System based on Machine Learning. I have used Tomato Leaves Dataset from kaggle. This system detect

Sanskriti Sidola 3 Jun 15, 2022
Generating Images with Recurrent Adversarial Networks

Generating Images with Recurrent Adversarial Networks Python (Theano) implementation of Generating Images with Recurrent Adversarial Networks code pro

Daniel Jiwoong Im 121 Sep 08, 2022
Cards Against Humanity AI

cah-ai This is a Cards Against Humanity AI implemented using a pre-trained Semantic Search model. How it works A player is described by a combination

Alex Nichol 2 Aug 22, 2022