Extreme Rotation Estimation using Dense Correlation Volumes

Overview

Extreme Rotation Estimation using Dense Correlation Volumes

This repository contains a PyTorch implementation of the paper:

Extreme Rotation Estimation using Dense Correlation Volumes [Project page] [Arxiv]

Ruojin Cai, Bharath Hariharan, Noah Snavely, Hadar Averbuch-Elor

CVPR 2021

Introduction

We present a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap. We observe that, even when images do not overlap, there may be rich hidden cues as to their geometric relationship, such as light source directions, vanishing points, and symmetries present in the scene. We propose a network design that can automatically learn such implicit cues by comparing all pairs of points between the two input images. Our method therefore constructs dense feature correlation volumes and processes these to predict relative 3D rotations. Our predictions are formed over a fine-grained discretization of rotations, bypassing difficulties associated with regressing 3D rotations. We demonstrate our approach on a large variety of extreme RGB image pairs, including indoor and outdoor images captured under different lighting conditions and geographic locations. Our evaluation shows that our model can successfully estimate relative rotations among non-overlapping images without compromising performance over overlapping image pairs.

Overview of our Method:

Overview

Given a pair of images, a shared-weight Siamese encoder extracts feature maps. We compute a 4D correlation volume using the inner product of features, from which our model predicts the relative rotation (here, as distributions over Euler angles).

Dependencies

# Create conda environment with python 3.6, torch 1.3.1 and CUDA 10.0
conda env create -f ./tools/environment.yml
conda activate rota

Dataset

Perspective images are randomly sampled from panoramas with a resolution of 256 × 256 and a 90◦ FoV. We sample images distributed uniformly over the range of [−180, 180] for yaw angles. To avoid generating textureless images that focus on the ceiling/sky or the floor, we limit the range over pitch angles to [−30◦, 30◦] for the indoor datasets and [−45◦, 45◦] for the outdoor dataset.

Download InteriorNet, SUN360, and StreetLearn datasets to obtain the full panoramas.

Metadata files about the training and test image pairs are available in the following google drive: link. Download the metadata.zip file, unzip it and put it under the project root directory.

We base on this MATLAB Toolbox that extracts perspective images from an input panorama. Before running PanoBasic/pano2perspective_script.m, you need to modify the path to the datasets and metadata files in the script.

Pretrained Model

Pretrained models are be available in the following google drive: link. To use the pretrained models, download the pretrained.zip file, unzip it and put it under the project root directory.

Testing the pretrained model:

The following commands test the performance of the pre-trained models in the rotation estimation task. The commands output the mean and median geodesic error, and the percentage of pairs with a relative rotation error under 10◦ for different levels of overlap on the test set.

# Usage:
# python test.py <config> --pretrained <checkpoint_filename>

python test.py configs/sun360/sun360_cv_distribution.yaml \
    --pretrained pretrained/sun360_cv_distribution.pt

python test.py configs/interiornet/interiornet_cv_distribution.yaml \
    --pretrained pretrained/interiornet_cv_distribution.pt

python test.py configs/interiornetT/interiornetT_cv_distribution.yaml \
    --pretrained pretrained/interiornetT_cv_distribution.pt

python test.py configs/streetlearn/streetlearn_cv_distribution.yaml \
    --pretrained pretrained/streetlearn_cv_distribution.pt

python test.py configs/streetlearnT/streetlearnT_cv_distribution.yaml \
    --pretrained pretrained/streetlearnT_cv_distribution.pt

Rotation estimation evaluation of the pretrained models is as follows:

InteriorNet InteriorNet-T SUM360 StreetLearn StreetLearn-T
Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10° Avg(°) Med(°) 10°
Large 1.82 0.88 98.76% 8.86 1.86 93.13% 1.37 1.09 99.51% 1.38 1.12 100.00% 24.98 2.50 78.95%
Small 4.31 1.16 96.58% 30.43 2.63 74.07% 6.13 1.77 95.86% 3.25 1.41 98.34% 27.84 3.19 74.76%
None 37.69 3.15 61.97% 49.44 4.17 58.36% 34.92 4.43 61.39% 5.46 1.65 96.60% 32.43 3.64 72.69%
All 13.49 1.18 86.90% 29.68 2.58 75.10% 20.45 2.23 78.30% 4.10 1.46 97.70% 29.85 3.19 74.30%

Training

# Usage:
# python train.py <config>

python train.py configs/interiornet/interiornet_cv_distribution.yaml

python train.py configs/interiornetT/interiornetT_cv_distribution.yaml

python train.py configs/sun360/sun360_cv_distribution_overlap.yaml
python train.py configs/sun360/sun360_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearn/streetlearn_cv_distribution_overlap.yaml
python train.py configs/streetlearn/streetlearn_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

python train.py configs/streetlearnT/streetlearnT_cv_distribution_overlap.yaml
python train.py configs/streetlearnT/streetlearnT_cv_distribution.yaml --resume --pretrained <checkpoint_filename>

For SUN360 and StreetLearn dataset, finetune from the pretrained model, which is training with only overlapping pairs, at epoch 10. More configs about baselines can be found in the folder configs/sun360.

Cite

Please cite our work if you find it useful:

@inproceedings{Cai2021Extreme,
 title={Extreme Rotation Estimation using Dense Correlation Volumes},
 author={Cai, Ruojin and Hariharan, Bharath and Snavely, Noah and Averbuch-Elor, Hadar},
 booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
 year={2021}
}

Acknowledgment

This work was supported in part by the National Science Foundation (IIS-2008313) and by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program and the Zuckerman STEM leadership program.

Owner
Ruojin Cai
Ph.D. student at Cornell University
Ruojin Cai
Here is the diagnostic tool for BMVC 2021 paper Diagnosing Errors in Video Relation Detectors.

Here is the diagnostic tool for BMVC 2021 paper Diagnosing Errors in Video Relation Detectors. We provide a tiny ground truth file demo_gt.json, and t

Shuo Chen 3 Dec 26, 2022
ReferFormer - Official Implementation of ReferFormer

The official implementation of the paper: Language as Queries for Referring Vide

Jonas Wu 232 Dec 29, 2022
Makes patches from huge resolution .svs slide files using openslide

openslide_patcher Makes patches from huge resolution .svs slide files using openslide Example collage I made from outputs:

2 Dec 23, 2021
Official implementation of the paper "Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering"

Light Field Networks Project Page | Paper | Data | Pretrained Models Vincent Sitzmann*, Semon Rezchikov*, William Freeman, Joshua Tenenbaum, Frédo Dur

Vincent Sitzmann 130 Dec 29, 2022
Mall-Customers-Segmentation - Customer Segmentation Using K-Means Clustering

Overview Customer Segmentation is one the most important applications of unsupervised learning. Using clustering techniques, companies can identify th

NelakurthiSudheer 2 Jan 03, 2022
Finite Element Analysis

FElupe - Finite Element Analysis FElupe is a Python 3.6+ finite element analysis package focussing on the formulation and numerical solution of nonlin

Andreas D. 20 Jan 09, 2023
PyTorch implementation of PSPNet segmentation network

pspnet-pytorch PyTorch implementation of PSPNet segmentation network Original paper Pyramid Scene Parsing Network Details This is a slightly different

Roman Trusov 532 Dec 29, 2022
Paper: De-rendering Stylized Texts

Paper: De-rendering Stylized Texts Wataru Shimoda1, Daichi Haraguchi2, Seiichi Uchida2, Kota Yamaguchi1 1CyberAgent.Inc, 2 Kyushu University Accepted

CyberAgent AI Lab 55 Dec 18, 2022
Apache Flink

Apache Flink Apache Flink is an open source stream processing framework with powerful stream- and batch-processing capabilities. Learn more about Flin

The Apache Software Foundation 20.4k Dec 30, 2022
Simple cross-platform application for DaVinci surgical video frame annotation

About DaVid is a simple cross-platform GUI for annotating robotic and endoscopic surgical actions for use in deep-learning research. Features Simple a

Cyril Zakka 4 Oct 09, 2021
Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange

MyTT Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! to Stock Market Financial Technical Analysis Python

dev 34 Dec 27, 2022
Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Just playing with getting VQGAN+CLIP running locally, rather than having to use colab.

Nerdy Rodent 2.3k Jan 04, 2023
Code for all the Advent of Code'21 challenges mostly written in python

Advent of Code 21 Code for all the Advent of Code'21 challenges mostly written in python. They are not necessarily the best or fastest solutions but j

4 May 26, 2022
[CVPR 2022] Deep Equilibrium Optical Flow Estimation

Deep Equilibrium Optical Flow Estimation This is the official repo for the paper Deep Equilibrium Optical Flow Estimation (CVPR 2022), by Shaojie Bai*

CMU Locus Lab 136 Dec 18, 2022
OpenMMLab 3D Human Parametric Model Toolbox and Benchmark

Introduction English | 简体中文 MMHuman3D is an open source PyTorch-based codebase for the use of 3D human parametric models in computer vision and comput

OpenMMLab 782 Jan 04, 2023
Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders"

AAVAE Official implementation of the paper "AAVAE: Augmentation-AugmentedVariational Autoencoders" Abstract Recent methods for self-supervised learnin

Grid AI Labs 48 Dec 12, 2022
Official implementation of Neural Bellman-Ford Networks (NeurIPS 2021)

NBFNet: Neural Bellman-Ford Networks This is the official codebase of the paper Neural Bellman-Ford Networks: A General Graph Neural Network Framework

MilaGraph 136 Dec 21, 2022
Face Recognition Attendance Project

Face-Recognition-Attendance-Project In This Project You will learn how to mark attendance using face recognition, Hello Guys This is Gautam Kumar, Thi

Gautam Kumar 1 Dec 03, 2022
Super-Fast-Adversarial-Training - A PyTorch Implementation code for developing super fast adversarial training

Super-Fast-Adversarial-Training This is a PyTorch Implementation code for develo

LBK 26 Dec 02, 2022
Code for Towards Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games

Unifying Behavioral and Response Diversity for Open-ended Learning in Zero-sum Games How to run our algorithm? Create the new environment using: conda

MARL @ SJTU 8 Dec 27, 2022