《Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching》(CVPR 2020)

Overview

Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching

This contains the codes for cross-view geo-localization method described in: Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching, CVPR2020. alt text

Abstract

Cross-view geo-localization is the problem of estimating the position and orientation (latitude, longitude and azimuth angle) of a camera at ground level given a large-scale database of geo-tagged aerial (\eg, satellite) images. Existing approaches treat the task as a pure location estimation problem by learning discriminative feature descriptors, but neglect orientation alignment. It is well-recognized that knowing the orientation between ground and aerial images can significantly reduce matching ambiguity between these two views, especially when the ground-level images have a limited Field of View (FoV) instead of a full field-of-view panorama. Therefore, we design a Dynamic Similarity Matching network to estimate cross-view orientation alignment during localization. In particular, we address the cross-view domain gap by applying a polar transform to the aerial images to approximately align the images up to an unknown azimuth angle. Then, a two-stream convolutional network is used to learn deep features from the ground and polar-transformed aerial images. Finally, we obtain the orientation by computing the correlation between cross-view features, which also provides a more accurate measure of feature similarity, improving location recall. Experiments on standard datasets demonstrate that our method significantly improves state-of-the-art performance. Remarkably, we improve the top-1 location recall rate on the CVUSA dataset by a factor of $1.5\times$ for panoramas with known orientation, by a factor of $3.3\times$ for panoramas with unknown orientation, and by a factor of $6\times$ for $180^{\circ}$-FoV images with unknown orientation.

Experiment Dataset

We use two existing dataset to do the experiments

  • CVUSA dataset: a dataset in America, with pairs of ground-level images and satellite images. All ground-level images are panoramic images.
    The dataset can be accessed from https://github.com/viibridges/crossnet

  • CVACT dataset: a dataset in Australia, with pairs of ground-level images and satellite images. All ground-level images are panoramic images.
    The dataset can be accessed from https://github.com/Liumouliu/OriCNN

Dataset Preparation: Polar transform

  1. Please Download the two datasets from above links, and then put them under the director "Data/". The structure of the director "Data/" should be: "Data/CVUSA/ Data/ANU_data_small/"
  2. Please run "data_preparation.py" to get polar transformed aerial images of the two datasets and pre-crop-and-resize the street-view images in CVACT dataset to accelerate the training speed.

Codes

Codes for training and testing on unknown orientation (train_grd_noise=360) and different FoV.

  1. Training: CVUSA: python train_cvusa_fov.py --polar 1 --train_grd_noise 360 --train_grd_FOV $YOUR_FOV --test_grd_FOV $YOUR_FOV CVACT: python train_cvact_fov.py --polar 1 --train_grd_noise 360 --train_grd_FOV $YOUR_FOV --test_grd_FOV $YOUR_FOV

  2. Evaluation: CVUSA: python test_cvusa_fov.py --polar 1 --train_grd_noise 360 --train_grd_FOV $YOUR_FOV --test_grd_FOV $YOUR_FOV CVACT: python test_cvact_fov.py --polar 1 --train_grd_noise 360 --train_grd_FOV $YOUR_FOV --test_grd_FOV $YOUR_FOV

Note that the test set construction operations are inside the data preparation script, polar_input_data_orien_FOV_3.py for CVUSA and ./OriNet_CVACT/input_data_act_polar_3.py for CVACT. We use "np.random.rand(2019)" in test_cvusa_fov.py and test_cvact_fov.py to make sure the constructed test set is the same one whenever they are used for performance evaluation for different models.

In case readers are interested to see the query images of newly constructed test sets where the ground images are with unkown orientation and small FoV, we provide the following two python scripts to save the images and their ground truth orientations at the local disk:

  • CVUSA datset: python generate_test_data_cvusa.py

  • CVACT dataset: python generate_test_data_cvact.py

Readers are encouraged to visit "https://github.com/Liumouliu/OriCNN" to access codes for evaluation on the fine-grained geo-localization CVACT_test set.

Models:

Our trained models for CVUSA and CVACT are available in here.

There is also an "Initialize" model for your own training step. The VGG16 part in the "Initialize" model is initialised by the online model and other parts are initialised randomly.

Please put them under the director of "Model/" and then you can use them for training or evaluation.

Publications

This work is published in CVPR 2020.
[Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching]

If you are interested in our work and use our code, we are pleased that you can cite the following publication:
Yujiao Shi, Xin Yu, Dylan Campbell, Hongdong Li. Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching.

@inproceedings{shi2020where, title={Where am I looking at? Joint Location and Orientation Estimation by Cross-View Matching}, author={Shi, Yujiao and Yu, Xin and Campbell, Dylan and Li, Hongdong}, booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition}, year={2020} }

DeepDiffusion: Unsupervised Learning of Retrieval-adapted Representations via Diffusion-based Ranking on Latent Feature Manifold

DeepDiffusion Introduction This repository provides the code of the DeepDiffusion algorithm for unsupervised learning of retrieval-adapted representat

4 Nov 15, 2022
This repo is a PyTorch implementation for Paper "Unsupervised Learning for Cuboid Shape Abstraction via Joint Segmentation from Point Clouds"

Unsupervised Learning for Cuboid Shape Abstraction via Joint Segmentation from Point Clouds This repository is a PyTorch implementation for paper: Uns

Kaizhi Yang 42 Dec 09, 2022
Implementation of paper "Decision-based Black-box Attack Against Vision Transformers via Patch-wise Adversarial Removal"

Patch-wise Adversarial Removal Implementation of paper "Decision-based Black-box Attack Against Vision Transformers via Patch-wise Adversarial Removal

4 Oct 12, 2022
Experiments with differentiable stacks and queues in PyTorch

Please use stacknn-core instead! StackNN This project implements differentiable stacks and queues in PyTorch. The data structures are implemented in s

Will Merrill 141 Oct 06, 2022
Repository accompanying the "Sign Pose-based Transformer for Word-level Sign Language Recognition" paper

by Matyáš Boháček and Marek Hrúz, University of West Bohemia Should you have any questions or inquiries, feel free to contact us here. Repository acco

Matyáš Boháček 30 Dec 30, 2022
CvT2DistilGPT2 is an encoder-to-decoder model that was developed for chest X-ray report generation.

CvT2DistilGPT2 Improving Chest X-Ray Report Generation by Leveraging Warm-Starting This repository houses the implementation of CvT2DistilGPT2 from [1

The Australian e-Health Research Centre 21 Dec 28, 2022
Session-based Recommendation, CoHHN, price preferences, interest preferences, Heterogeneous Hypergraph, Co-guided Learning, SIGIR2022

This is our implementation for the paper: Price DOES Matter! Modeling Price and Interest Preferences in Session-based Recommendation Xiaokun Zhang, Bo

Xiaokun Zhang 27 Dec 02, 2022
VOS: Learning What You Don’t Know by Virtual Outlier Synthesis

VOS This is the source code accompanying the paper VOS: Learning What You Don’t

248 Dec 25, 2022
Code for the paper "TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks"

TadGAN: Time Series Anomaly Detection Using Generative Adversarial Networks This is a Python3 / Pytorch implementation of TadGAN paper. The associated

Arun 92 Dec 03, 2022
This repo contains the official code of our work SAM-SLR which won the CVPR 2021 Challenge on Large Scale Signer Independent Isolated Sign Language Recognition.

Skeleton Aware Multi-modal Sign Language Recognition By Songyao Jiang, Bin Sun, Lichen Wang, Yue Bai, Kunpeng Li and Yun Fu. Smile Lab @ Northeastern

Isen (Songyao Jiang) 128 Dec 08, 2022
Implementation of ICLR 2020 paper "Revisiting Self-Training for Neural Sequence Generation"

Self-Training for Neural Sequence Generation This repo includes instructions for running noisy self-training algorithms from the following paper: Revi

Junxian He 45 Dec 31, 2022
Dynamic View Synthesis from Dynamic Monocular Video

Dynamic View Synthesis from Dynamic Monocular Video Project Website | Video | Paper Dynamic View Synthesis from Dynamic Monocular Video Chen Gao, Ayus

Chen Gao 139 Dec 28, 2022
Diverse Branch Block: Building a Convolution as an Inception-like Unit

Diverse Branch Block: Building a Convolution as an Inception-like Unit (PyTorch) (CVPR-2021) DBB is a powerful ConvNet building block to replace regul

253 Dec 24, 2022
NaturalProofs: Mathematical Theorem Proving in Natural Language

NaturalProofs: Mathematical Theorem Proving in Natural Language NaturalProofs: Mathematical Theorem Proving in Natural Language Sean Welleck, Jiacheng

Sean Welleck 83 Jan 05, 2023
Object Tracking and Detection Using OpenCV

Object tracking is one such application of computer vision where an object is detected in a video, otherwise interpreted as a set of frames, and the object’s trajectory is estimated. For instance, yo

Happy N. Monday 4 Aug 21, 2022
fastgradio is a python library to quickly build and share gradio interfaces of your trained fastai models.

fastgradio is a python library to quickly build and share gradio interfaces of your trained fastai models.

Ali Abdalla 34 Jan 05, 2023
Athena is the only tool that you will ever need to optimize your portfolio.

Athena Portfolio optimization is the process of selecting the best portfolio (asset distribution), out of the set of all portfolios being considered,

Indrajit 1 Mar 25, 2022
통일된 DataScience 폴더 구조 제공 및 가상환경 작업의 부담감 해소

Lucas coded by linux shell 목차 Mac버전 CookieCutter (autoenv) 1.How to Install autoenv 2.폴더 진입 시, activate 구현하기 3.폴더 탈출 시, deactivate 구현하기 4.Alias 설정하기 5

ello 3 Feb 21, 2022
ComputerVision - This repository aims at realized easy network architecture

ComputerVision This repository aims at realized easy network architecture Colori

DongDong 4 Dec 14, 2022
Random Erasing Data Augmentation. Experiments on CIFAR10, CIFAR100 and Fashion-MNIST

Random Erasing Data Augmentation =============================================================== black white random This code has the source code for

Zhun Zhong 654 Dec 26, 2022