Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Related tags

Deep LearningWAKD
Overview

Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation

Introduction

WAKD is a PyTorch implementation for our ICPR-2022 paper "Weak-supervised Visual Geo-localization via Attention-based Knowledge Distillation".

Installation

We test this repo with Python 3.8, PyTorch 1.9.0, and CUDA 10.2. But it should be runnable with recent PyTorch versions (Pytorch >=1.0.0).

python setup.py develop

Preparation

Datasets

We test our models on three geo-localization benchmarks, Pittsburgh, Tokyo 24/7 and Tokyo Time Machine datasets. The three datasets can be downloaded at here.

The directory of datasets used is like

datasets/data
├── pitts
│   ├── raw
│   │   ├── pitts250k_test.mat
│   │   ├── pitts250k_train.mat
│   │   ├── pitts250k_val.mat
│   │   ├── pitts30k_test.mat
│   │   ├── pitts30k_train.mat
│   │   ├── pitts30k_val.mat
│   └── └── Pittsburgh
│           ├──images/
│           └──queries/
└── tokyo
    ├── raw
    │   ├── tokyo247
    │   │   ├──images/
    │   │   └──query/
    │   ├── tokyo247.mat
    │   ├── tokyoTM/images/
    │   ├── tokyoTM_train.mat
    └── └── tokyoTM_val.mat

Pre-trained Weights

The file tree we used for storing the pre-trained weights is like

logs
├── vgg16_pretrained.pth.tar # refer to (1)
├── mbv3_large.pth.tar
└── vgg16_pitts_64_desc_cen.hdf5 # refer to (2)
└── mobilenetv3_large_pitts_64_desc_cen.hdf5

(1) ImageNet-pretrained weights for CNNs backbone

The ImageNet-pretrained weights for CNNs backbone or the pretrained weights for the whole model.

(2) initial cluster centers for VLAD layer

Note that the VLAD layer cannot work with random initialization. The original cluster centers provided by NetVLAD or self-computed cluster centers by running the scripts/cluster.sh.

./scripts/cluster.sh mobilenetv3_large

Training

Train by running script in the terminal. Script location: scripts/train_wakd_st.sh

Format:

bash scripts/train_wakd_st.sh arch archT

where, arch is the backbone name, such as mobilenetv3_large. archT is the teacher backbone name, such as vgg16.

For example:

bash scripts/train_wakd_st.sh mobilenetv3_large vgg16

In the train_wakd_st.sh. In case you want to fasten testing, enlarge GPUS for more GPUs, or enlarge the --tuple-size for more tuples on one GPU. In case your GPU does not have enough memory, reduce --pos-num or --neg-num for fewer positives or negatives in one tuple.

Testing

Test by running script in the terminal. Script location: scripts/test.sh

Format:

bash scripts/test.sh resume arch dataset scale

where, resume is the trained model path. arch is the backbone name, such as vgg16, mobilenetv3_large and resnet152. dataset scale, such as pitts 30k and pitts 250k.

For example:

  1. Test mobilenetv3_large on pitts 250k:
bash scripts/test.sh logs/netVLAD/pitts30k-mobilenetv3_large/model_best.pth.tar mobilenetv3_large pitts 250k
  1. Test vgg16 on tokyo:
bash scripts/test.sh logs/netVLAD/pitts30k-vgg16/model_best.pth.tar model_best.pth.tar vgg16 tokyo

In the test.sh. In case you want to fasten testing, enlarge GPUS for more GPUs, or enlarge the --test-batch-size for larger batch size on one GPU. In case your GPU does not have enough memory, reduce --test-batch-size for smaller batch size on one GPU.

Acknowledgements

We truely thanksful of the following two piror works. Particularly, part of the code is inspired by [pytorch-NetVlad]

  • NetVLAD: CNN architecture for weakly supervised place recognition (CVPR'16) [paper] [pytorch-NetVlad]
  • SARE: Stochastic Attraction-Repulsion Embedding for Large Scale Image Localization (ICCV'19) [paper] [deepIBL]
Adabelief-Optimizer - Repository for NeurIPS 2020 Spotlight "AdaBelief Optimizer: Adapting stepsizes by the belief in observed gradients"

AdaBelief Optimizer NeurIPS 2020 Spotlight, trains fast as Adam, generalizes well as SGD, and is stable to train GANs. Release of package We have rele

Juntang Zhuang 998 Dec 29, 2022
Machine Learning University: Accelerated Computer Vision Class

Machine Learning University: Accelerated Computer Vision Class This repository contains slides, notebooks, and datasets for the Machine Learning Unive

AWS Samples 1.3k Dec 28, 2022
🥈78th place in Riiid Solution🥈

Riiid Answer Correctness Prediction Introduction This repository is the code that placed 78th in Riiid Answer Correctness Prediction competition. Requ

ds wook 14 Apr 26, 2022
Code from Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation

PiecewiseLinearTimeSeriesApproximation code from Daniel Lemire, A Better Alternative to Piecewise Linear Time Series Segmentation, SIAM Data Mining 20

Daniel Lemire 21 Oct 27, 2022
Label-Free Model Evaluation with Semi-Structured Dataset Representations

Label-Free Model Evaluation with Semi-Structured Dataset Representations Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch

8 Oct 06, 2022
Official PyTorch implementation of PS-KD

Self-Knowledge Distillation with Progressive Refinement of Targets (PS-KD) Accepted at ICCV 2021, oral presentation Official PyTorch implementation of

61 Dec 28, 2022
A simple version for graphfpn

GraphFPN: Graph Feature Pyramid Network for Object Detection Download graph-FPN-main.zip For training , run: python train.py For test with Graph_fpn

WorldGame 67 Dec 25, 2022
交互式标注软件,暂定名 iann

iann 交互式标注软件,暂定名iann。 安装 按照官网介绍安装paddle。 安装其他依赖 pip install -r requirements.txt 运行 git clone https://github.com/PaddleCV-SIG/iann/ cd iann python iann

294 Dec 30, 2022
Demo for Real-time RGBD-based Extended Body Pose Estimation paper

Real-time RGBD-based Extended Body Pose Estimation This repository is a real-time demo for our paper that was published at WACV 2021 conference The ou

Renat Bashirov 118 Dec 26, 2022
Clinica is a software platform for clinical research studies involving patients with neurological and psychiatric diseases and the acquisition of multimodal data

Clinica Software platform for clinical neuroimaging studies Homepage | Documentation | Paper | Forum | See also: AD-ML, AD-DL ClinicaDL About The Proj

ARAMIS Lab 165 Dec 29, 2022
TCNN Temporal convolutional neural network for real-time speech enhancement in the time domain

TCNN Pandey A, Wang D L. TCNN: Temporal convolutional neural network for real-time speech enhancement in the time domain[C]//ICASSP 2019-2019 IEEE Int

凌逆战 16 Dec 30, 2022
Implementation of gaze tracking and demo

Predicting Customer Demand by Using Gaze Detecting and Object Tracking This project is the integration of gaze detecting and object tracking. Predict

2 Oct 20, 2022
Unsupervised Image Generation with Infinite Generative Adversarial Networks

Unsupervised Image Generation with Infinite Generative Adversarial Networks Here is the implementation of MICGANs using DCGAN architecture on MNIST da

16 Dec 24, 2021
Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach

Performance Analysis of Multi-user NOMA Wireless-Powered mMTC Networks: A Stochastic Geometry Approach Thanh Luan Nguyen, Tri Nhu Do, Georges Kaddoum

Thanh Luan Nguyen 2 Oct 10, 2022
Axel - 3D printed robotic hands and they controll with Raspberry Pi and Arduino combo

Axel It's our graduation project about 3D printed robotic hands and they control

0 Feb 14, 2022
Powerful and efficient Computer Vision Annotation Tool (CVAT)

Computer Vision Annotation Tool (CVAT) CVAT is free, online, interactive video and image annotation tool for computer vision. It is being used by our

OpenVINO Toolkit 8.6k Jan 01, 2023
TLoL (Python Module) - League of Legends Deep Learning AI (Research and Development)

TLoL-py - League of Legends Deep Learning Library TLoL-py is the Python component of the TLoL League of Legends deep learning library. It provides a s

7 Nov 29, 2022
PyTorch implementation of UNet++ (Nested U-Net).

PyTorch implementation of UNet++ (Nested U-Net) This repository contains code for a image segmentation model based on UNet++: A Nested U-Net Architect

4ui_iurz1 642 Jan 04, 2023
A3C LSTM Atari with Pytorch plus A3G design

NEWLY ADDED A3G A NEW GPU/CPU ARCHITECTURE OF A3C FOR SUBSTANTIALLY ACCELERATED TRAINING!! RL A3C Pytorch NEWLY ADDED A3G!! New implementation of A3C

David Griffis 532 Jan 02, 2023
The Most Efficient Temporal Difference Learning Framework for 2048

moporgic/TDL2048+ TDL2048+ is a highly optimized temporal difference (TD) learning framework for 2048. Features Many common methods related to 2048 ar

Hung Guei 5 Nov 23, 2022