Tree-based Search Graph for Approximate Nearest Neighbor Search

Last update: Dec 27, 2022

Overview

TBSG: Tree-based Search Graph for Approximate Nearest Neighbor Search.

TBSG is a graph-based algorithm for approximate nearest neighbor search (ANNS). It is built on the Cover Tree and is also an approximation of the Monotonic Search Network (MSNET). It is designed for efficient search at high precision.
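
To make the MSNET idea concrete: roughly, in a monotonic search network a greedy walk can keep moving to a node that is strictly closer to the target. The sketch below is a generic greedy search over a neighbor graph, not TBSG's own search routine; the function name `greedy_search` and the toy data are assumptions made for illustration.

```python
import numpy as np

def greedy_search(points, neighbors, query, start):
    """Walk the graph from `start`, always moving to the neighbor closest to `query`.

    Stops when no neighbor is closer than the current node, so the distance
    to the query decreases strictly along the returned path.
    """
    current = start
    current_dist = float(np.linalg.norm(points[current] - query))
    path = [current]
    while True:
        best, best_dist = current, current_dist
        for n in neighbors[current]:
            d = float(np.linalg.norm(points[n] - query))
            if d < best_dist:
                best, best_dist = n, d
        if best == current:
            # Local minimum: no neighbor improves on the current node.
            return path
        current, current_dist = best, best_dist
        path.append(current)

# Toy data (hypothetical): five points on a line, each linked to its adjacent points.
points = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
path = greedy_search(points, neighbors, query=np.array([3.2]), start=0)
print(path)  # [0, 1, 2, 3]
```

On a graph without the monotonic property, the same walk can stop at a local minimum that is not the true nearest neighbor, which is why graph-based methods usually keep a pool of candidates instead of a single current node.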

Benchmark datasets

Dataset | No. of base vectors | Dimension | No. of queries | Download link
Sift | 1,000,000 | 128 | 10,000 | (http://corpus-texmex.irisa.fr/)
Gist | 1,000,000 | 960 | 1,000 | (http://corpus-texmex.irisa.fr/)
Glove | 1,183,514 | 100 | 10,000 | (http://downloads.zjulearning.org.cn/data/glove-100.tar.gz)
Crawl | 1,989,995 | 300 | 10,000 | (http://commoncrawl.org/)
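
The Sift and Gist files on the corpus-texmex page are distributed in the `.fvecs` format, where each vector is stored as a 4-byte little-endian integer (the dimension) followed by that many 4-byte floats. The sketch below reads such a file with NumPy. Whether TBSG expects exactly this layout for every dataset is an assumption, and `read_fvecs` and the toy file are hypothetical names used only for illustration.

```python
import os
import tempfile

import numpy as np

def read_fvecs(path):
    """Read an .fvecs file into an (n, dim) float32 array."""
    raw = np.fromfile(path, dtype="<i4")
    if raw.size == 0:
        return np.empty((0, 0), dtype=np.float32)
    dim = int(raw[0])
    # Each record is one int32 header (the dimension) plus `dim` float32 values.
    rows = raw.reshape(-1, dim + 1)
    if not np.all(rows[:, 0] == dim):
        raise ValueError("inconsistent vector dimensions in file")
    # Drop the header column and reinterpret the remaining bytes as float32.
    return rows[:, 1:].copy().view("<f4")

# Round-trip check on a tiny file written in the same layout.
vectors = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], dtype="<f4")
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy.fvecs")
    with open(toy_path, "wb") as f:
        for v in vectors:
            np.array([v.size], dtype="<i4").tofile(f)
            v.tofile(f)
    loaded = read_fvecs(toy_path)

print(loaded.shape)  # (2, 3)
```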

How to use TBSG

1) compile

Prerequisites: OpenMP, CMake, Eigen3

$ cd /path/to/project  
$ cmake . && make

2) build an approximate kNNG

We use efanna_graph to build the kNNG.

3) create a TBSG index

$ cd /path/to/project/  
$ ./TBSG_index data_path M S MP nnfile save_path

data_path is the path of the base data.
M is the maximum number of neighbors per node.
S is the candidate set size used to build TBSG.
MP is the minimum value of min_prob.
nnfile is the file containing the k-nearest-neighbor graph (kNNG) built in step 2.
save_path is the path where the index is saved.

4) search with TBSG index

$ cd /path/to/project/
$ ./TBSG_search data_path query_path groundtruth_path save_path step

data_path is the path of the base data.
query_path is the path of the query data.
groundtruth_path is the path of the ground-truth data.
save_path is the path of the index saved in step 3.
step is the step size by which the search pool is expanded.
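
A ground-truth file is typically used to score search results with recall@k, the fraction of true k nearest neighbors that the search returns. The sketch below shows that computation. Whether TBSG_search reports exactly this metric is an assumption, and `recall_at_k` and the sample IDs are hypothetical.

```python
def recall_at_k(results, groundtruth, k):
    """Average fraction of the true top-k neighbors found in the returned top-k."""
    if not results or len(results) != len(groundtruth):
        raise ValueError("results and groundtruth must be non-empty and equally long")
    hits = 0
    for found, truth in zip(results, groundtruth):
        # Order within the top-k does not matter, only membership.
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(results))

# Two queries with k = 3: the first finds 2 of 3 true neighbors, the second all 3.
results = [[1, 2, 3], [4, 5, 6]]
groundtruth = [[1, 3, 9], [6, 5, 4]]
recall = recall_at_k(results, groundtruth, k=3)
print(round(recall, 4))  # 0.8333
```

Running the search with increasing pool sizes and recording recall against query time for each gives the usual precision/speed trade-off curve.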

Parameters used for four datasets

parameters for building kNNG

Dataset	K	L	iter	S	R
Sift	200	200	12	10	100
Gist	400	400	12	15	100
Glove	400	420	12	20	300
Crawl	400	420	12	20	100

parameters for building index

Datasets	M	S	MP
Sift	50	100	0.53
Gist	70	200	0.515
Glove	80	300	0.53
Crawl	50	200	0.53
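
As a convenience, the index parameters in the table above can be turned into TBSG_index command lines, following the argument order `data_path M S MP nnfile save_path` given in step 3. This is only a sketch: the helper name `index_command` and the file names in the usage line are hypothetical placeholders, not files shipped with the project.

```python
# Index parameters copied from the table above.
INDEX_PARAMS = {
    "Sift": {"M": 50, "S": 100, "MP": 0.53},
    "Gist": {"M": 70, "S": 200, "MP": 0.515},
    "Glove": {"M": 80, "S": 300, "MP": 0.53},
    "Crawl": {"M": 50, "S": 200, "MP": 0.53},
}

def index_command(dataset, data_path, nnfile, save_path):
    """Build the TBSG_index argument list for one of the four datasets."""
    p = INDEX_PARAMS[dataset]
    return [
        "./TBSG_index",
        data_path,
        str(p["M"]),
        str(p["S"]),
        str(p["MP"]),
        nnfile,
        save_path,
    ]

# Hypothetical file names, for illustration only.
cmd = index_command("Sift", "sift_base.fvecs", "sift_200nn.graph", "sift.tbsg")
print(" ".join(cmd))
# ./TBSG_index sift_base.fvecs 50 100 0.53 sift_200nn.graph sift.tbsg
```

The resulting list can be passed to `subprocess.run` once the binary has been compiled as described in step 1.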

Owner

Fanxbin