Near-Duplicate Video Retrieval with Deep Metric Learning

Overview

Near-Duplicate Video Retrieval
with Deep Metric Learning

This repository contains the Tensorflow implementation of the paper Near-Duplicate Video Retrieval with Deep Metric Learning. It provides code for training and evalutation of a Deep Metric Learning (DML) network on the problem of Near-Duplicate Video Retrieval (NDVR). During training, the DML network is fed with video triplets, generated by a triplet generator. The network is trained based on the triplet loss function. The architecture of the network is displayed in the figure below. For evaluation, mean Average Precision (mAP) and Presicion-Recall curve (PR-curve) are calculated. Two publicly available dataset are supported, namely VCDB and CC_WEB_VIDEO.

Prerequisites

  • Python
  • Tensorflow 1.xx

Getting started

Installation

  • Clone this repo:
git clone https://github.com/MKLab-ITI/ndvr-dml
cd ndvr-dml
  • You can install all the dependencies by
pip install -r requirements.txt

or

conda install --file requirements.txt

Triplet generation

Run the triplet generation process for each dataset, VCDB and CC_WEB_VIDEO. This process will generate two files for each dataset:

  1. the global feature vectors for each video in the dataset:
    <output_dir>/<dataset>_features.npy
  2. the generated triplets:
    <output_dir>/<dataset>_triplets.npy

To execute the triplet generation process, do as follows:

  • The code does not extract features from videos. Instead, the .npy files of the already extracted features have to be provided. You may use the tool in here to do so.

  • Create a file that contains the video id and the path of the feature file for each video in the processing dataset. Each line of the file have to contain the video id (basename of the video file) and the full path to the corresponding .npy file of its features, separated by a tab character (\t). Example:

      23254771545e5d278548ba02d25d32add952b2a4	features/23254771545e5d278548ba02d25d32add952b2a4.npy
      468410600142c136d707b4cbc3ff0703c112575d	features/468410600142c136d707b4cbc3ff0703c112575d.npy
      67f1feff7f624cf0b9ac2ebaf49f547a922b4971	features/67f1feff7f624cf0b9ac2ebaf49f547a922b4971.npy
                                               ...	
    
  • Run the triplet generator and provide the generated file from the previous step, the name of the processed dataset, and the output directory.

python triplet_generator.py --dataset vcdb --feature_files vcdb_feature_files.txt --output_dir output_data/

DML training

  • Train the DML network by providing the global features and triplet of VCDB, and a directory to save the trained model.
python train_dml.py --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/ 
  • Triplets from the CC_WEB_VIDEO can be injected if the global features and triplet of the evaluation set are provide.
python train_dml.py --evaluation_set output_data/cc_web_video_features.npy --evaluation_triplets output_data/cc_web_video_triplets.npy --train_set output_data/vcdb_features.npy --triplets output_data/vcdb_triplets.npy --model_path model/

Evaluation

  • Evaluate the performance of the system by providing the trained model path and the global features of the CC_WEB_VIDEO.
python evaluation.py --fusion Early --evaluation_set output_data/cc_vgg_features.npy --model_path model/

OR

python evaluation.py --fusion Late --evaluation_features cc_web_video_feature_files.txt --evaluation_set output_data/cc_vgg_features.npy --model_path model/
  • The mAP and PR-curve are returned

Citation

If you use this code for your research, please cite our paper.

@inproceedings{kordopatis2017dml,
  title={Near-Duplicate Video Retrieval with Deep Metric Learning},
  author={Kordopatis-Zilos, Giorgos and Papadopoulos, Symeon and Patras, Ioannis and Kompatsiaris, Yiannis},
  booktitle={2017 IEEE International Conference on Computer Vision Workshop (ICCVW)},
  year={2017},
}

Related Projects

ViSiL Intermediate-CNN-Features FIVR-200K

License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details

Contact for further details about the project

Giorgos Kordopatis-Zilos ([email protected])
Symeon Papadopoulos ([email protected])

The official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness.

This repository is the official implementation of A Unified Game-Theoretic Interpretation of Adversarial Robustness. Requirements pip install -r requi

Jie Ren 17 Dec 12, 2022
🏅 Top 5% in 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지

AI_SPARK_CHALLENG_Object_Detection 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지 🏅 Top 5% in mAP(0.75) (443명 중 13등, mAP: 0.98116) 대회 설명 Edge 환경에서의 가축 Object Dete

3 Sep 19, 2022
MAVE: : A Product Dataset for Multi-source Attribute Value Extraction

MAVE: : A Product Dataset for Multi-source Attribute Value Extraction The dataset contains 3 million attribute-value annotations across 1257 unique ca

Google Research Datasets 89 Jan 08, 2023
Official project repository for 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination'

NCAE_UAD Official project repository of 'Normality-Calibrated Autoencoder for Unsupervised Anomaly Detection on Data Contamination' Abstract In this p

Jongmin Andrew Yu 2 Feb 10, 2022
Rafael Project- Classifying rockets to different types using data science algorithms.

Rocket-Classify Rafael Project- Classifying rockets to different types using data science algorithms. In this project we received data base with data

Hadassah Engel 5 Sep 18, 2021
Koç University deep learning framework.

Knet Knet (pronounced "kay-net") is the Koç University deep learning framework implemented in Julia by Deniz Yuret and collaborators. It supports GPU

1.4k Dec 31, 2022
Replication attempt for the Protein Folding Model

RGN2-Replica (WIP) To eventually become an unofficial working Pytorch implementation of RGN2, an state of the art model for MSA-less Protein Folding f

Eric Alcaide 36 Nov 29, 2022
Next-gen Rowhammer fuzzer that uses non-uniform, frequency-based patterns.

Blacksmith Rowhammer Fuzzer This repository provides the code accompanying the paper Blacksmith: Scalable Rowhammering in the Frequency Domain that is

Computer Security Group @ ETH Zurich 173 Nov 16, 2022
PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

PyTorch implementation of Decoupling Value and Policy for Generalization in Reinforcement Learning

48 Dec 08, 2022
[ICCV 2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation

ADDS-DepthNet This is the official implementation of the paper Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation I

LIU_LINA 52 Nov 24, 2022
A novel framework to automatically learn high-quality scanning of non-planar, complex anisotropic appearance.

appearance-scanner About This repository is an implementation of the neural network proposed in Free-form Scanning of Non-planar Appearance with Neura

Xiaohe Ma 14 Oct 18, 2022
Semi-Supervised Semantic Segmentation with Pixel-Level Contrastive Learning from a Class-wise Memory Bank

This repository provides the official code for replicating experiments from the paper: Semi-Supervised Semantic Segmentation with Pixel-Level Contrast

Iñigo Alonso Ruiz 58 Dec 15, 2022
A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM's

sign-language-detection A Sign Language detection project using Mediapipe landmark detection and Tensorflow LSTM. The project is built for a vocabular

Hashim 4 Feb 06, 2022
Detect roadway lanes using Python OpenCV for project during the 5th semester at DHBW Stuttgart for lecture in digital image processing.

Find Line Detection (Image Processing) Identifying lanes of the road is very common task that human driver performs. It's important to keep the vehicl

LMF 4 Jun 21, 2022
From the basics to slightly more interesting applications of Tensorflow

TensorFlow Tutorials You can find python source code under the python directory, and associated notebooks under notebooks. Source code Description 1 b

Parag K Mital 5.6k Jan 09, 2023
High-Fidelity Pluralistic Image Completion with Transformers (ICCV 2021)

Image Completion Transformer (ICT) Project Page | Paper (ArXiv) | Pre-trained Models | Supplemental Material This repository is the official pytorch i

Ziyu Wan 243 Jan 03, 2023
Animate molecular orbital transitions using Psi4 and Blender

Molecular Orbital Transitions (MOT) Animate molecular orbital transitions using Psi4 and Blender Author: Maximilian Paradiz Dominguez, University of A

3 Feb 01, 2022
A package for music online and offline rhythmic information analysis including music Beat, downbeat, tempo and meter tracking.

BeatNet A package for music online and offline rhythmic information analysis including music Beat, downbeat, tempo and meter tracking. This repository

Mojtaba Heydari 157 Dec 27, 2022
A web-based application for quick, scalable, and automated hyperparameter tuning and stacked ensembling in Python.

Xcessiv Xcessiv is a tool to help you create the biggest, craziest, and most excessive stacked ensembles you can think of. Stacked ensembles are simpl

Reiichiro Nakano 1.3k Nov 17, 2022
Mosaic of Object-centric Images as Scene-centric Images (MosaicOS) for long-tailed object detection and instance segmentation.

MosaicOS Mosaic of Object-centric Images as Scene-centric Images (MosaicOS) for long-tailed object detection and instance segmentation. Introduction M

Cheng Zhang 27 Oct 12, 2022