Multi-query Video Retrieval

This repository contains the code for the paper:

@misc{wang2022multiquery,
      title={Multi-query Video Retrieval}, 
      author={Zeyu Wang and Yu Wu and Karthik Narasimhan and Olga Russakovsky},
      year={2022},
      eprint={2201.03639},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Data Preparation

Download the raw videos for MSR-VTT, MSVD, and VATEX, and put them into the data/{dataset}/raw_videos folder.
Run the script data/extract_frames.sh to extract frames from the raw videos.

The resulting data folder is structured like this:

├── data
    ├── msrvtt
        ├── msrvtt_train.json
        ├── msrvtt_test.json
        ├── msrvtt_test_varying_query_sample_1-20.json
        ├── raw_videos
            ├── video0.mp4
            ├── ...
        ├── extracted_frames
            ├── video0.mp4
                ├── 0.jpg
                ├── ...
            ├── ...
    ├── msvd
        ├── ...
    ├── vatex
        ├── ...
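Before training, it can help to confirm that the folder layout matches the tree above. The following is a minimal sketch, not part of the repository: the function name check_dataset_layout is hypothetical, and it assumes that every dataset follows the {dataset}_train.json / {dataset}_test.json naming shown for msrvtt and that each raw video gets a same-named folder of .jpg frames.

```python
from pathlib import Path

# Dataset folder names taken from the tree above.
DATASETS = ("msrvtt", "msvd", "vatex")


def check_dataset_layout(data_root, dataset):
    """Return a list of human-readable problems found for one dataset."""
    root = Path(data_root) / dataset
    problems = []

    # Assumption: annotation files follow the naming shown for msrvtt;
    # other datasets may use different file names.
    for split in ("train", "test"):
        ann = root / f"{dataset}_{split}.json"
        if not ann.is_file():
            problems.append(f"missing annotation file: {ann}")

    raw_dir = root / "raw_videos"
    frames_dir = root / "extracted_frames"
    if not raw_dir.is_dir():
        problems.append(f"missing folder: {raw_dir}")
        return problems
    if not frames_dir.is_dir():
        problems.append(f"missing folder: {frames_dir}")
        return problems

    # Each raw video should have a same-named folder with at least one .jpg.
    for video in sorted(p for p in raw_dir.iterdir() if p.is_file()):
        frame_folder = frames_dir / video.name
        if not frame_folder.is_dir() or not any(frame_folder.glob("*.jpg")):
            problems.append(f"no extracted frames for: {video.name}")
    return problems


if __name__ == "__main__":
    for name in DATASETS:
        for problem in check_dataset_layout("data", name):
            print(problem)
```

Run from the repository root, the script prints one line per problem and prints nothing when the layout looks complete.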

For the Frozen model, download the pretrained checkpoint provided by the original authors here, and put it into the record/pretrained folder.

Training

Run command: python train.py -c configs/{config_path}

Evaluation

Run command: python evaluate.py -c configs/{config_path}
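To run the same script over several configs in sequence, the commands above can be wrapped in a small helper. This is a sketch under stated assumptions: run_configs and build_command are hypothetical names, the config file names passed in are placeholders, and only the -c flag shown in this README is used.

```python
import subprocess
import sys


def build_command(script, config_path):
    """Build the argument list for `python {script} -c configs/{config_path}`."""
    return [sys.executable, script, "-c", f"configs/{config_path}"]


def run_configs(script, config_paths, dry_run=False):
    """Run `script` (train.py or evaluate.py) once per config, in order."""
    commands = [build_command(script, path) for path in config_paths]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a run exits with an error,
            # so later configs are not started after a failure.
            subprocess.run(cmd, check=True)
    return commands
```

For example, run_configs("evaluate.py", ["example.json"], dry_run=True) only prints the command it would run, where example.json stands in for a real file under configs/.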

Acknowledgements

The structure of this repository is based on https://github.com/victoresque/pytorch-template. Some of the code is adapted from https://github.com/m-bain/frozen-in-time and https://github.com/ArrowLuo/CLIP4Clip.

Owner

Princeton Visual AI Lab
