Deep ViT Features as Dense Visual Descriptors

Overview

dino-vit-features

[paper] [project page]

Official implementation of the paper "Deep ViT Features as Dense Visual Descriptors".

teaser

We demonstrate the effectiveness of deep features extracted from a self-supervised, pre-trained ViT model (DINO-ViT) as dense patch descriptors via real-world vision tasks: (a-b) co-segmentation & part co-segmentation: given a set of input images (e.g., 4 input images), we automatically co-segment semantically common foreground objects (e.g., animals), and then further partition them into common parts; (c-d) point correspondence: given a pair of input images, we automatically extract a sparse set of corresponding points. We tackle these tasks by applying only lightweight, simple methodologies such as clustering or binning, to deep ViT features.

Setup

Our code is developed in pytorch on and requires the following modules: tqdm, faiss, timm, matplotlib, pydensecrf, opencv, scikit-learn. We use python=3.9 but our code should be runnable on any version above 3.6. We recomment running our code with any CUDA supported GPU for faster performance. We recommend setting the running environment via Anaconda by running the following commands:

$ conda env create -f env/dino-vit-feats-env.yml
$ conda activate dino-vit-feats-env

Otherwise, run the following commands in your conda environment:

$ conda install pytorch torchvision torchaudio cudatoolkit=11 -c pytorch
$ conda install tqdm
$ conda install -c conda-forge faiss
$ conda install -c conda-forge timm 
$ conda install matplotlib
$ pip install opencv-python
$ pip install git+https://github.com/lucasb-eyer/pydensecrf.git
$ conda install -c anaconda scikit-learn

ViT Extractor

We provide a wrapper class for a ViT model to extract dense visual descriptors in extractor.py. You can extract descriptors to .pt files using the following command:

python extractor.py --image_path 
   
     --output_path 
    

    
   

You can specify the pretrained model using the --model flag with the following options:

  • dino_vits8, dino_vits16, dino_vitb8, dino_vitb16 from the DINO repo.
  • vit_small_patch8_224, vit_small_patch16_224, vit_base_patch8_224, vit_base_patch16_224 from the timm repo.

You can specify the stride of patch extracting layer to increase resolution using the --stride flag.

Part Co-segmentation Open In Colab

We provide a notebook for running on a single example in part_cosegmentation.ipynb.

To run on several image sets, arrange each set in a directory, inside a data root directory:


   
    
|
|_ 
    
     
|  |
|  |_ img1.png
|  |_ img2.png
|   
|_ 
     
      
   |
   |_ img1.png
   |_ img2.png
   |_ img3.png
...

     
    
   

The following command will produce results in the specified :

python part_cosegmentation.py --root_dir 
   
     --save_dir 
    

    
   

Note: The default configuration in part_cosegmentation.ipynb is suited for running on small sets (e.g. < 10). Increase amount of num_crop_augmentations for more stable results (and increased runtime). The default configuration in part_cosegmentation.py is suited for larger sets (e.g. >> 10).

Co-segmentation Open In Colab

We provide a notebook for running on a single example in cosegmentation.ipynb.

To run on several image sets, arrange each set in a directory, inside a data root directory:


   
    
|
|_ 
    
     
|  |
|  |_ img1.png
|  |_ img2.png
|   
|_ 
     
      
   |
   |_ img1.png
   |_ img2.png
   |_ img3.png
...

     
    
   

The following command will produce results in the specified :

python cosegmentation.py --root_dir 
   
     --save_dir 
    

    
   

Point Correspondences Open In Colab

We provide a notebook for running on a single example in correpondences.ipynb.

To run on several image pairs, arrange each image pair in a directory, inside a data root directory:


   
    
|
|_ 
    
     
|  |
|  |_ img1.png
|  |_ img2.png
|   
|_ 
     
      
   |
   |_ img1.png
   |_ img2.png
...

     
    
   

The following command will produce results in the specified :

python correspondences.py --root_dir 
   
     --save_dir 
    

    
   

Citation

If you found this repository useful please consider starring and citing :

@article{amir2021deep,
    author    = {Shir Amir and Yossi Gandelsman and Shai Bagon and Tali Dekel},
    title     = {Deep ViT Features as Dense Visual Descriptors},
    journal   = {arXiv preprint arXiv:2112.05814},
    year      = {2021}
}
Owner
Shir Amir
Graduate Student @ Weizmann Institute of Science
Shir Amir
A C implementation for creating 2D voronoi diagrams

Branch OSX/Linux Windows master dev jc_voronoi A fast C/C++ header only implementation for creating 2D Voronoi diagrams from a point set Uses Fortune'

Mathias Westerdahl 481 Dec 29, 2022
Code for CVPR 2021 paper TransNAS-Bench-101: Improving Transferrability and Generalizability of Cross-Task Neural Architecture Search.

TransNAS-Bench-101 This repository contains the publishable code for CVPR 2021 paper TransNAS-Bench-101: Improving Transferrability and Generalizabili

Yawen Duan 17 Nov 20, 2022
Source code for "MusCaps: Generating Captions for Music Audio" (IJCNN 2021)

MusCaps: Generating Captions for Music Audio Ilaria Manco1 2, Emmanouil Benetos1, Elio Quinton2, Gyorgy Fazekas1 1 Queen Mary University of London, 2

Ilaria Manco 57 Dec 07, 2022
Orthogonal Over-Parameterized Training

The inductive bias of a neural network is largely determined by the architecture and the training algorithm. To achieve good generalization, how to effectively train a neural network is of great impo

Weiyang Liu 11 Apr 18, 2022
Winners of DrivenData's Overhead Geopose Challenge

Winners of DrivenData's Overhead Geopose Challenge

DrivenData 22 Aug 04, 2022
Official PyTorch implementation of "Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble" (NeurIPS'21)

Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble This is the code for reproducing the results of the paper Uncertainty-Bas

43 Nov 23, 2022
Deep Convolutional Generative Adversarial Networks

Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks Alec Radford, Luke Metz, Soumith Chintala All images in t

Alec Radford 3.4k Dec 29, 2022
Code release for DS-NeRF (Depth-supervised Neural Radiance Fields)

Depth-supervised NeRF: Fewer Views and Faster Training for Free Project | Paper | YouTube Pytorch implementation of our method for learning neural rad

524 Jan 08, 2023
An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

An implementation of the methods presented in Causal-BALD: Deep Bayesian Active Learning of Outcomes to Infer Treatment-Effects from Observational Data.

Andrew Jesson 9 Apr 04, 2022
Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation"

EgoNet Official project website for the CVPR 2021 paper "Exploring intermediate representation for monocular vehicle pose estimation". This repo inclu

Shichao Li 138 Dec 09, 2022
A PyTorch implementation of SIN: Superpixel Interpolation Network

SIN: Superpixel Interpolation Network This is is a PyTorch implementation of the superpixel segmentation network introduced in our PRICAI-2021 paper:

6 Sep 28, 2022
Compositional and Parameter-Efficient Representations for Large Knowledge Graphs

NodePiece - Compositional and Parameter-Efficient Representations for Large Knowledge Graphs NodePiece is a "tokenizer" for reducing entity vocabulary

Michael Galkin 107 Jan 04, 2023
Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data

1 Meta-FDMIxup Repository for the paper : Meta-FDMixup: Cross-Domain Few-Shot Learning Guided byLabeled Target Data. (ACM MM 2021) paper News! the rep

Fu Yuqian 44 Nov 18, 2022
Robocop is your personal mini voice assistant made using Python.

Robocop-VoiceAssistant To use this project, you should have python installed in your system. If you don't have python installed, install it beforehand

Sohil Khanduja 3 Feb 26, 2022
SynNet - synthetic tree generation using neural networks

SynNet This repo contains the code and analysis scripts for our amortized approach to synthetic tree generation using neural networks. Our model can s

Wenhao Gao 60 Dec 29, 2022
Pytorch implementation for "Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion" (NeurIPS 2021)

Density-aware Chamfer Distance This repository contains the official PyTorch implementation of our paper: Density-aware Chamfer Distance as a Comprehe

Tong WU 93 Dec 15, 2022
QueryInst: Parallelly Supervised Mask Query for Instance Segmentation

QueryInst is a simple and effective query based instance segmentation method driven by parallel supervision on dynamic mask heads, which outperforms previous arts in terms of both accuracy and speed.

Hust Visual Learning Team 386 Jan 08, 2023
You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

You can draw the corresponding bounding box into the image and save it according to the result file (txt format) run by the tracker.

Huiyiqianli 42 Dec 06, 2022
Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Mo

Abhinav Kumar 76 Jan 02, 2023
PyTorch implementation of SIFT descriptor

This is an differentiable pytorch implementation of SIFT patch descriptor. It is very slow for describing one patch, but quite fast for batch. It can

Dmytro Mishkin 150 Dec 24, 2022