Repository for "Space-Time Correspondence as a Contrastive Random Walk" (NeurIPS 2020)

Last update: Dec 27, 2022

Space-Time Correspondence as a Contrastive Random Walk

This is the repository for Space-Time Correspondence as a Contrastive Random Walk, published at NeurIPS 2020.

[Paper] [Project Page] [Slides] [Poster] [Talk]

@inproceedings{jabri2020walk,
    Author = {Allan Jabri and Andrew Owens and Alexei A. Efros},
    Title = {Space-Time Correspondence as a Contrastive Random Walk},
    Booktitle = {Advances in Neural Information Processing Systems},
    Year = {2020},
}

Consider citing our work or acknowledging this repository if you found this code to be helpful :)

Requirements

pytorch (>1.3)
torchvision (0.6.0)
cv2
matplotlib
skimage
imageio

For visualization (--visualize):

wandb
visdom
sklearn

Train

An example training command is:

python -W ignore train.py --data-path /path/to/kinetics/ \
--frame-aug grid --dropout 0.1 --clip-len 4 --temp 0.05 \
--model-type scratch --workers 16 --batch-size 20  \
--cache-dataset --data-parallel --visualize --lr 0.0001

This yields a model with performance on DAVIS as follows (see below for evaluation instructions), provided as pretrained.pth:

 J&F-Mean    J-Mean  J-Recall  J-Decay    F-Mean  F-Recall   F-Decay
  0.67606  0.645902  0.758043   0.2031  0.706219   0.83221  0.246789
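
As a quick sanity check on the table above, the J&F-Mean column matches the arithmetic mean of J-Mean and F-Mean, which is how the DAVIS 2017 benchmark defines its headline score. A minimal sketch, with the values copied from the table; the helper name `jf_mean` is ours and is not part of this repository:

```python
def jf_mean(j_mean: float, f_mean: float) -> float:
    """Average of region similarity (J) and contour accuracy (F)."""
    return (j_mean + f_mean) / 2.0


# Values copied from the results table above.
J_MEAN = 0.645902
F_MEAN = 0.706219
REPORTED_JF_MEAN = 0.67606

computed = jf_mean(J_MEAN, F_MEAN)
# The reported value is rounded, so compare with a small tolerance.
matches = abs(computed - REPORTED_JF_MEAN) < 1e-5
print(f"computed J&F-Mean: {computed:.5f} (matches table: {matches})")
```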

Arguments of interest:

--dropout: The rate of edge dropout (default 0.1).
--clip-len: Length of video sequence.
--temp: Softmax temperature.
--model-type: Type of encoder. Use scratch or scratch_zeropad if training from scratch. Use imagenet18 to load an ImageNet-pretrained network. Use scratch with --resume if reloading a checkpoint.
--batch-size: I've managed to train models with batch sizes between 6 and 24. If you can afford a larger batch size, consider increasing the --lr from 0.0001 to 0.0003.
--frame-aug: grid samples a grid of patches to get nodes; none will just use a single image and use embeddings in the feature map as nodes.
--visualize: Log diagnostics to wandb and data visualizations to visdom.
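
To build intuition for --temp and --dropout, here is a rough sketch of how one row of a random-walk transition matrix could be formed: edges are randomly masked (edge dropout), and the surviving similarities go through a temperature-scaled softmax. This is an illustration in plain Python under our own assumptions, not the code in train.py; the function `transition_row` and its argument names are hypothetical.

```python
import math
import random


def transition_row(similarities, temp=0.05, dropout=0.1, rng=None):
    """Turn one row of pairwise similarities into transition probabilities."""
    rng = rng or random.Random(0)
    # Edge dropout: each edge is masked independently with probability `dropout`.
    kept = [rng.random() >= dropout for _ in similarities]
    if not any(kept):
        kept = [True] * len(similarities)  # assumption: never drop every edge
    # Temperature-scaled softmax over the surviving edges.
    logits = [s / temp for s in similarities]
    peak = max(l for l, k in zip(logits, kept) if k)  # for numerical stability
    weights = [math.exp(l - peak) if k else 0.0 for l, k in zip(logits, kept)]
    total = sum(weights)
    return [w / total for w in weights]


sims = [0.9, 0.8, 0.1]
sharp = transition_row(sims, temp=0.05, dropout=0.0)
smooth = transition_row(sims, temp=1.0, dropout=0.0)
print("temp=0.05:", [round(p, 3) for p in sharp])
print("temp=1.0: ", [round(p, 3) for p in smooth])
```

A lower temperature concentrates probability on the most similar node, while a temperature near 1 spreads it more evenly across nodes.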

Data

We use the official torchvision.datasets.Kinetics400 class for training. You can find directions for downloading Kinetics here. In particular, the code expects the path given for kinetics to contain a train_256 subdirectory.

You can also point --data-path at a file containing a list of image directories, or at a directory of image directories. In this case, clips are randomly subsampled from the directory.
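
If your frames are already extracted into one directory per video, a small script can produce such a list. The sketch below assumes the list file holds one directory path per line, which this README does not specify, so check the data loading code before relying on it. The helper names and the set of image suffixes are hypothetical.

```python
from pathlib import Path

# Assumption: frames are stored as common image files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def list_frame_dirs(root):
    """Return sorted subdirectories of `root` that contain at least one image."""
    root = Path(root)
    return sorted(
        d for d in root.iterdir()
        if d.is_dir()
        and any(f.suffix.lower() in IMAGE_SUFFIXES for f in d.iterdir())
    )


def write_dir_list(root, out_file):
    """Write one frame directory per line (assumed format); return the count."""
    dirs = list_frame_dirs(root)
    Path(out_file).write_text("".join(f"{d}\n" for d in dirs))
    return len(dirs)
```

For example, `write_dir_list("/path/to/frames", "dirs.txt")` would write the list to `dirs.txt`, which could then be passed as --data-path.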

Visualization

By default, the training script will log diagnostics to wandb and data visualizations to visdom.

Pretrained Model

You can find the model resulting from the training command above at pretrained.pth. We are still training updated ablation models and will post them when ready.

Evaluation: Label Propagation

The label propagation algorithm is described in test.py. The output of test.py (predicted label maps) must be post-processed for evaluation.

DAVIS

To evaluate a trained model on the DAVIS task, clone the davis2017-evaluation repository, and prepare the data by downloading the 2017 dataset and modifying the paths provided in eval/davis_vallist.txt. Then, run:

Label Propagation:

python test.py --filelist /path/to/davis/vallist.txt \
--model-type scratch --resume ../pretrained.pth --save-path /save/path \
--topk 10 --videoLen 20 --radius 12  --temperature 0.05  --cropSize -1

Though test.py expects a model file created with train.py, it can easily be modified to be used with other networks. Note that we simply use the same temperature used at training time.
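
For readers adapting the evaluation to other networks, the core idea behind the --topk and --temperature options can be sketched as follows: each query node takes the labels of its k most similar context nodes, weighted by a temperature-scaled softmax over their affinities. This is a simplified plain-Python illustration, not the implementation in test.py; the function `propagate_label` is hypothetical, and the --radius and --videoLen options are not modelled here.

```python
import math


def propagate_label(query, context_feats, context_labels, topk=10, temp=0.05):
    """Predict a soft label for `query` from labeled context nodes.

    query: feature vector (list of floats), assumed L2-normalized.
    context_feats: feature vectors of nodes from earlier frames.
    context_labels: one label distribution (list of floats) per context node.
    """
    # Affinity between the query and each context node (dot product).
    sims = [sum(q * c for q, c in zip(query, feat)) for feat in context_feats]
    # Keep only the top-k most similar context nodes.
    keep = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:topk]
    # Temperature-scaled softmax over the kept affinities.
    peak = max(sims[i] for i in keep)
    weights = [math.exp((sims[i] - peak) / temp) for i in keep]
    total = sum(weights)
    # Weighted average of the neighbours' label distributions.
    num_classes = len(context_labels[0])
    return [
        sum(w * context_labels[i][c] for w, i in zip(weights, keep)) / total
        for c in range(num_classes)
    ]


feats = [[1.0, 0.0], [0.0, 1.0]]
labels = [[1.0, 0.0], [0.0, 1.0]]
pred = propagate_label([0.8, 0.6], feats, labels, topk=2)
print([round(p, 3) for p in pred])
```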

You can also run the ImageNet baseline with the command below.

python test.py --filelist /path/to/davis/vallist.txt \
--model-type imagenet18 --save-path /save/path \
--topk 10 --videoLen 20 --radius 12  --temperature 0.05  --cropSize -1

Post-Process:

# Convert
python eval/convert_davis.py --in_folder /save/path/ --out_folder /converted/path --dataset /davis/path/

# Compute metrics
python /path/to/davis2017-evaluation/evaluation_method.py \
--task semi-supervised   --results_path /converted/path --set val \
--davis_path /path/to/davis/

You can generate the above commands with the script below, where removing --dryrun will actually run them in sequence.

python eval/run_test.py --model-path /path/to/model --L 20 --K 10  --T 0.05 --cropSize -1 --dryrun
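
To make the order of the three steps explicit, the sketch below assembles the commands shown above as argument lists and prints them without running anything. It is a hypothetical helper, not a reproduction of eval/run_test.py; the function name, its parameters, and the placeholder paths are ours, while the flags are copied from the commands in this README.

```python
import shlex


def build_davis_commands(model_path, filelist, save_path, converted_path,
                         davis_path, eval_repo, video_len=20, topk=10,
                         temperature=0.05, crop_size=-1):
    """Return the three evaluation steps as argument lists, without running them."""
    propagate = [
        "python", "test.py", "--filelist", filelist,
        "--model-type", "scratch", "--resume", model_path,
        "--save-path", save_path,
        "--topk", str(topk), "--videoLen", str(video_len),
        "--radius", "12", "--temperature", str(temperature),
        "--cropSize", str(crop_size),
    ]
    convert = [
        "python", "eval/convert_davis.py", "--in_folder", save_path,
        "--out_folder", converted_path, "--dataset", davis_path,
    ]
    evaluate = [
        "python", f"{eval_repo}/evaluation_method.py",
        "--task", "semi-supervised", "--results_path", converted_path,
        "--set", "val", "--davis_path", davis_path,
    ]
    return [propagate, convert, evaluate]


commands = build_davis_commands(
    model_path="../pretrained.pth",
    filelist="/path/to/davis/vallist.txt",
    save_path="/save/path",
    converted_path="/converted/path",
    davis_path="/path/to/davis/",
    eval_repo="/path/to/davis2017-evaluation",
)
for cmd in commands:
    print(shlex.join(cmd))  # dry run: print only
```

Each list could be passed to `subprocess.run(cmd, check=True)` to execute the steps in sequence.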

Test-time Adaptation

To do.

Owner

A. Jabri
