A fast model to compute optical flow between two input images.

Related tags

Deep LearningDCVNet
Overview

DCVNet: Dilated Cost Volumes for Fast Optical Flow

This repository contains our implementation of the paper:

@InProceedings{jiang2021dcvnet,
  title={DCVNet: Dilated Cost Volumes for Fast Optical Flow},
  author={Jiang, Huaizu and Learned-Miller, Erik},
  booktitle={arXiv},
  year={2021}
}

Need a fast optical flow model? Try DCVNet

  • Fast. On a mid-end GTX 1080ti GPU, DCVNet runs in real time at 71 fps (frames-per-second) to process images with sizes of 1024 × 436.
  • Compact and accurate. DCVNet has 4.94M parameters and consumes 1.68GB GPU memory during inference. It achieves comparable accuracy to state-of-the-art approaches on the MPI Sintel benchmark.

In the figure above, for each model, the circle radius indicates the number of parameters (larger radius means more parameters). The center of a circle corresponds to a model’s EPE (end-point-error).

Requirements

This code has been tested with Python 3.7, PyTorch 1.6.0, and CUDA 9.2. We suggest to use a conda environment.

conda create -n dcvnet
conda activate dcvnet
conda install pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.1 matplotlib tensorboardX scipy opencv -c pytorch
pip install yacs

We use an open-source implementation https://github.com/ClementPinard/Pytorch-Correlation-extension to compute dilated cost volumes. Follow the instructions there to install this module.

Demos

Pretrained models can be downloaded by running

./scripts/download_models.sh

or downloaded from Google drive.

You can demo a pre-trained model on a sequence of frames

python demo.py --weights-path pretrained_models/sceneflow_dcvnet.pth --path demo-frames

Required data

The following datasets are required to train and evaluate DCVNet.

We borrow the data loaders used in RAFT. By default, dcvnet/data/raft/datasets.py will search for the datasets in these locations. You can create symbolic links to wherever the datasets were downloaded in the datasets folder

|-- datasets
    |-- Driving
        |-- frames_cleanpass
        |-- optical_flow
    |-- FlyingThings3D_subset
        |-- train
            |-- flow
            |-- image_clean
        |-- val
            |-- flow
            |-- image_clean
    |-- Monkaa
        |-- frames_cleanpass
        |-- optical_flow
    |-- MPI_Sintel
        |-- test
        |-- training
    |-- KITTI2012
        |-- testing
        |-- training
    |-- KITTI2015
        |-- testing
        |-- training
    |-- HD1K
        |-- hd1k_flow_gt
        |-- hd1k_input

Evaluation

You can evaluate a pre-trained model using tools/evaluate_optical_flow.py

python evaluate_optical_flow.py --weights_path models/dcvnet-sceneflow.pth --dataset sintel

You can optionally add the --amp switch to do inference in mixed precision to reduce GPU memory usage.

Training

We used 8 GTX 1080ti GPUs for training. Training logs will be written to the output folder, which can be visualized using tensorboard.

# train on the synthetic scene flow dataset
python tools/train_optical_flow.py --config-file configs/sceneflow_dcvnet.yaml 

# fine-tune it on the MPI-Sintel dataset
# 4 GPUs are sufficient, but here we use 8 GPUs for fast training
python tools/train_optical_flow.py --config-file configs/sintel_dcvnet.yaml --pretrain-weights output/SceneFlow/sceneflow_dcvnet/default/train_epoch_50.pth

# fine-tune it on the KITTI 2012 and 2015 dataset
# we only use 6 GPUs (3 GPUs are sufficient) since the batch size is 6
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5 python tools/train_optical_flow.py --config-file configs/kitti12+15_dcvnet.yaml --pretrain-weights output/Sintel+SceneFlow/sintel_dcvnet/default/train_epoch_5.pth

Note on the inference speed

In the main branch, the computation of the dilated cost volumes can be further optimized without using the for loop. Checkout the efficient branch for details. If you are interested in testing the inference speed, we suggest to switch to the efficient branch.

git checkout efficient
CUDA_VISIBLE_DEVICES=0 python tools/evaluate_optical_flow.py --dry-run

We haven't fixed this problem because our pre-trained models are based on the implementation in the main branch, which are not compatible with the resizing in the efficient branch. We need to re-train all our models. It will be fixed soon.

To-do

  • Fix the problem of efficient cost volume computation.
  • Train the model on the AutoFlow dataset.

Acknowledgment

Our implementation is built on top of RAFT, Pytorch-Correlation-extension, yacs, Detectron2, and semseg. We thank the authors for releasing and maintaining the code.

Owner
Huaizu Jiang
Assistant Professor at Northeastern University.
Huaizu Jiang
Monitora la qualità della ricezione dei segnali radio nelle province siciliane.

FMap-server Monitora la qualità della ricezione dei segnali radio nelle province siciliane. Conversion data Frequency - StationName maps are stored in

Triglie 5 May 24, 2021
Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization (Salesforce Research) This is a PyTorch implementation of the CoMatch paper [B

Salesforce 107 Dec 14, 2022
Converts geometry node attributes to built-in attributes

Attribute Converter Simplifies converting attributes created by geometry nodes to built-in attributes like UVs or vertex colors, as a single click ope

Ivan Notaros 12 Dec 22, 2022
Github project for Attention-guided Temporal Coherent Video Object Matting.

Attention-guided Temporal Coherent Video Object Matting This is the Github project for our paper Attention-guided Temporal Coherent Video Object Matti

71 Dec 19, 2022
No Code AI/ML platform

NoCodeAIML No Code AI/ML platform - Community Edition Video credits: Uday Kiran Typical No Code AI/ML Platform will have features like drag and drop,

Bhagvan Kommadi 5 Jan 28, 2022
Finetune SSL models for MOS prediction

Finetune SSL models for MOS prediction This is code for our paper under review for ICASSP 2022: "Generalization Ability of MOS Prediction Networks" Er

Yamagishi and Echizen Laboratories, National Institute of Informatics 32 Nov 22, 2022
Official implementation of Representer Point Selection via Local Jacobian Expansion for Post-hoc Classifier Explanation of Deep Neural Networks and Ensemble Models at NeurIPS 2021

Representer Point Selection via Local Jacobian Expansion for Classifier Explanation of Deep Neural Networks and Ensemble Models This repository is the

Yi(Amy) Sui 2 Dec 01, 2021
RealTime Emotion Recognizer for Machine Learning Study Jam's demo

Emotion recognizer Table of contents Clone project Dataset Install dependencies Main program Demo 1. Clone project git clone https://github.com/GDSC20

Google Developer Student Club - UIT 1 Oct 05, 2021
Code for paper "Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs"

This is the codebase for the paper: Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs Directory Structur

Peter Hase 19 Aug 21, 2022
NOMAD - A blackbox optimization software

################################################################################### #

Blackbox Optimization 78 Dec 29, 2022
Pytorch implementation of AREL

Status: Archive (code is provided as-is, no updates expected) Agent-Temporal Attention for Reward Redistribution in Episodic Multi-Agent Reinforcement

8 Nov 25, 2022
Details about the wide minima density hypothesis and metrics to compute width of a minima

wide-minima-density-hypothesis Details about the wide minima density hypothesis and metrics to compute width of a minima This repo presents the wide m

Nikhil Iyer 9 Dec 27, 2022
A torch implementation of "Pixel-Level Domain Transfer"

Pixel Level Domain Transfer A torch implementation of "Pixel-Level Domain Transfer". based on dcgan.torch. Dataset The dataset used is "LookBook", fro

Fei Xia 260 Sep 02, 2022
Unofficial & improved implementation of NeRF--: Neural Radiance Fields Without Known Camera Parameters

[Unofficial code-base] NeRF--: Neural Radiance Fields Without Known Camera Parameters [ Project | Paper | Official code base ] ⬅️ Thanks the original

Jianfei Guo 239 Dec 22, 2022
ICLR21 Tent: Fully Test-Time Adaptation by Entropy Minimization

⛺️ Tent: Fully Test-Time Adaptation by Entropy Minimization This is the official project repository for Tent: Fully-Test Time Adaptation by Entropy Mi

Dequan Wang 204 Dec 25, 2022
Python tools for 3D face: 3DMM, Mesh processing(transform, camera, light, render), 3D face representations.

face3d: Python tools for processing 3D face Introduction This project implements some basic functions related to 3D faces. You can use this to process

Yao Feng 2.3k Dec 30, 2022
🏅 Top 5% in 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지

AI_SPARK_CHALLENG_Object_Detection 제2회 연구개발특구 인공지능 경진대회 AI SPARK 챌린지 🏅 Top 5% in mAP(0.75) (443명 중 13등, mAP: 0.98116) 대회 설명 Edge 환경에서의 가축 Object Dete

3 Sep 19, 2022
Virtual Dance Reality Stage is a feature that offers you to share a stage with another user virtually.

Virtual Dance Reality Stage is a feature that offers you to share a stage with another user virtually. It uses the concept of Image Background Removal using DeepLab Architecture (based on Semantic Se

Devashi Choudhary 5 Aug 24, 2022
Harmonic Memory Networks for Graph Completion

HMemNetworks Code and documentation for Harmonic Memory Networks, a series of models for compositionally assembling representations of graph elements

mlalisse 0 Oct 27, 2021
InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing

InterFaceGAN - Interpreting the Latent Space of GANs for Semantic Face Editing Figure: High-quality facial attributes editing results with InterFaceGA

GenForce: May Generative Force Be with You 1.3k Jan 09, 2023