MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

Overview

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images

Codes for the following paper:

MatryODShka: Real-time 6DoF Video View Synthesis using Multi-Sphere Images
Benjamin Attal, Selena Ling, Aaron Gokaslan, Christian Richardt, James Tompkin
ECCV 2020

High-level overview of approach.

See more at our project page.

If you use these codes, please cite:

@inproceedings{Attal:2020:ECCV,
    author    = "Benjamin Attal and Selena Ling and Aaron Gokaslan and Christian Richardt and James Tompkin",
    title     = "{MatryODShka}: Real-time {6DoF} Video View Synthesis using Multi-Sphere Images",
    booktitle = "European Conference on Computer Vision (ECCV)",
    month     = aug,
    year      = "2020",
    url       = "https://visual.cs.brown.edu/matryodshka"
}

Note that our codes are based on the code from the paper "Stereo Maginification: Learning View Synthesis using Multiplane Images" by Zhou et al. [1], and on the code from the paper "Pixel2mesh: Generating 3D Mesh Models from Single RGB Images." by Wang et al. [3]. Please also cite their work.

Setup

  • Create a conda environment from the matryodshka-gpu.yml file.
  • Run ./download_glob.sh to download the files needed for training and testing.
  • Download the dataset as in Section Replica dataset.

Training the model

See train.py for training the model.

  • To train with transform inverse regularization, use --transform_inverse_reg flag.

  • To train with CoordNet, use --coord_net flag.

  • To experiment with different losses (elpips or l2), use --which_loss flag.

    • To train with spherical weighting on loss maps, use --spherical_attention flag.
  • To train with graph convolution network (GCN), use --gcn flag. Note the particular GCN architecture definition we used is from the Pixel2Mesh repo [3].

  • The current scripts support training on Replica 360 and cubemap dataset and RealEstate10K dataset. Use --input_type to switch between these types of inputs (ODS, PP, REALESTATE_PP).

See scripts/train/*.sh for some sample scripts.

Testing the model

See test.py for testing the model with replica-360 test set.

  • When testing on video frames, e.g. test_video_640x320, include on_video in --test_type flag.
  • When testing on high-resolution images, include high_res in --test_type flag.

See scripts/test/*.sh for sample scripts.

Evaluation

See eval.py for evaluating the model, which saves the metric scores into a json file. We evaluate our models on

  • third-view reconstruction quality

    • See scripts/eval/*-reg.sh for a sample script.
  • frame-to-frame reconstruction differences on video sequences to evaluate the effect of transform inverse regularization on temporal consistency.

    • Include on_video when specifying the --eval_type flag.
    • See scripts/eval/*-video.sh for a sample script.

Pre-trained model

Download models pre-trained with and without transform inverse regularization by running ./download_model.sh. These can also be found here at the Brown library for archival purposes.

Replica dataset

We rendered a 360 and a cubemap dataset for training from the Facebook Replica Dataset [2]. This data can be found here at the Brown library for archival purposes. You should have access to the following datasets.

  • train_640x320
  • test_640x320
  • test_video_640x320

You can also find the camera pose information here that were used to render the training dataset. Each line of the txt fileach line of the txt file is formatted as below:

camera_position_x camera_position_y camera_position_z ods_baseline target1_offset_x target1_offset_y target1_offset_z target2_offset_x target2_offset_y target2_offset_z target3_offset_x target3_offset_y target3_offset_z

We also have a fork of the Replica dataset codebase which can regenerate our data from scratch. This contains customized rendering scripts that allow output of ODS, equirectangular, and cubemap projection spherical imagery, along with corresponding depth maps.

Note that the 360 dataset we release for download was rendered with an incorrect 90-degree camera rotation around the up vector and a horizontal flip. Regenerating the dataset from our released code fork with the customized rendering scripts will not include this coordinate change. The output model performance should be approximately the same.

Exporting the model to ONNX

We export our model to ONNX by firstly converting the checkpoint into a pb file, which then gets converted to an onnx file with the tf2onnx module. See export.py for exporting the model into .pb file.

See scripts/export/model-name.sh for a sample script to run export.py, and scripts/export/pb2onnx.sh for a sample script to run pb-to-onnx conversion.

Unity Application + ONNX to TensorRT Conversion

We are still working on releasing the real-time Unity application and onnx2trt conversion scripts. Please bear with us!

References

[1] Zhou, Tinghui, et al. "Stereo magnification: Learning view synthesis using multiplane images." arXiv preprint arXiv:1805.09817 (2018). https://github.com/google/stereo-magnification

[2] Straub, Julian, et al. "The Replica dataset: A digital replica of indoor spaces." arXiv preprint arXiv:1906.05797 (2019). https://github.com/facebookresearch/Replica-Dataset

[3] Wang, Nanyang, et al. "Pixel2mesh: Generating 3d mesh models from single rgb images." Proceedings of the European Conference on Computer Vision (ECCV). 2018. https://github.com/nywang16/Pixel2Mesh

Owner
Brown University Visual Computing Group
Brown University Visual Computing Group
Graph Neural Networks with Keras and Tensorflow 2.

Welcome to Spektral Spektral is a Python library for graph deep learning, based on the Keras API and TensorFlow 2. The main goal of this project is to

Daniele Grattarola 2.2k Jan 08, 2023
DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

DeOldify - A Deep Learning based project for colorizing and restoring old images (and video!)

Jason Antic 15.8k Jan 04, 2023
Notification Triggers for Python

Notipyer Notification triggers for Python Send async email notifications via Python. Get updates/crashlogs from your scripts with ease. Installation p

Chirag Jain 17 May 16, 2022
An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

deepbci 272 Jan 08, 2023
R3Det based on mmdet 2.19.0

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object Installation # install mmdetection first if you haven't installed it

SJTU-Thinklab-Det 38 Dec 15, 2022
Official PyTorch implementation of U-GAT-IT: Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Image Translation

U-GAT-IT — Official PyTorch Implementation : Unsupervised Generative Attentional Networks with Adaptive Layer-Instance Normalization for Image-to-Imag

Hyeonwoo Kang 2.4k Jan 04, 2023
Course materials for Fall 2021 "CIS6930 Topics in Computing for Data Science" at New College of Florida

Fall 2021 CIS6930 Topics in Computing for Data Science This repository hosts course materials used for a 13-week course "CIS6930 Topics in Computing f

Yoshi Suhara 101 Nov 30, 2022
Multiple style transfer via variational autoencoder

ST-VAE Multiple style transfer via variational autoencoder By Zhi-Song Liu, Vicky Kalogeiton and Marie-Paule Cani This repo only provides simple testi

13 Oct 29, 2022
A simple implementation of Kalman filter in single object tracking

kalman-filter-in-single-object-tracking A simple implementation of Kalman filter in single object tracking https://www.bilibili.com/video/BV1Qf4y1J7D4

130 Dec 26, 2022
A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching.

LPM_Python A Python implementation of the Locality Preserving Matching (LPM) method for pruning outliers in image matching. The code is established ac

AoxiangFan 11 Nov 07, 2022
Technical Analysis library in pandas for backtesting algotrading and quantitative analysis

bta-lib - A pandas based Technical Analysis Library bta-lib is pandas based technical analysis library and part of the backtrader family. Links Main P

DRo 393 Dec 20, 2022
Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Spiking Neural Network for Computer Vision using SpikingJelly framework and Pytorch-Lightning

Sami BARCHID 2 Oct 20, 2022
ICLR 2021, Fair Mixup: Fairness via Interpolation

Fair Mixup: Fairness via Interpolation Training classifiers under fairness constraints such as group fairness, regularizes the disparities of predicti

Ching-Yao Chuang 49 Nov 22, 2022
HandTailor: Towards High-Precision Monocular 3D Hand Recovery

HandTailor This repository is the implementation code and model of the paper "HandTailor: Towards High-Precision Monocular 3D Hand Recovery" (arXiv) G

Lv Jun 113 Jan 06, 2023
Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Diabet Feature Engineering - Predict whether people have diabetes when their characteristics are specified

Şebnem 6 Jan 18, 2022
Official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspective with Transformer"

[AAAI2022] UCTransNet This repo is the official implementation of "UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-wise Perspectiv

Haonan Wang 199 Jan 03, 2023
Tooling for the Common Objects In 3D dataset.

CO3D: Common Objects In 3D This repository contains a set of tools for working with the Common Objects in 3D (CO3D) dataset. Download the dataset The

Facebook Research 724 Jan 06, 2023
Research into Forex price prediction from price history using Deep Sequence Modeling with Stacked LSTMs.

Forex Data Prediction via Recurrent Neural Network Deep Sequence Modeling Research Paper Our research paper can be viewed here Installation Clone the

Alex Taradachuk 2 Aug 07, 2022
This is the official PyTorch implementation for "Mesa: A Memory-saving Training Framework for Transformers".

A Memory-saving Training Framework for Transformers This is the official PyTorch implementation for Mesa: A Memory-saving Training Framework for Trans

Zhuang AI Group 105 Dec 06, 2022
Behavioral "black-box" testing for recommender systems

RecList RecList Free software: MIT license Documentation: https://reclist.readthedocs.io. Overview RecList is an open source library providing behavio

Jacopo Tagliabue 375 Dec 30, 2022