[CVPR 2022] Official PyTorch Implementation for "Reference-based Video Super-Resolution Using Multi-Camera Video Triplets"

Overview

Reference-based Video Super-Resolution (RefVSR)
Official PyTorch Implementation of the CVPR 2022 Paper
Project | arXiv | RealMCVSR Dataset
Hugging Face Spaces License CC BY-NC
PWC

This repo contains training and evaluation code for the following paper:

Reference-based Video Super-Resolution Using Multi-Camera Video Triplets
Junyong Lee, Myeonghee Lee, Sunghyun Cho, and Seungyong Lee
POSTECH
IEEE Computer Vision and Pattern Recognition (CVPR) 2022


Getting Started

Prerequisites

Tested environment

Ubuntu Python PyTorch CUDA

1. Environment setup

$ git clone https://github.com/codeslake/RefVSR.git
$ cd RefVSR

$ conda create -y name RefVSR python 3.8 && conda activate RefVSR

# Install pytorch
$ conda install pytorch==1.11.0 torchvision==0.12.0 torchaudio==0.11.0 cudatoolkit=11.3 -c pytorch

# Install requirements
$ ./install/install_cudnn113.sh

It is recommended to install PyTorch >= 1.10.0 with CUDA11.3 for running small models using Pytorch AMP, because PyTorch < 1.10.0 is known to have a problem in running amp with torch.nn.functional.grid_sample() needed for inter-frame alignment.

For the other models, PyTorch 1.8.0 is verified. To install requirements with PyTorch 1.8.0, run ./install/install_cudnn102.sh for CUDA10.2 or ./install/install_cudnn111.sh for CUDA11.1

2. Dataset

Download and unzip the proposed RealMCVSR dataset under [DATA_OFFSET]:

[DATA_OFFSET]
    └── RealMCVSR
        ├── train                       # a training set
        │   ├── HR                      # videos in original resolution 
        │   │   ├── T                   # telephoto videos
        │   │   │   ├── 0002            # a video clip 
        │   │   │   │   ├── 0000.png    # a video frame
        │   │   │   │   └── ...         
        │   │   │   └── ...            
        │   │   ├── UW                  # ultra-wide-angle videos
        │   │   └── W                   # wide-angle videos
        │   ├── LRx2                    # 2x downsampled videos
        │   └── LRx4                    # 4x downsampled videos
        ├── test                        # a testing set
        └── valid                       # a validation set

[DATA_OFFSET] can be modified with --data_offset option in the evaluation script.

3. Pre-trained models

Download pretrained weights (Google Drive | Dropbox) under ./ckpt/:

RefVSR
├── ...
├── ./ckpt
│   ├── edvr.pytorch                    # weights of EDVR modules used for training Ours-IR
│   ├── SPyNet.pytorch                  # weights of SpyNet used for inter-frame alignment
│   ├── RefVSR_small_L1.pytorch         # weights of Ours-small-L1
│   ├── RefVSR_small_MFID.pytorch       # weights of Ours-small
│   ├── RefVSR_small_MFID_8K.pytorch    # weights of Ours-small-8K
│   ├── RefVSR_L1.pytorch               # weights of Ours-L1
│   ├── RefVSR_MFID.pytorch             # weights of Ours
│   ├── RefVSR_MFID_8K.pytorch.pytorch  # weights of Ours-8K
│   ├── RefVSR_IR_MFID.pytorch          # weights of Ours-IR
│   └── RefVSR_IR_L1.pytorch            # weights of Ours-IR-L1
└── ...

For the testing and training of your own model, it is recommended to go through wiki pages for
logging and details of testing and training scripts before running the scripts.

Testing models of CVPR 2022

Evaluation script

CUDA_VISIBLE_DEVICES=0 python -B run.py \
    --mode _RefVSR_MFID_8K \                       # name of the model to evaluate
    --config config_RefVSR_MFID_8K \               # name of the configuration file in ./configs
    --data RealMCVSR \                             # name of the dataset
    --ckpt_abs_name ckpt/RefVSR_MFID_8K.pytorch \  # absolute path for the checkpoint
    --data_offset /data1/junyonglee \              # offset path for the dataset (e.g., [DATA_OFFSET]/RealMCVSR)
    --output_offset ./result                       # offset path for the outputs

Real-world 4x video super-resolution (HD to 8K resolution)

# Evaluating the model 'Ours' (Fig. 8 in the main paper).
$ ./scripts_eval/eval_RefVSR_MFID_8K.sh

# Evaluating the model 'Ours-small'.
$ ./scripts_eval/eval_amp_RefVSR_small_MFID_8K.sh

For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

For the model Ours-small,

  • We use Nvidia GeForce RTX 3090 (24GB) in practice.
  • It is the model Ours-small in Table 2 further trained with the adaptation stage.
  • The model requires PyTorch >= 1.10.0 with CUDA 11.3 for using PyTorch AMP.

Quantitative evaluation (models trained with the pre-training stage)

## Table 2 in the main paper
# Ours
$ ./scripts_eval/eval_RefVSR_MFID.sh

# Ours-l1
$ ./scripts_eval/eval_RefVSR_L1.sh

# Ours-small
$ ./scripts_eval/eval_amp_RefVSR_small_MFID.sh

# Ours-small-l1
$ ./scripts_eval/eval_amp_RefVSR_small_L1.sh

# Ours-IR
$ ./scripts_eval/eval_RefVSR_IR_MFID.sh

# Ours-IR-l1
$ ./scripts_eval/eval_RefVSR_IR_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

To obtain quantitative results measured with the varying FoV ranges as shown in Table 3 of the main paper, modify the script and specify --eval_mode FOV.

Training models with the proposed two-stage training strategy

The pre-training stage (Sec. 4.1)

# To train the model 'Ours':
$ ./scripts_train/train_RefVSR_MFID.sh

# To train the model 'Ours-small':
$ ./scripts_train/train_amp_RefVSR_small_MFID.sh

For both models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

  • We use the total batch size of 4, the multiplication of numbers in options --nproc_per_node and -b.

The adaptation stage (Sec. 4.2)

  1. Set the path of the checkpoint of a model trained with the pre-training stage.
    For the model Ours-small, for example,

    $ vim ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh
    #!/bin/bash
    
    py3clean ./
    CUDA_VISIBLE_DEVICES=0,1 ...
        ...
        -ra [LOG_OFFSET]/RefVSR_CVPR2022/amp_RefVSR_small_MFID/checkpoint/train/epoch/ckpt/amp_RefVSR_small_MFID_00xxx.pytorch
        ...
    

    Checkpoint path is [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/[mode]_00xxx.pytorch.

    • PSNR is recorded in [LOG_OFFSET]/RefVSR_CVPR2022/[mode]/checkpoint/train/epoch/checkpoint.txt.
    • [LOG_OFFSET] can be modified with config.log_offset in ./configs/config.py.
    • [mode] is the name of the model assigned with --mode in the script used for the pre-training stage.
  2. Start the adaptation stage.

    # Training the model 'Ours'.
    $ ./scripts_train/train_RefVSR_MFID_8K.sh
    
    # Training the model 'Ours-small'.
    $ ./scripts_train/train_amp_RefVSR_small_MFID_8K.sh

    For the model Ours, we use Nvidia Quadro 8000 (48GB) in practice.

    For the model Ours-small, we use Nvidia GeForce RTX 3090 (24GB) in practice.

    Be sure to modify the script file to set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

    • We use the total batch size of 2, the multiplication of numbers in options --nproc_per_node and -b.

Training models with L1 loss

# To train the model 'Ours-l1':
$ ./scripts_train/train_RefVSR_L1.sh

# To train the model 'Ours-small-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

# To train the model 'Ours-IR-l1':
$ ./scripts_train/train_amp_RefVSR_small_L1.sh

For all models, we use Nvidia GeForce RTX 3090 (24GB) in practice.

Be sure to modify the script file and set proper GPU devices, number of GPUs, and batch size by modifying CUDA_VISIBLE_DEVICES, --nproc_per_node and -b options, respectively.

  • We use the total batch size of 8, the multiplication of numbers in options --nproc_per_node and -b.

Wiki

Contact

Open an issue for any inquiries. You may also have contact with [email protected]

License

License CC BY-NC

This software is being made available under the terms in the LICENSE file. Any exemptions to these terms require a license from the Pohang University of Science and Technology.

Acknowledgment

We thank the authors of BasicVSR and DCSR for sharing their code.

BibTeX

@InProceedings{Lee2022RefVSR,
    author    = {Junyong Lee and Myeonghee Lee and Sunghyun Cho and Seungyong Lee},
    title     = {Reference-based Video Super-Resolution Using Multi-Camera Video Triplets},
    booktitle = {Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    year      = {2022}
}
Owner
Junyong Lee
Ph.D. candidate at POSTECH
Junyong Lee
Repository of continual learning papers

Continual learning paper repository This repository contains an incomplete (but dynamically updated) list of papers exploring continual learning in ma

29 Jan 05, 2023
Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images"

GANInversion_with_ConsecutiveImgs Official code for our ICCV paper: "From Continuity to Editability: Inverting GANs with Consecutive Images" https://a

QingyangXu 38 Dec 07, 2022
Learning nonlinear operators via DeepONet

DeepONet: Learning nonlinear operators The source code for the paper Learning nonlinear operators via DeepONet based on the universal approximation th

Lu Lu 239 Jan 02, 2023
Learning to Self-Train for Semi-Supervised Few-Shot

Learning to Self-Train for Semi-Supervised Few-Shot Classification This repository contains the TensorFlow implementation for NeurIPS 2019 Paper "Lear

86 Dec 29, 2022
QuALITY: Question Answering with Long Input Texts, Yes!

QuALITY: Question Answering with Long Input Texts, Yes! Authors: Richard Yuanzhe Pang,* Alicia Parrish,* Nitish Joshi,* Nikita Nangia, Jason Phang, An

ML² AT CILVR 61 Jan 02, 2023
Create time-series datacubes for supervised machine learning with ICEYE SAR images.

ICEcube is a Python library intended to help organize SAR images and annotations for supervised machine learning applications. The library generates m

ICEYE Ltd 65 Jan 03, 2023
Code for BMVC2021 "MOS: A Low Latency and Lightweight Framework for Face Detection, Landmark Localization, and Head Pose Estimation"

MOS-Multi-Task-Face-Detect Introduction This repo is the official implementation of "MOS: A Low Latency and Lightweight Framework for Face Detection,

104 Dec 08, 2022
Most popular metrics used to evaluate object detection algorithms.

Most popular metrics used to evaluate object detection algorithms.

Rafael Padilla 4.4k Dec 25, 2022
Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pretrained SOTA Deep Learning models, callbacks and more for research and production with PyTorch Lightning and PyTorch

Pytorch Lightning 1.4k Jan 01, 2023
Code for the paper "Adversarial Generator-Encoder Networks"

This repository contains code for the paper "Adversarial Generator-Encoder Networks" (AAAI'18) by Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky. Pr

Dmitry Ulyanov 279 Jun 26, 2022
A library for uncertainty quantification based on PyTorch

Torchuq [logo here] TorchUQ is an extensive library for uncertainty quantification (UQ) based on pytorch. TorchUQ currently supports 10 representation

TorchUQ 96 Dec 12, 2022
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is

Federico Galatolo 172 Dec 22, 2022
An NLP library with Awesome pre-trained Transformer models and easy-to-use interface, supporting wide-range of NLP tasks from research to industrial applications.

简体中文 | English News [2021-10-12] PaddleNLP 2.1版本已发布!新增开箱即用的NLP任务能力、Prompt Tuning应用示例与生成任务的高性能推理! 🎉 更多详细升级信息请查看Release Note。 [2021-08-22]《千言:面向事实一致性的生

6.9k Jan 01, 2023
This is a repository with the code for the ACL 2019 paper

The Story of Heads This is the official repo for the following papers: (ACL 2019) Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy

231 Nov 15, 2022
This app is a simple example of using Strealit to create a financial data web app.

Streamlit Demo: Finance Chart This app is a simple example of using Streamlit to create a financial data web app. This demo use streamlit, pandas and

91 Jan 02, 2023
Some code of the implements of Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network

3D-GMPDCNN Geological Modeling Using 3D Pixel-Adaptive and Deformable Convolutional Neural Network PyTorch implementation of "Geological Modeling Usin

5 Nov 21, 2022
Collection of machine learning related notebooks to share.

ML_Notebooks Collection of machine learning related notebooks to share. Notebooks GAN_distributed_training.ipynb In this Notebook, TensorFlow's tutori

Sascha Kirch 14 Dec 22, 2022
Deep Learning Emotion decoding using EEG data from Autism individuals

Deep Learning Emotion decoding using EEG data from Autism individuals This repository includes the python and matlab codes using for processing EEG 2D

Juan Manuel Mayor Torres 12 Dec 08, 2022
Create Data & AI apps in 20 lines of code with Shimoku

Install with: pip install shimoku-api-python Start with: from os import getenv import shimoku_api_python.client as Shimoku

Shimoku 5 Nov 07, 2022
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 08, 2023