Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Overview

Temporal copying and local hallucination for video inpainting

This repository contains the implementation of my master's thesis "Temporal copying and local hallucination for video inpainting". The code has been built using PyTorch Lightning, read its documentation to get a complete overview of how this repository is structured.

Disclaimer: The version published here might contain small differences with the thesis because of the refactoring.

About the data

The thesis uses three different datasets: GOT-10k for the background sequences, YouTube-VOS for realistic mask shapes and DAVIS to test the models with real masked sequences. Some pre-processing steps, which are not published in this repository, have been applied to the data. You can download the exact datasets used in the paper from this link.

The first step is to clone this repository, install its dependencies and other required system packages:

git clone https://github.com/davidalvarezdlt/master_thesis.git
cd master_thesis
pip install -r requirements.txt

apt-get update
apt-get install libturbojpeg ffmpeg libsm6 libxext6

Unzip the file downloaded from the previous link inside ./data. The resulting folder structure should look like this:

master_thesis/
    data/
        DAVIS-2017/
        GOT10k/
        YouTubeVOS/
    lightning_logs/
    master_thesis/
    .gitignore
    .pre-commit-config.yaml
    LICENSE
    README.md
    requirements.txt

Training the Dense Flow Prediction Network (DFPN) model

In short, you can train the model by calling:

python -m master_thesis

You can modify the default parameters of the code by using CLI parameters. Get a complete list of the available parameters by calling:

python -m master_thesis --help

For instance, if we want to train the model using 2 frames, with a batch size of 8 and using one GPUs, we would call:

python -m master_thesis --frames_n 2 --batch_size 8 --gpus 1

Every time you train the model, a new folder inside ./lightning_logs will be created. Each folder represents a different version of the model, containing its checkpoints and auxiliary files.

Training the Copy-and-Hallucinate Network (CHN) model

In this case, you will need to specify that you want to train the CHN model. To do so:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint>

Where --chn_aligner is the model used to align the frames (either cpn or dfpn) and --chn_aligner_checkpoint is the path to its checkpoint.

You can download the checkpoint of the CPN from its original repository (file named weight.pth).

Testing the Dense Flow Prediction Network (DFPN) model

You can align samples from the test split and store them in TensorBoard by calling:

python -m samplernn_pase --test --test_checkpoint <test_checkpoint>

Where --test_checkpoint is a valid path to the model checkpoint that should be used.

Testing the Copy-and-Hallucinate Network (CHN) model

You can inpaint test sequences (they will be stored in a folder) using the three algorithms by calling:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint> --test --test_checkpoint <test_checkpoint>

Notice that now the value of --test_checkpoint must be a valid path to a CHN checkpoint, while --chn_aligner_checkpoint might be the path to a checkpoint of either CPN or DFPN.

Citation

If you find this thesis useful, please use the following citation:

@thesis{Alvarez2020,
    type = {Master's Thesis},
    author = {David Álvarez de la Torre},
    title = {Temporal copying and local hallucination for video onpainting},
    school = {ETH Zürich},
    year = 2020,
}
Owner
David Álvarez de la Torre
Founder of @lemonplot. Alumni of UPC and ETH.
David Álvarez de la Torre
Official implementation for paper: A Latent Transformer for Disentangled Face Editing in Images and Videos.

A Latent Transformer for Disentangled Face Editing in Images and Videos Official implementation for paper: A Latent Transformer for Disentangled Face

InterDigital 108 Dec 09, 2022
Styleformer - Official Pytorch Implementation

Styleformer -- Official PyTorch implementation Styleformer: Transformer based Generative Adversarial Networks with Style Vector(https://arxiv.org/abs/

Jeeseung Park 159 Dec 12, 2022
Recurrent Scale Approximation (RSA) for Object Detection

Recurrent Scale Approximation (RSA) for Object Detection Codebase for Recurrent Scale Approximation for Object Detection in CNN published at ICCV 2017

Yu Liu (Louis) 239 Dec 28, 2022
Train a deep learning net with OpenStreetMap features and satellite imagery.

DeepOSM Classify roads and features in satellite imagery, by training neural networks with OpenStreetMap (OSM) data. DeepOSM can: Download a chunk of

TrailBehind, Inc. 1.3k Nov 24, 2022
A Unified Framework and Analysis for Structured Knowledge Grounding

UnifiedSKG 📚 : Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models Code for paper UnifiedSKG: Unifying and Mu

HKU NLP Group 370 Dec 21, 2022
A texturizer that I just made. Nothing special here.

texturizer This is a little project that I did with an hour's time. It texturizes an image given a image and a texture to texturize it with. There is

1 Nov 11, 2021
[v1 (ISBI'21) + v2] MedMNIST: A Large-Scale Lightweight Benchmark for 2D and 3D Biomedical Image Classification

MedMNIST Project (Website) | Dataset (Zenodo) | Paper (arXiv) | MedMNIST v1 (ISBI'21) Jiancheng Yang, Rui Shi, Donglai Wei, Zequan Liu, Lin Zhao, Bili

683 Dec 28, 2022
This is an unofficial PyTorch implementation of Meta Pseudo Labels

This is an unofficial PyTorch implementation of Meta Pseudo Labels. The official Tensorflow implementation is here.

Jungdae Kim 320 Jan 08, 2023
Simulation of Self Driving Car

In this repository, the code to use Udacity's self driving car simulator as a testbed for training an autonomous car are provided.

Shyam Das Shrestha 1 Nov 21, 2021
Learning Lightweight Low-Light Enhancement Network using Pseudo Well-Exposed Images

Learning Lightweight Low-Light Enhancement Network using Pseudo Well-Exposed Images This repository contains the implementation of the following paper

Seonggwan Ko 9 Jul 30, 2022
[NeurIPS 2021] “Improving Contrastive Learning on Imbalanced Data via Open-World Sampling”,

Improving Contrastive Learning on Imbalanced Data via Open-World Sampling Introduction Contrastive learning approaches have achieved great success in

VITA 24 Dec 17, 2022
Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

THUNLP 75 Nov 02, 2022
Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation

Grad2Task: Improved Few-shot Text Classification Using Gradients for Task Representation Prerequisites This repo is built upon a local copy of transfo

Jixuan Wang 10 Sep 28, 2022
This is the official repository of the paper Stocastic bandits with groups of similar arms (NeurIPS 2021). It contains the code that was used to compute the figures and experiments of the paper.

Experiments How to reproduce experimental results of Stochastic bandits with groups of similar arms submitted paper ? Section 5 of the paper To reprod

Fabien 0 Oct 25, 2021
CCP dataset from Clothing Co-Parsing by Joint Image Segmentation and Labeling

Clothing Co-Parsing (CCP) Dataset Clothing Co-Parsing (CCP) dataset is a new clothing database including elaborately annotated clothing items. 2, 098

Wei Yang 434 Dec 24, 2022
Time Dependent DFT in Tamm-Dancoff Approximation

Density Function Theory Program - kspy-tddft(tda) This is an implementation of Time-Dependent Density Functional Theory(TDDFT) using the Tamm-Dancoff

Peter Borthwick 2 Nov 17, 2022
Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization (Salesforce Research) This is a PyTorch implementation of the CoMatch paper [B

Salesforce 107 Dec 14, 2022
Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CVPR 2021)

Semi-supervised Semantic Segmentation with Directional Context-aware Consistency (CAC) Xin Lai*, Zhuotao Tian*, Li Jiang, Shu Liu, Hengshuang Zhao, Li

Jia Research Lab 137 Dec 14, 2022
Pytorch ImageNet1k Loader with Bounding Boxes.

ImageNet 1K Bounding Boxes For some experiments, you might wanna pass only the background of imagenet images vs passing only the foreground. Here, I'v

Amin Ghiasi 11 Oct 15, 2022
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022