Implementation of the master's thesis "Temporal copying and local hallucination for video inpainting".

Temporal copying and local hallucination for video inpainting

This repository contains the implementation of my master's thesis "Temporal copying and local hallucination for video inpainting". The code is built with PyTorch Lightning; read its documentation for a complete overview of how this repository is structured.

Disclaimer: The version published here might differ slightly from the thesis because of refactoring.

About the data

The thesis uses three different datasets: GOT-10k for the background sequences, YouTube-VOS for realistic mask shapes and DAVIS to test the models with real masked sequences. Some pre-processing steps, which are not published in this repository, have been applied to the data. You can download the exact datasets used in the thesis from this link.

The first step is to clone this repository, install its dependencies and other required system packages:

git clone https://github.com/davidalvarezdlt/master_thesis.git
cd master_thesis
pip install -r requirements.txt

apt-get update
apt-get install libturbojpeg ffmpeg libsm6 libxext6

Unzip the file downloaded from the previous link inside ./data. The resulting folder structure should look like this:

master_thesis/
    data/
        DAVIS-2017/
        GOT10k/
        YouTubeVOS/
    lightning_logs/
    master_thesis/
    .gitignore
    .pre-commit-config.yaml
    LICENSE
    README.md
    requirements.txt
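
Before training, it can help to confirm that the datasets were unzipped where the code expects them. The following is a minimal sketch, not part of the repository: the helper name missing_dataset_dirs is hypothetical, and it only checks for the three folder names listed in the structure above.

```python
from pathlib import Path

# Dataset folder names taken from the folder structure shown above.
EXPECTED_DATASETS = ("DAVIS-2017", "GOT10k", "YouTubeVOS")


def missing_dataset_dirs(data_root="./data"):
    """Return the expected dataset folders that are absent under data_root.

    Hypothetical helper for illustration; it does not inspect the
    contents of each dataset, only whether the folder exists.
    """
    root = Path(data_root)
    return [name for name in EXPECTED_DATASETS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All dataset folders found.")
```

Run it from the repository root so that ./data resolves to the folder shown above.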

Training the Dense Flow Prediction Network (DFPN) model

In short, you can train the model by calling:

python -m master_thesis

You can modify the default parameters of the code by using CLI parameters. Get a complete list of the available parameters by calling:

python -m master_thesis --help

For instance, if we want to train the model using 2 frames, with a batch size of 8 and using one GPU, we would call:

python -m master_thesis --frames_n 2 --batch_size 8 --gpus 1

Every time you train the model, a new folder inside ./lightning_logs will be created. Each folder represents a different version of the model, containing its checkpoints and auxiliary files.
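
Since the testing commands need a checkpoint path, a small helper to locate the newest checkpoint can be convenient. The sketch below is an assumption-laden illustration, not code from the repository: the function name latest_checkpoint is hypothetical, and it assumes checkpoints are saved with the .ckpt extension somewhere under ./lightning_logs, which is the usual PyTorch Lightning convention but may differ in your setup.

```python
from pathlib import Path


def latest_checkpoint(logs_dir="./lightning_logs"):
    """Return the most recently modified .ckpt file under logs_dir, or None.

    Hypothetical helper: assumes checkpoints use the .ckpt extension and
    live anywhere below logs_dir (for example inside per-version folders).
    """
    root = Path(logs_dir)
    if not root.is_dir():
        return None
    checkpoints = list(root.rglob("*.ckpt"))
    if not checkpoints:
        return None
    # Pick the file with the newest modification time.
    return max(checkpoints, key=lambda path: path.stat().st_mtime)


if __name__ == "__main__":
    print(latest_checkpoint())
```

The returned path can then be passed as the value of a checkpoint parameter when testing a model.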

Training the Copy-and-Hallucinate Network (CHN) model

To train the CHN model instead, pass the --chn flag together with the aligner settings:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint>

Where --chn_aligner is the model used to align the frames (either cpn or dfpn) and --chn_aligner_checkpoint is the path to its checkpoint.

You can download the checkpoint of the CPN from its original repository (file named weight.pth).

Testing the Dense Flow Prediction Network (DFPN) model

You can align samples from the test split and store them in TensorBoard by calling:

python -m master_thesis --test --test_checkpoint <test_checkpoint>

Where --test_checkpoint is a valid path to the model checkpoint that should be used.

Testing the Copy-and-Hallucinate Network (CHN) model

You can inpaint test sequences (they will be stored in a folder) using the three algorithms by calling:

python -m master_thesis --chn --chn_aligner <chn_aligner> --chn_aligner_checkpoint <chn_aligner_checkpoint> --test --test_checkpoint <test_checkpoint>

Note that the value of --test_checkpoint must now be a valid path to a CHN checkpoint, while --chn_aligner_checkpoint may be the path to a checkpoint of either CPN or DFPN.
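
Because the CHN commands take several related parameters, it can be handy to assemble them in a script. The sketch below is only an illustration: the function chn_command is hypothetical and not part of the repository, it uses only the parameters described in this README, and the checkpoint paths in the usage note are placeholders.

```python
import sys

# Aligner names accepted by --chn_aligner, as described in this README.
VALID_ALIGNERS = ("cpn", "dfpn")


def chn_command(aligner, aligner_checkpoint, test_checkpoint=None):
    """Build the argument list to train or test the CHN model.

    Hypothetical helper: when test_checkpoint is given, the --test and
    --test_checkpoint parameters are appended; otherwise the command
    trains the CHN model.
    """
    if aligner not in VALID_ALIGNERS:
        raise ValueError(
            f"aligner must be one of {VALID_ALIGNERS}, got {aligner!r}"
        )
    command = [
        sys.executable, "-m", "master_thesis",
        "--chn",
        "--chn_aligner", aligner,
        "--chn_aligner_checkpoint", str(aligner_checkpoint),
    ]
    if test_checkpoint is not None:
        command += ["--test", "--test_checkpoint", str(test_checkpoint)]
    return command
```

The resulting list can be run with subprocess.run(chn_command("cpn", "weight.pth"), check=True), where "weight.pth" stands in for wherever you saved the CPN checkpoint.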

Citation

If you find this thesis useful, please use the following citation:

@thesis{Alvarez2020,
    type = {Master's Thesis},
    author = {David Álvarez de la Torre},
    title = {Temporal copying and local hallucination for video inpainting},
    school = {ETH Zürich},
    year = 2020,
}