Deep Illuminator is a data augmentation tool designed for image relighting. It can be used to easily and efficiently generate a wide range of illumination variants of a single image.

Overview

Deep Illuminator

Deep Illuminator is a data augmentation tool designed for image relighting. It can be used to easily and efficiently generate a wide range of illumination variants of a single image. It has been tested with several datasets and models and has been shown to succesfully improve performance. It has a built in visualizer created with Streamlit to preview how the target image can be relit. This tool has an accompanying paper.

Example Augmentations

Usage

The simplest method to use this tool is through Docker Hub:

docker pull kartvel/deep-illuminator

Visualizer

Once you have the Deep Illuminator image run the following command to launch the visualizer:

docker run -it --rm  --gpus all \
-p 8501:8501 --entrypoint streamlit \ 
kartvel/deep-illuminator run streamlit/streamlit_app.py

You will be able to interact with it on localhost:8501. Note: If you do not have NVIDIA gpu support enabled for docker simply remove the --gpus all option.

Generating Variants

It is possible to quickly generate multiple variants for images contained in a directory by using the following command:

docker run -it --rm --gpus all \                                                                                               ─╯
-v /path/to/input/images:/app/probe_relighting/originals \
-v /path/to/save/directory:/app/probe_relighting/output \
kartvel/deep-illuminator --[options]

Options

Option Values Description
mode ['synthetic', 'mid'] Selecting the style of probes used as a relighting guide.
step int Increment for the granularity of relighted images. max mid: 24, max synthetic: 360

Buidling Docker image or running without a container

Please read the following for other options: instructions

Benchmarks

Improved performance of R2D2 for [email protected] on HPatches

Training Dataset Overall Viewpoint Illumination
COCO - Original 71.0 65.4 77.1
COCO - Augmented 72.2 (+1.7%) 65.7 (+0.4%) 79.2 (+2.7%)
VIDIT - Original 66.7 60.5 73.4
VIDIT - Augmented 69.2 (+3.8%) 60.9 (+0.6%) 78.1 (+6.4%)
Aachen - Original 69.4 64.1 75.0
Aachen - Augmented 72.6 (+4.6%) 66.1 (+3.1%) 79.6 (+6.1%)

Improved performance of R2D2 for the Long-Term Visual Localization challenge on Aachen v1.1

Training Dataset 0.25m, 2° 0.5m, 5° 5m, 10°
COCO - Original 62.3 77.0 79.5
COCO - Augmented 65.4 (+5.0%) 83.8 (+8.8%) 92.7 (+16%)
VIDIT - Original 40.8 53.4 61.3
VIDIT - Augmented 53.9 (+32%) 71.2 (+33%) 83.2(+36%)
Aachen - Original 60.7 72.8 83.8
Aachen - Augmented 63.4 (+4.4%) 81.7 (+12%) 92.1 (+9.9%)

Acknowledgment

The developpement of the VAE for the visualizer was made possible by the PyTorch-VAE repository.

Bibtex

If you use this code in your project, please consider citing the following paper:

@misc{chogovadze2021controllable,
      title={Controllable Data Augmentation Through Deep Relighting}, 
      author={George Chogovadze and Rémi Pautrat and Marc Pollefeys},
      year={2021},
      eprint={2110.13996},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Shitty gaze mouse controller

demo.mp4 shitty_gaze_mouse_cotroller install tensofflow, cv2 run the main.py and as it starts it will collect data so first raise your left eyebrow(bo

16 Aug 30, 2022
MAGMA - a GPT-style multimodal model that can understand any combination of images and language

MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning Authors repo (alphabetical) Constantin (CoEich), Mayukh (Mayukh

Aleph Alpha GmbH 331 Jan 03, 2023
MassiveSumm: a very large-scale, very multilingual, news summarisation dataset

MassiveSumm: a very large-scale, very multilingual, news summarisation dataset This repository contains links to data and code to fetch and reproduce

Daniel Varab 19 Dec 16, 2022
An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

An open software package to develop BCI based brain and cognitive computing technology for recognizing user's intention using deep learning

deepbci 272 Jan 08, 2023
3D AffordanceNet is a 3D point cloud benchmark consisting of 23k shapes from 23 semantic object categories, annotated with 56k affordance annotations and covering 18 visual affordance categories.

3D AffordanceNet This repository is the official experiment implementation of 3D AffordanceNet benchmark. 3D AffordanceNet is a 3D point cloud benchma

49 Dec 01, 2022
Flexible Networks for Learning Physical Dynamics of Deformable Objects (2021)

Flexible Networks for Learning Physical Dynamics of Deformable Objects (2021) By Jinhyung Park, Dohae Lee, In-Kwon Lee from Yonsei University (Seoul,

Jinhyung Park 0 Jan 09, 2022
A multi-mode modulator for multi-domain few-shot classification (ICCV)

A multi-mode modulator for multi-domain few-shot classification (ICCV)

Yanbin Liu 8 Apr 28, 2022
Cancer metastasis detection with neural conditional random field (NCRF)

NCRF Prerequisites Data Whole slide images Annotations Patch images Model Training Testing Tissue mask Probability map Tumor localization FROC evaluat

Baidu Research 731 Jan 01, 2023
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020) Introduction AdaShare is a novel and differentiable approach fo

94 Dec 22, 2022
The dataset of tweets pulling from Twitters with keyword: Hydroxychloroquine, location: US, Time: 2020

HCQ_Tweet_Dataset: FREE to Download. Keywords: HCQ, hydroxychloroquine, tweet, twitter, COVID-19 This dataset is associated with the paper "Understand

2 Mar 16, 2022
Code for our paper A Transformer-Based Feature Segmentation and Region Alignment Method For UAV-View Geo-Localization,

FSRA This repository contains the dataset link and the code for our paper A Transformer-Based Feature Segmentation and Region Alignment Method For UAV

Dmmm 32 Dec 18, 2022
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning

This is a release of our VIMPAC paper to illustrate the implementations. The pretrained checkpoints and scripts will be soon open-sourced in HuggingFace transformers.

Hao Tan 74 Dec 03, 2022
Godot RL Agents is a fully Open Source packages that allows video game creators

Godot RL Agents The Godot RL Agents is a fully Open Source packages that allows video game creators, AI researchers and hobbiest the opportunity to le

Edward Beeching 326 Dec 30, 2022
For storing the complete exploration of Visual Question Answering for our B.Tech Project

Multi-Image vqa @authors: Akhilesh, Janhavi, Harsh Paper summary, Ideas tried and their corresponding results: on wiki Other discussions: on discussio

Harsh Raj 3 Jun 16, 2022
Data augmentation for NLP, accepted at EMNLP 2021 Findings

AEDA: An Easier Data Augmentation Technique for Text Classification This is the code for the EMNLP 2021 paper AEDA: An Easier Data Augmentation Techni

Akbar Karimi 81 Dec 09, 2022
Lacmus is a cross-platform application that helps to find people who are lost in the forest using computer vision and neural networks.

lacmus The program for searching through photos from the air of lost people in the forest using Retina Net neural nwtwork. The project is being develo

Lacmus Foundation 168 Dec 27, 2022
Automated Evidence Collection for Fake News Detection

Automated Evidence Collection for Fake News Detection This is the code repo for the Automated Evidence Collection for Fake News Detection paper accept

Mrinal Rawat 2 Apr 12, 2022
Cross Quality LFW: A database for Analyzing Cross-Resolution Image Face Recognition in Unconstrained Environments

Cross-Quality Labeled Faces in the Wild (XQLFW) Here, we release the database, evaluation protocol and code for the following paper: Cross Quality LFW

Martin Knoche 10 Dec 12, 2022
This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT).

Dynamic-Vision-Transformer (Pytorch) This repo contains the official code and pre-trained models for the Dynamic Vision Transformer (DVT). Not All Ima

210 Dec 18, 2022
MDETR: Modulated Detection for End-to-End Multi-Modal Understanding

MDETR: Modulated Detection for End-to-End Multi-Modal Understanding Website • Colab • Paper This repository contains code and links to pre-trained mod

Aishwarya Kamath 770 Dec 28, 2022