2021:"Bridging Global Context Interactions for High-Fidelity Image Completion"

Related tags

Deep LearningTFill
Overview

TFill

arXiv | Project

This repository implements the training, testing and editing tools for "Bridging Global Context Interactions for High-Fidelity Image Completion" by Chuanxia Zheng, Tat-Jen Cham, Jianfei Cai and Dinh Phung. Given masked images, the proposed TFill model is able to generate high-fidelity plausible results on various settings.

Examples

teaser

Framework

We propose the two-stages image completion framework, where the upper content inference network (TFill-Coarse) generates semantically correct content using a transformer encoder to directly capture the global context information; the lower appearance refinement network (TFill-refined) copies global visible and generated features to holes.

teaser

Getting started

  • Clone this repo:
git clone https://github.com/lyndonzheng/TFill
cd TFill

Requirements

The original model is trained and evaluated with Pytorch v1.9.1, which cannot be visited in current PyTorch. Therefore, we create a new environment with Pytorch v1.10.0 to test the model, where the performance is the same.

A suitable conda environment named Tfill can be created and activated with:

conda env create -f environment.yaml
conda activate TFill

Runing pretrained models

Download the pre-trained models using the following links (CelebA-HQ, FFHQ, ImageNet, Plcases2 ) and put them undercheckpoints/ directory. It should have the following structure:

./checkpoints/
├── celeba
│   ├── latest_net_D.pth
│   ├── latest_net_D_Ref.pth
│   ├── latest_net_E.pth
│   ├── latest_net_G.pth
│   ├── latest_net_G_Ref.pth
│   ├── latest_net_T.pth
├── ffhq
│   ├── ...
├── ...
  • Test the model
sh ./scripts/test.sh

For different models, the users just need to modify lines 2-4, including name,img_file,mask_file. For instance, we can replace the celeba to imagenet.

The default results will be stored under the results/ folder, in which:

  • examples/: shows original and masked images;
  • img_out/: shows upsampled Coarse outputs;
  • img_ref_out/: shows the final Refined outputs.

Datasets

  • face dataset:
    • 24,183 training images and 2,824 test images from CelebA and use the algorithm of Growing GANs to get the high-resolution CelebA-HQ dataset.
    • 60,000 training images and 10,000 test images from FFHQ provided by StyleGAN.
  • natural scenery: original training and val images from Places2.
  • object original training images from ImageNet.

Traning

  • Train a model (two stage: Coarse and Refinement)
sh ./scripts/train.sh

The default setting is for the top Coarse training. The users just need to replace the coarse with refine at line 6. Then, the model can continue training for high-resolution image completion. More hyper-parameter can be in options/.

The coarse results using transformer and restrictive CNN is impressive, which provides plausible results for both foreground objects and background scene.

teaser teaser

GUI

The GUI operation is similar to our previous GUI in PIC, where steps are also the same.

Basic usage is:

sh ./scripts/ui.sh 

In gui/ui_model.py, users can modify the img_root(line 30) and the corresponding img_files(line 31) to randomly edit images from the testing dataset.

Editing Examples

  • Results (original, output) for face editing

teaser

  • Results (original, masked input, output) for nature scene editing

teaser

Next

  • Higher-resolution pluralistic image completion

License

This work is licensed under a MIT License.

This software is for educational and academic research purpose only. If you wish to obtain a commercial royalty bearing license to this software, please contact us at [email protected].

Citation

The code also uses our previous PIC. If you use this code for your research, please cite our papers.

@misc{zheng2021tfill,
      title={Bridging Global Context Interactions for High-Fidelity Image Completion},
      author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei and Phung, Dinh},
      year={2021},
      eprint={2104.00845},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@inproceedings{zheng2019pluralistic,
  title={Pluralistic Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
  pages={1438--1447},
  year={2019}
}

@article{zheng2021pluralistic,
  title={Pluralistic Free-From Image Completion},
  author={Zheng, Chuanxia and Cham, Tat-Jen and Cai, Jianfei},
  journal={International Journal of Computer Vision},
  pages={1--20},
  year={2021},
  publisher={Springer}
}
Owner
Chuanxia Zheng
Chuanxia Zheng
Official Implementation of Swapping Autoencoder for Deep Image Manipulation (NeurIPS 2020)

Swapping Autoencoder for Deep Image Manipulation Taesung Park, Jun-Yan Zhu, Oliver Wang, Jingwan Lu, Eli Shechtman, Alexei A. Efros, Richard Zhang UC

449 Dec 27, 2022
PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations

Thalles Silva 1.7k Dec 28, 2022
A small tool to joint picture including gif

README 做设计的时候遇到拼接长图的情况,但是发现没有什么好用的能拼接gif的工具。 于是自己写了个gif拼接小工具。 可以自动拼接gif、png和jpg等常见格式。 效果 从上至下 从下至上 从左至右 从右至左 使用 克隆仓库 git clone https://github.com/Dels

3 Dec 15, 2021
code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`

Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation (CVPR 2021) Introduction PBR is a conceptually simple yet effective

H.Chen 143 Jan 05, 2023
An offline deep reinforcement learning library

d3rlpy: An offline deep reinforcement learning library d3rlpy is an offline deep reinforcement learning library for practitioners and researchers. imp

Takuma Seno 817 Jan 02, 2023
Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search

CLIP-GLaSS Repository for the paper Generating images from caption and vice versa via CLIP-Guided Generative Latent Space Search An in-browser demo is

Federico Galatolo 172 Dec 22, 2022
Learning-based agent for Google Research Football

TiKick 1.Introduction Learning-based agent for Google Research Football Code accompanying the paper "TiKick: Towards Playing Multi-agent Football Full

Tsinghua AI Research Team for Reinforcement Learning 90 Dec 26, 2022
Code for the paper "On the Power of Edge Independent Graph Models"

Edge Independent Graph Models Code for the paper: "On the Power of Edge Independent Graph Models" Sudhanshu Chanpuriya, Cameron Musco, Konstantinos So

Konstantinos Sotiropoulos 0 Oct 26, 2021
Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20. model in ONNX

ONNX msg_chn_wacv20 depth completion Python script for performing depth completion from sparse depth and rgb images using the msg_chn_wacv20 model in

Ibai Gorordo 19 Oct 22, 2022
[CVPR2021 Oral] End-to-End Video Instance Segmentation with Transformers

VisTR: End-to-End Video Instance Segmentation with Transformers This is the official implementation of the VisTR paper: Installation We provide instru

Yuqing Wang 687 Jan 07, 2023
Open source implementation of "A Self-Supervised Descriptor for Image Copy Detection" (SSCD).

A Self-Supervised Descriptor for Image Copy Detection (SSCD) This is the open-source codebase for "A Self-Supervised Descriptor for Image Copy Detecti

Meta Research 68 Jan 04, 2023
[3DV 2020] PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction

PeeledHuman: Robust Shape Representation for Textured 3D Human Body Reconstruction International Conference on 3D Vision, 2020 Sai Sagar Jinka1, Rohan

Rohan Chacko 39 Oct 12, 2022
This code is for eCaReNet: explainable Cancer Relapse Prediction Network.

eCaReNet This code is for eCaReNet: explainable Cancer Relapse Prediction Network. (Towards Explainable End-to-End Prostate Cancer Relapse Prediction

Institute of Medical Systems Biology 2 Jul 28, 2022
This library contains a Tensorflow implementation of the paper Stability Analysis of Unfolded WMMSE for Power Allocation

UWMMSE-stability Tensorflow implementation of Stability Analysis of UWMMSE Overview This library contains a Tensorflow implementation of the paper Sta

Arindam Chowdhury 1 Nov 16, 2022
Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021)

Monocular 3D Object Detection: An Extrinsic Parameter Free Approach (CVPR2021) Yunsong Zhou, Yuan He, Hongzi Zhu, Cheng Wang, Hongyang Li, Qinhong Jia

Yunsong Zhou 51 Dec 14, 2022
Pgn2tex - Scripts to convert pgn files to latex document. Useful to build books or pdf from pgn studies

Pgn2Latex (WIP) A simple script to make pdf from pgn files and studies. It's sti

12 Jul 23, 2022
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22) Paper Link | Project Page Abstract : Manual an

Mohamed Afham 152 Dec 23, 2022
Film review classification

Film review classification Решение задачи классификации отзывов на фильмы на положительные и отрицательные с помощью рекуррентных нейронных сетей 1. З

Nikita Dukin 3 Jan 21, 2022
Demo project for real time anomaly detection using kafka and python

kafkaml-anomaly-detection Project for real time anomaly detection using kafka and python It's assumed that zookeeper and kafka are running in the loca

Rodrigo Arenas 36 Dec 12, 2022
PyTorch Implementation of Vector Quantized Variational AutoEncoders.

Pytorch implementation of VQVAE. This paper combines 2 tricks: Vector Quantization (check out this amazing blog for better understanding.) Straight-Th

Vrushank Changawala 2 Oct 06, 2021