[ACM MM 2021] Diverse Image Inpainting with Bidirectional and Autoregressive Transformers

Last update: Nov 09, 2022

Installation

pip install -r requirements.txt

Dataset Preparation

Given a dataset, list the image paths in a folder named after the dataset, using the following folder structure.

    flist/dataset_name
        ├── train.flist    # paths of training images
        ├── valid.flist    # paths of validation images
        └── test.flist     # paths of testing images
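
The file lists can be generated with a short script. The sketch below rests on an assumption about the format: it treats each .flist as a plain-text file with one image path per line, which should be checked against the repository's data loader. The helper name `write_flists`, the split fractions, and the set of image extensions are hypothetical and not part of this repository.

```python
import random
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def write_flists(image_dir, out_dir, valid_frac=0.05, test_frac=0.05, seed=0):
    """Split the images under image_dir and write train/valid/test .flist files.

    Assumption: one absolute image path per line in each .flist file.
    Returns the number of paths written to each split.
    """
    paths = sorted(
        str(p.resolve())
        for p in Path(image_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
    # Shuffle with a fixed seed so the split is reproducible.
    random.Random(seed).shuffle(paths)

    n_valid = int(len(paths) * valid_frac)
    n_test = int(len(paths) * test_frac)
    splits = {
        "valid": paths[:n_valid],
        "test": paths[n_valid:n_valid + n_test],
        "train": paths[n_valid + n_test:],
    }

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, items in splits.items():
        text = "\n".join(items) + ("\n" if items else "")
        (out / f"{name}.flist").write_text(text)
    return {name: len(items) for name, items in splits.items()}
```

For example, `write_flists("/path/to/images", "flist/dataset_name")` would create the three files shown above, where both paths are placeholders.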

In this work, we use CelebA-HQ (download available here), Places2 (download available here), and Paris StreetView (requires the authors' permission to download).

ImageNet K-means Cluster: The kmeans_centers.npy file is downloaded from image-gpt and is used to quantize the low-resolution images.
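
To illustrate what quantizing with k-means centers means, here is a minimal NumPy sketch of nearest-center color quantization. It is not the repository's implementation. It assumes the centers form a (K, 3) array of colors in the same value range as the image; the function names `quantize` and `dequantize` are hypothetical.

```python
import numpy as np


def quantize(image, centers):
    """Map each pixel to the index of its nearest color center.

    image:   float array of shape (H, W, 3)
    centers: float array of shape (K, 3), same value range as image
    returns: int array of shape (H, W) with values in [0, K)
    """
    pixels = image.reshape(-1, 3)
    # Squared Euclidean distance between every pixel and every center.
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1).reshape(image.shape[:2])


def dequantize(indices, centers):
    """Look up the center color of each index, giving an (H, W, 3) image."""
    return centers[indices]
```

With the downloaded file, the centers would be loaded with `np.load("kmeans_centers.npy")`. Check its shape and value range first, and scale the low-resolution image to match before calling `quantize`.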

Testing with Pre-trained Models

Download pre-trained models:

CelebA-HQ: BAT ; Upsampler
Places2: BAT ; Upsampler
Paris-StreetView: BAT ; Upsampler

Put the pre-trained model under the checkpoints folder, e.g.

    checkpoints
        ├── celebahq_bat_pretrain
            ├── latest_net_G.pth

Prepare the input images and masks to test.

python bat_sample.py --num_sample [1] --tran_model [bat name] --up_model [upsampler name] --input_dir [dir of input] --mask_dir [dir of mask] --save_dir [dir to save results]
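
If sampling is driven from Python, for example to loop over several models, the command above can be assembled programmatically. The sketch below uses only the flags shown in that command. The function names are hypothetical, and all model names and directories passed in are placeholders to be replaced with your own.

```python
import subprocess
import sys


def build_sample_cmd(tran_model, up_model, input_dir, mask_dir, save_dir,
                     num_sample=1):
    """Build the bat_sample.py command line as an argument list."""
    return [
        sys.executable, "bat_sample.py",
        "--num_sample", str(num_sample),
        "--tran_model", tran_model,
        "--up_model", up_model,
        "--input_dir", input_dir,
        "--mask_dir", mask_dir,
        "--save_dir", save_dir,
    ]


def run_sampling(**kwargs):
    """Run bat_sample.py from the repository root; raises if it fails."""
    subprocess.run(build_sample_cmd(**kwargs), check=True)
```

Passing the arguments as a list avoids shell quoting problems when directory names contain spaces.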

Training New Models

Download the pre-trained VGG model from here and move it to models/. This model is used to calculate the training loss for the upsampler.

New models can be trained with the following commands.

Prepare the dataset. Use the --dataroot option to point to the directory of file lists, e.g. ./flist, and specify the dataset to train on with the --dataset_name option. Set the mask type and mask ratio with the --mask_type and --pconv_level options.
Train the transformer.

# Specify your own dataset or settings in the bash file.
bash train_bat.sh

Please note that some of the transformer settings are defined in train_bat.py instead of options/. This script uses every available GPU for training, so select the GPUs via CUDA_VISIBLE_DEVICES rather than --gpu_ids, which is used for the upsampler.
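
As a sketch of restricting the transformer training to specific GPUs, the snippet below sets CUDA_VISIBLE_DEVICES in the environment of the launched script. The helper names `gpu_env` and `launch_bat_training` are hypothetical, and the GPU indices are placeholders.

```python
import os
import subprocess


def gpu_env(gpu_ids, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def launch_bat_training(gpu_ids):
    """Run train_bat.sh so that it only sees the listed GPUs."""
    return subprocess.run(["bash", "train_bat.sh"], env=gpu_env(gpu_ids),
                          check=True)
```

Calling `launch_bat_training([0, 1])` corresponds to running `CUDA_VISIBLE_DEVICES=0,1 bash train_bat.sh` in a shell.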

Train the upsampler.

# Specify your own dataset or settings in the bash file.
bash train_up.sh

The upsampler is typically trained on the low-resolution ground truth. We find that also using some samples from the trained BAT may improve performance (e.g. PSNR, SSIM). However, the sampling process is quite time-consuming, and training with ground truth can still yield reasonable results.
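
For reference, PSNR between a ground-truth image and an upsampled result can be computed as below. This is the standard definition written in NumPy, not the evaluation code used for the paper; the function name `psnr` and the default `max_value` of 255 (8-bit images) are assumptions.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((ref - est) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Higher values indicate that the result is closer to the reference.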

Citation

If you find this code helpful for your research, please cite our paper.

@inproceedings{yu2021diverse,
  title={Diverse Image Inpainting with Bidirectional and Autoregressive Transformers},
  author={Yu, Yingchen and Zhan, Fangneng and Wu, Rongliang and Pan, Jianxiong and Cui, Kaiwen and Lu, Shijian and Ma, Feiying and Xie, Xuansong and Miao, Chunyan},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  year={2021}
}

Acknowledgments

This code borrows heavily from SPADE and minGPT. We appreciate the authors for sharing their code.

Owner

Yingchen Yu
