Implementation of Diverse Semantic Image Synthesis via Probability Distribution Modeling

Related tags

Deep LearningINADE
Overview

Diverse Semantic Image Synthesis via Probability Distribution Modeling (CVPR 2021)

Architecture

Paper

Zhentao Tan, Menglei Chai, Dongdong Chen, Jing Liao, Qi Chu, [Bin Liu], Gang Hua, Nenghai Yu

Abstract

Semantic image synthesis, translating semantic layouts to photo-realistic images, is a one-to-many mapping problem. Though impressive progress has been recently made, diverse semantic synthesis that can efficiently produce semantic-level multimodal results, still remains a challenge. In this paper, we propose a novel diverse semantic image synthesis framework from the perspective of semantic class distributions, which naturally supports diverse generation at semantic or even instance level. We achieve this by modeling class-level conditional modulation parameters as continuous probability distributions instead of discrete values, and sampling per-instance modulation parameters through instance-adaptive stochastic sampling that is consistent across the network. Moreover, we propose prior noise remapping, through linear perturbation parameters encoded from paired references, to facilitate supervised training and exemplar-based instance style control at test time. Extensive experiments on multiple datasets show that our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.

Installation

Clone this repo.

git clone https://github.com/tzt101/INADE.git
cd INADE/

This code requires PyTorch 1.6 and python 3+. Please install dependencies by

pip install -r requirements.txt

Dataset Preparation

The Cityscapes and ADE20K dataset can be downloaded and prepared following SPADE. The CelebAMask-HQ can be downloaded from CelebAMask-HQ, you need to to integrate the separated annotations into an image file (the format like other datasets, e.g. Cityscapes and ADE20K). The DeepFashion can be downloaded from SMIS, and the version with two persons can be downloaded from OneDrive.

To make or reid the instance map, you can use the following commands:

python make_instances.py --path [Path_to_dataset] --dataset [ade20k | cityscapes | celeba | deepfashion]

Generating Images Using Pretrained Model

Once the dataset is ready, the result images can be generated using pretrained models.

  1. Download the pretrained models from the OneDrive, save it in checkpoints/. The structure is as follows:
./checkpoints/
    inade_ade20k/
        best_net_G.pth
        best_net_IE.pth
    inade_celeba/
        best_net_G.pth
        best_net_IE.pth
    inade_cityscapes/
        best_net_G.pth
        best_net_IE.pth
    inade_deepfashion/
        best_net_G.pth
        best_net_IE.pth

The noise_nc is 64 for all pretrained models except which on deepfashion (set to 8). Because we find that it's enough for quality and diversity.

  1. Generate the images on the test dataset.
python test.py --name [model_name] --norm_mode inade --batchSize 1 --gpu_ids 0 --which_epoch best --dataset_mode [dataset] --dataroot [Path_to_dataset]

[model_name] is the directory name of the checkpoint file downloaded in Step 1, such as inade_ade20k and inade_cityscapes. [dataset] can be on of ade20k, celeba, cityscapes and deepfashion. [Path_to_dataset] is the path to the dataset. If you want to use encoder, you can add the another option --use_vae.

Training New Models

You can train your own model with the following command:

# To train CLADE and CLADE-ICPE.
python train.py --name [experiment_name] --dataset_mode [dataset] --norm_mode inade --use_vae --dataroot [Path_to_dataset]

If you want to test the model during the training step, please set --train_eval. By default, the model every 10 epoch will be test in terms of FID. Finally, the model with best FID score will be saved as best_net_G.pth.

Calculate FID

We provide the code to calculate the FID which is based on rpo. We have pre-calculated the distribution of real images (all images are resized to 256×256 except cityscapes is 512×256) in training set of each dataset and saved them in ./datasets/train_mu_si/. You can run the following command:

python fid_score.py [Path_to_real_image] [Path_to_fake_image] --batch-size 1 --gpu 0 --load_np_name [dataset] --resize_size [Size]

The provided [dataset] are: ade20k, celeba, cityscapes, coco and deepfashion. You can save the new dataset by replacing --load_np_name [dataset] with --save_np_name [dataset].

New Useful Options

The new options are as follows:

  • --use_amp: if specified, use AMP training mode.
  • --train_eval: if sepcified, evaluate the model during training.
  • --eval_dims: the default setting is 2048, Dimensionality of Inception features to use.
  • --eval_epoch_freq: the default setting is 10, frequency of calculate fid score at the end of epochs.

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • trainers/pix2pix_trainer.py: harnesses and reports the progress of training.
  • models/pix2pix_model.py: creates the networks, and compute the losses
  • models/networks/: defines the architecture of all models
  • options/: creates option lists using argparse package. More individuals are dynamically added in other files as well. Please see the section below.
  • data/: defines the class for loading images and label maps.

Citation

If you use this code for your research, please cite our papers.

@inproceedings{tan2021diverse,
  title={Diverse Semantic Image Synthesis via Probability Distribution Modeling},
  author={Tan, Zhentao and Chai, Menglei and Chen, Dongdong and Liao, Jing and Chu, Qi and Liu, Bin and Hua, Gang and Yu, Nenghai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7962--7971},
  year={2021}
}

Acknowledgments

This code borrows heavily from SPADE.

Code repository for EMNLP 2021 paper 'Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods'

Adversarial Attacks on Knowledge Graph Embeddings via Instance Attribution Methods This is the code repository to accompany the EMNLP 2021 paper on ad

Peru Bhardwaj 7 Sep 25, 2022
MetaBalance: High-Performance Neural Networks for Class-Imbalanced Data

This repository is the official PyTorch implementation of Meta-Balance. Find the paper on arxiv MetaBalance: High-Performance Neural Networks for Clas

Arpit Bansal 20 Oct 18, 2021
Code for ICCV2021 paper SPEC: Seeing People in the Wild with an Estimated Camera

SPEC: Seeing People in the Wild with an Estimated Camera [ICCV 2021] SPEC: Seeing People in the Wild with an Estimated Camera, Muhammed Kocabas, Chun-

Muhammed Kocabas 187 Dec 26, 2022
Multispectral Object Detection with Yolov5

Multispectral-Object-Detection Intro Official Code for Cross-Modality Fusion Transformer for Multispectral Object Detection. Multispectral Object Dete

Richard Fang 121 Jan 01, 2023
Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition

Similarity-based Gray-box Adversarial Attack Against Deep Face Recognition Introduction Run attack: SGADV.py Objective function: foolbox/attacks/gradi

1 Jul 18, 2022
Code for "OctField: Hierarchical Implicit Functions for 3D Modeling (NeurIPS 2021)"

OctField(Jittor): Hierarchical Implicit Functions for 3D Modeling Introduction This repository is code release for OctField: Hierarchical Implicit Fun

55 Dec 08, 2022
SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images

SymmetryNet SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images ACM Transactions on Gra

26 Dec 05, 2022
Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Self-Supervised Multi-Frame Monocular Scene Flow 3D visualization of estimated depth and scene flow (overlayed with input image) from temporally conse

Visual Inference Lab @TU Darmstadt 85 Dec 22, 2022
DeepStochlog Package For Python

DeepStochLog Installation Installing SWI Prolog DeepStochLog requires SWI Prolog to run. Run the following commands to install: sudo apt-add-repositor

KU Leuven Machine Learning Research Group 17 Dec 23, 2022
Six - a Python 2 and 3 compatibility library

Six is a Python 2 and 3 compatibility library. It provides utility functions for smoothing over the differences between the Python versions with the g

Benjamin Peterson 919 Dec 28, 2022
Exadel CompreFace is a free and open-source face recognition GitHub project

Exadel CompreFace is a leading free and open-source face recognition system Exadel CompreFace is a free and open-source face recognition service that

Exadel 2.6k Jan 04, 2023
Entity-Based Knowledge Conflicts in Question Answering.

Entity-Based Knowledge Conflicts in Question Answering Run Instructions | Paper | Citation | License This repository provides the Substitution Framewo

Apple 35 Oct 19, 2022
Implementation for Shape from Polarization for Complex Scenes in the Wild

sfp-wild Implementation for Shape from Polarization for Complex Scenes in the Wild project website | paper Code and dataset will be released soon. Int

Chenyang LEI 41 Dec 23, 2022
Semantic similarity computation with different state-of-the-art metrics

Semantic similarity computation with different state-of-the-art metrics Description • Installation • Usage • License Description TaxoSS is a semantic

6 Jun 22, 2022
Implementation of CVAE. Trained CVAE on faces from UTKFace Dataset to produce synthetic faces with a given degree of happiness/smileyness.

Conditional Smiles! (SmileCVAE) About Implementation of AE, VAE and CVAE. Trained CVAE on faces from UTKFace Dataset. Using an encoding of the Smile-s

Raúl Ortega 3 Jan 09, 2022
Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works

GDAP Code for Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework that Works Environment Python (verified: v3.8) CUDA

45 Oct 29, 2022
Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks

Assessing the Influence of Models on the Performance of Reinforcement Learning Algorithms applied on Continuous Control Tasks This is the master thesi

Giacomo Arcieri 1 Mar 21, 2022
Code for our paper "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021

SimCLS Code for our paper: "SimCLS: A Simple Framework for Contrastive Learning of Abstractive Summarization", ACL 2021 1. How to Install Requirements

Yixin Liu 150 Dec 12, 2022
Official Implementation of Few-shot Visual Relationship Co-localization

VRC Official implementation of the Few-shot Visual Relationship Co-localization (ICCV 2021) paper project page | paper Requirements Use python = 3.8.

22 Oct 13, 2022
aka "Bayesian Methods for Hackers": An introduction to Bayesian methods + probabilistic programming with a computation/understanding-first, mathematics-second point of view. All in pure Python ;)

Bayesian Methods for Hackers Using Python and PyMC The Bayesian method is the natural approach to inference, yet it is hidden from readers behind chap

Cameron Davidson-Pilon 25.1k Jan 02, 2023