PyTorch Implementation of "Diagonal Attention and Style-based GAN for Content-Style Disentanglement in Image Generation and Translation" (ICCV 2021)

Overview

DiagonalGAN


Arxiv : link CVF : link

Contact

If you have any questions, please e-mail: [email protected]

Abstract

One of the important research topics in image generative models is to disentangle the spatial contents and styles for their separate control. Although StyleGAN can generate content feature vectors from random noises, the resulting spatial content control is primarily intended for minor spatial variations, and the disentanglement of global content and styles is by no means complete. Inspired by a mathematical understanding of normalization and attention, here we present a novel hierarchical adaptive Diagonal spatial ATtention (DAT) layers to separately manipulate the spatial contents from styles in a hierarchical manner. Using DAT and AdaIN, our method enables coarse-to-fine level disentanglement of spatial contents and styles. In addition, our generator can be easily integrated into the GAN inversion framework so that the content and style of translated images from multi-domain image translation tasks can be flexibly controlled. By using various datasets, we confirm that the proposed method not only outperforms the existing models in disentanglement scores, but also provides more flexible control over spatial features in the generated images.
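One way to read the "mathematical understanding of normalization and attention" mentioned above is as right and left matrix multiplications on a feature map. The notation below is an illustrative assumption rather than a quotation from the paper: $X \in \mathbb{R}^{HW \times C}$ is a (normalized) feature map with $HW$ pixels and $C$ channels, $T$ and $B$ are style-dependent scale and bias terms ($B$ repeats a per-channel bias over all pixels), and $d_1, \dots, d_{HW}$ is a content-dependent spatial attention map.

```latex
\begin{aligned}
\text{AdaIN (style):} \quad & Y = X\,T + B, & T &= \operatorname{diag}(t_1, \dots, t_C) \\
\text{Diagonal attention (content):} \quad & Y = D\,X, & D &= \operatorname{diag}(d_1, \dots, d_{HW}) \\
\text{Combined:} \quad & Y = D\,X\,T + B
\end{aligned}
```

Under this reading, right-multiplication by $T$ rescales each channel globally, while left-multiplication by the diagonal matrix $D$ rescales each pixel location, which is why the two can be controlled separately.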


Environment Settings

Python 3.6.7+

PyTorch 1.5.0+

Dataset

For faster training, we recommend the .jpg file format.

Download Link: CelebA-HQ / AFHQ

Unzip the files and put the folders into the data directory (./data/Celeb/data1024, ./data/afhq).

To process the data for multidomain Diagonal GAN, download the CelebA-HQ dataset and then run

./data/Celeb/Celeb_proc.py

This saves the male and female images in different folders.

We randomly selected 1000 images per domain as the validation set (1000 males / 1000 females).

Save the validation files into ./data/Celeb/val/males and ./data/Celeb/val/females.
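The random validation split described above could be reproduced with a short script. This is a minimal sketch, not the authors' tooling: the function name `split_validation`, the fixed seed, the accepted file extensions, and the choice to move (rather than copy) the files are all assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path

# Assumption: images are identified by these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_validation(domain_dir, val_dir, n_val=1000, seed=0):
    """Move n_val randomly chosen images from domain_dir into val_dir.

    Hypothetical helper: returns the names of the moved files.
    """
    domain_dir, val_dir = Path(domain_dir), Path(val_dir)
    images = sorted(
        p for p in domain_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if len(images) < n_val:
        raise ValueError(f"need {n_val} images, found {len(images)}")

    # A seeded generator keeps the split reproducible across runs.
    rng = random.Random(seed)
    chosen = rng.sample(images, n_val)

    val_dir.mkdir(parents=True, exist_ok=True)
    for path in chosen:
        shutil.move(str(path), str(val_dir / path.name))
    return [p.name for p in chosen]


# Example call (the source folder name is an assumption):
# split_validation("./data/Celeb/mult/males", "./data/Celeb/val/males")
```

Moving the files keeps the validation images out of the training folder; use `shutil.copy2` instead if the training script is expected to see them as well.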

Train

Train Basic Diagonal GAN

For full-resolution CelebA-HQ training,

python train.py --datapath ./data/Celeb/data1024 --sched --max_size 1024 --loss r1

For full-resolution AFHQ training,

python train.py --datapath ./data/afhq --sched --max_size 512 --loss r1

Train Multidomain Diagonal GAN

To train multidomain (males / females) models, run

python train_multidomain.py --datapath ./data/Celeb/mult --sched --max_size 256

Train IDInvert Encoders on pre-trained Multidomain Diagonal GAN

To train IDInvert on a pre-trained model, run

python train_idinvert.py --ckpt $MODEL_PATH$ 

Instead of training the multidomain model yourself, you can download the pre-trained multidomain model, save it to ./checkpoint/train_mult/CelebAHQ_mult.model, and set $MODEL_PATH$ to that location.

Additional latent code optimization (for inference)

To further optimize the latent codes, run

python train_idinvert_opt.py --ckpt $MODEL_PATH$ --enc_ckpt $ENC_MODEL_PATH$

$MODEL_PATH$ is the pre-trained multidomain model directory, and $ENC_MODEL_PATH$ is the IDInvert encoder model directory.

You can download the pre-trained IDInvert encoder models.

We also provide optimized latent codes.

Pre-trained model Download

Pre-trained Diagonal GAN on 1024x1024 CelebA-HQ : Link (save to ./checkpoint/train_basic)

Pre-trained Diagonal GAN on 512x512 AFHQ : Link (save to ./checkpoint/train_basic)

Pre-trained Multidomain Diagonal GAN on 256x256 CelebA-HQ : Link (save to ./checkpoint/train_mult)

Pre-trained IDInvert Encoders on 256x256 CelebA-HQ : Link (save to ./checkpoint/train_idinvert)

Optimized latent codes : Link (save to ./codes)
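Before running the generation scripts, it can help to confirm that the downloads ended up in the directories listed above. The following is a minimal sketch; the helper name `missing_or_empty` is hypothetical, and it only checks that each directory exists and contains at least one entry, not that the files are valid checkpoints.

```python
from pathlib import Path

# Directories taken from the download list above, relative to the repo root.
EXPECTED_DIRS = [
    "checkpoint/train_basic",
    "checkpoint/train_mult",
    "checkpoint/train_idinvert",
    "codes",
]


def missing_or_empty(root="."):
    """Return the expected directories that are absent or contain nothing."""
    root = Path(root)
    problems = []
    for rel in EXPECTED_DIRS:
        directory = root / rel
        if not directory.is_dir() or not any(directory.iterdir()):
            problems.append(rel)
    return problems


# Example call from the repository root:
# print(missing_or_empty("."))
```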

Generate Images

To generate images from the pre-trained model, run

python generate.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

$MODE$ has three choices (sample, mixing, interpolation):

'sample' generates random samples,

'mixing' generates images with a random code on the target layers $TARGET$,

'interpolate' generates images with random interpolation on the target layers $TARGET$.

$DOM$ selects whether style or content is manipulated; set it to 'style' or 'content'.
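Conceptually, 'mixing' and 'interpolate' both change the latent code only at the selected layers and leave the other layers untouched. The sketch below illustrates that idea with plain Python lists; it is not the repository's code, and the function names, the 18-layer count, and the 4-dimensional codes are assumptions chosen for illustration.

```python
import random


def mix_codes(codes, target_layers, new_code):
    """Swap new_code in at target_layers; keep the other layers' codes."""
    return [
        list(new_code) if i in target_layers else list(code)
        for i, code in enumerate(codes)
    ]


def interpolate_codes(codes, target_layers, other_code, t):
    """Linearly interpolate toward other_code (weight t) at target_layers only."""
    out = []
    for i, code in enumerate(codes):
        if i in target_layers:
            out.append([(1 - t) * a + t * b for a, b in zip(code, other_code)])
        else:
            out.append(list(code))
    return out


# Illustrative setup: 18 layers, each with a 4-dimensional code (assumed sizes).
rng = random.Random(0)
base = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(18)]
other = [rng.gauss(0, 1) for _ in range(4)]

mixed = mix_codes(base, {14, 15, 16, 17}, other)           # replace late layers
halfway = interpolate_codes(base, {2, 3}, other, t=0.5)    # blend early layers
```

In the actual scripts the same operation would apply to either the style codes or the content codes, depending on $DOM$.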

Generate Images on Inverted model

To generate images from the pre-trained IDInvert model, run

python generate_idinvert.py --mode $MODE$ --domain $DOM$ --target_layer $TARGET$

$MODE$ has three choices (sample, mixing, encode):

'sample' generates random samples,

'mixing' generates images with a random code on the target layers $TARGET$,

'encode' generates auto-encoder reconstructions.

$DOM$ selects whether style or content is manipulated; set it to 'style' or 'content'.

To use the additional optimized latent codes, pass --use_code.

Examples

python generate.py --mode sample 


8x8 resolution content

python generate.py --mode mixing --domain content --target_layer 2 3


High resolution style

python generate.py --mode mixing --domain style --target_layer 14 15 16 17
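To run several of these examples in a batch, the command lines can be assembled programmatically. This is a minimal sketch that uses only the flags shown above (--mode, --domain, --target_layer); the helper name `build_generate_cmd` is hypothetical and the mode string is passed through without validation.

```python
import sys


def build_generate_cmd(mode, domain=None, target_layers=(), script="generate.py"):
    """Build the argument list for generate.py from the flags shown above."""
    cmd = [sys.executable, script, "--mode", mode]
    if domain is not None:
        if domain not in ("style", "content"):
            raise ValueError("domain must be 'style' or 'content'")
        cmd += ["--domain", domain]
    if target_layers:
        # Layers are passed as separate space-delimited values.
        cmd += ["--target_layer", *map(str, target_layers)]
    return cmd


# The two mixing examples above, expressed as argument lists.
content_cmd = build_generate_cmd("mixing", "content", [2, 3])
style_cmd = build_generate_cmd("mixing", "style", [14, 15, 16, 17])
```

Each list can then be passed to `subprocess.run(cmd, check=True)` from the repository root.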

