AsymmetricGAN - Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation

Overview


AsymmetricGAN for Image-to-Image Translation

AsymmetricGAN Framework for Multi-Domain Image-to-Image Translation


AsymmetricGAN Framework for Hand Gesture-to-Gesture Translation


Conference paper | Extended paper | Project page | Slides | Poster

Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation.
Hao Tang¹, Dan Xu², Wei Wang³, Yan Yan⁴ and Nicu Sebe¹.
¹University of Trento, Italy, ²University of Oxford, UK, ³EPFL, Switzerland, ⁴Texas State University, USA.
In ACCV 2018 (Oral).
The repository offers the official implementation of our paper in PyTorch.

License

Copyright (C) 2019 University of Trento, Italy.

All rights reserved. Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).

The code is released for academic research use only. For commercial use, please contact [email protected].

Installation

Clone this repo.

git clone https://github.com/Ha0Tang/AsymmetricGAN
cd AsymmetricGAN/

This code requires PyTorch 0.4.1 and Python 3.6+. Install the dependencies with

pip install -r requirements.txt (for pip users)

or

./scripts/conda_deps.sh (for Conda users)

To reproduce the results reported in the paper, you need two NVIDIA GeForce GTX 1080 Ti GPUs or two NVIDIA TITAN Xp GPUs.

Dataset Preparation

For the hand gesture-to-gesture translation task, we use the NTU Hand Digit and Creative Senz3D datasets. Both datasets must be downloaded beforehand from their respective webpages. In addition, follow GestureGAN to prepare both datasets. Please cite their papers if you use the data.

Preparing NTU Hand Digit Dataset. The dataset can be downloaded via this paper. After downloading it, we use OpenPose to generate hand skeletons and use them as training and testing data in our experiments. Note that we filter out failure cases of hand gesture estimation for both training and testing. Please cite their papers if you use this dataset. Train/Test splits for the NTU Hand Digit dataset can be downloaded from here.

Preparing Creative Senz3D Dataset. The dataset can be downloaded here. After downloading it, we use OpenPose to generate hand skeletons and use them as training data in our experiments. Note that we filter out failure cases of hand gesture estimation for both training and testing. Please cite their papers if you use this dataset. Train/Test splits for the Creative Senz3D dataset can be downloaded from here.

Preparing Your Own Datasets. You can also use AsymmetricGAN for your own datasets and tasks. Each training sample in the dataset must contain {Ix,Iy,Cx,Cy}, where Ix=image x, Iy=image y, Cx=controllable structure of image x, and Cy=controllable structure of image y.
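
As a rough illustration of the four-part sample {Ix,Iy,Cx,Cy}, the sketch below groups the file paths of one sample into a small record. The class name `GestureSample`, its field names, the example file names, and the `make_pairs` helper are all hypothetical and not part of this repository; the actual data loader may store and pair files differently.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import List, Tuple


@dataclass(frozen=True)
class GestureSample:
    """One training sample {Ix, Iy, Cx, Cy} (hypothetical container)."""
    image_x: str      # Ix: path to image x
    image_y: str      # Iy: path to image y
    structure_x: str  # Cx: controllable structure (e.g. hand skeleton) of image x
    structure_y: str  # Cy: controllable structure of image y


def make_pairs(items: List[Tuple[str, str]]) -> List[GestureSample]:
    """Build one sample per ordered pair of distinct (image, structure) items.

    Assumption: any image may act as source or target. The published
    train/test splits may pair images differently.
    """
    return [
        GestureSample(img_x, img_y, struct_x, struct_y)
        for (img_x, struct_x), (img_y, struct_y) in permutations(items, 2)
    ]


# Hypothetical file names, used only to show the shape of a sample.
items = [("a.png", "a_skeleton.png"), ("b.png", "b_skeleton.png")]
samples = make_pairs(items)
print(len(samples))
print(samples[0])
```

Keeping the image and its controllable structure together in one record makes it harder to mismatch a skeleton with the wrong image when building pairs.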

Generating Images Using Pretrained Model

Once the dataset is ready, the result images can be generated using pretrained models.

  1. You can download a pretrained model (e.g. ntu_asymmetricgan) with the following script:
bash ./scripts/download_asymmetricgan_model.sh ntu_asymmetricgan

The pretrained model is saved at ./checkpoints/[type]_pretrained. Check here for all the available AsymmetricGAN models.

  2. Generate images using the pretrained model.

For NTU Dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D Dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan_pretrained \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

To run in CPU mode, change --gpu_ids 0 to --gpu_ids -1. Note that testing takes a long time and requires a large amount of disk space. If you don't have enough space, append --saveDisk to the command line.
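
The --gpu_ids convention (a comma-separated list of ids, with -1 selecting the CPU) can be sketched as follows. This is a minimal illustration that assumes the option is handled the way pix2pix-style code bases commonly handle it; `parse_gpu_ids` and `device_name` are hypothetical helper names, not functions from this repository.

```python
from typing import List


def parse_gpu_ids(gpu_ids: str) -> List[int]:
    """Turn a string such as "0,1" into [0, 1]; negative ids are dropped."""
    ids = []
    for token in gpu_ids.split(","):
        value = int(token)
        if value >= 0:
            ids.append(value)
    return ids


def device_name(gpu_ids: str) -> str:
    """Return "cpu" when no valid GPU id is given, else the first listed GPU."""
    ids = parse_gpu_ids(gpu_ids)
    return "cuda:{}".format(ids[0]) if ids else "cpu"


print(device_name("0"))    # cuda:0
print(device_name("-1"))   # cpu
print(device_name("0,1"))  # cuda:0
```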

  3. The output images are stored at ./results/[type]_pretrained/ by default. You can view them using the autogenerated HTML file in the directory.

Training New Models

New models can be trained with the following commands.

  1. Prepare dataset.

  2. Train.

For NTU dataset:

export CUDA_VISIBLE_DEVICES=3,4;
python train.py --dataroot ./datasets/ntu \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

For Senz3D dataset:

export CUDA_VISIBLE_DEVICES=5,7;
python train.py --dataroot ./datasets/senz3d \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0,1 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip \
	--lambda_L1 800 \
	--cyc_L1 0.1 \
	--lambda_identity 0.01 \
	--lambda_feat 1000 \
	--display_id 0 \
	--niter 10 \
	--niter_decay 10

There are many options you can specify. Please use python train.py --help. The specified options are printed to the console. To specify the number of GPUs to utilize, use export CUDA_VISIBLE_DEVICES=[GPU_ID].
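
The training commands set CUDA_VISIBLE_DEVICES=3,4 (or 5,7) yet pass --gpu_ids 0,1. This works because CUDA renumbers the visible devices from 0 inside the process. The sketch below shows that mapping; `logical_to_physical` is a hypothetical helper written for illustration, and it assumes the variable holds plain integer indices rather than device UUIDs.

```python
from typing import Dict


def logical_to_physical(cuda_visible_devices: str) -> Dict[int, int]:
    """Map the ids a process sees (0, 1, ...) to the physical GPU indices."""
    physical = [int(tok) for tok in cuda_visible_devices.split(",") if tok.strip()]
    return {logical: phys for logical, phys in enumerate(physical)}


print(logical_to_physical("3,4"))  # {0: 3, 1: 4}
print(logical_to_physical("5,7"))  # {0: 5, 1: 7}
```

In other words, --gpu_ids should always count from 0 up to the number of visible devices minus one, regardless of which physical GPUs were exported.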

To view training results and loss plots on a local computer, set --display_id to a non-zero value, run python -m visdom.server in a new terminal, and open the URL http://localhost:8097. On a remote server, replace localhost with your server's name, such as http://server.trento.cs.edu:8097.

Can I continue/resume my training?

To fine-tune a pre-trained model, or to resume a previous training run, use the --continue_train, --which_epoch and --epoch_count flags. With --continue_train, the program loads the model from the epoch you set in --which_epoch. Set --epoch_count to specify a different starting epoch count.
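
To make the role of --epoch_count concrete, the sketch below lists the epochs a run would visit. It assumes a pix2pix-style training loop that counts from the starting epoch up to and including niter + niter_decay; this is an assumption about the loop, not something confirmed by this README, and `epochs_to_run` is a hypothetical helper.

```python
def epochs_to_run(epoch_count: int, niter: int, niter_decay: int) -> range:
    """Epoch numbers a run would visit, assuming a pix2pix-style loop.

    Assumption: epochs run from `epoch_count` up to and including
    `niter + niter_decay`.
    """
    return range(epoch_count, niter + niter_decay + 1)


# A fresh run with the values used in the training commands (--niter 10 --niter_decay 10).
fresh = epochs_to_run(epoch_count=1, niter=10, niter_decay=10)
# A run resumed after epoch 10, i.e. started with --epoch_count 11.
resumed = epochs_to_run(epoch_count=11, niter=10, niter_decay=10)

print(list(fresh)[:3], len(fresh))      # [1, 2, 3] 20
print(list(resumed)[:3], len(resumed))  # [11, 12, 13] 10
```

Under this assumption, resuming from a checkpoint saved at epoch 10 with --epoch_count 11 continues the numbering instead of restarting at 1.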

Testing

Testing is similar to testing pretrained models.

For NTU dataset:

python test.py --dataroot [path_to_NTU_dataset] \
	--name ntu_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

For Senz3D dataset:

python test.py --dataroot [path_to_Senz3D_dataset] \
	--name senz3d_asymmetricgan \
	--model asymmetricgan \
	--which_model_netG resnet_9blocks \
	--which_direction AtoB \
	--dataset_mode aligned \
	--norm instance \
	--gpu_ids 0 \
	--ngf_t 64 \
	--ngf_r 4 \
	--batchSize 4 \
	--loadSize 286 \
	--fineSize 256 \
	--no_flip

Use --how_many to specify the maximum number of images to generate. By default, it loads the latest checkpoint. It can be changed using --which_epoch.

Code Structure

  • train.py, test.py: the entry point for training and testing.
  • models/asymmetricgan_model.py: creates the networks and computes the losses.
  • models/networks/: defines the architecture of all models for AsymmetricGAN.
  • options/: creates option lists using argparse package.
  • data/: defines the class for loading images and controllable structures.

Evaluation Code

We use several metrics to evaluate the quality of the generated images:

To Do List

  • Upload supervised AsymmetricGAN code for hand gesture-to-gesture translation
  • Upload unsupervised AsymmetricGAN code for multi-domain image-to-image translation: code

Citation

If you use this code for your research, please cite our papers.

@article{tang2019asymmetric,
  title={Asymmetric Generative Adversarial Networks for Image-to-Image Translation},
  author={Hao Tang and Dan Xu and Hong Liu and Nicu Sebe},
  journal={arXiv preprint arXiv:1912.06931},
  year={2019}
}

@inproceedings{tang2018dual,
  title={Dual Generator Generative Adversarial Networks for Multi-Domain Image-to-Image Translation},
  author={Tang, Hao and Xu, Dan and Wang, Wei and Yan, Yan and Sebe, Nicu},
  booktitle={ACCV},
  year={2018}
}

Acknowledgments

This source code is inspired by Pix2pix and GestureGAN.

Related Projects

Contributions

If you have any questions, comments, or bug reports, feel free to open a GitHub issue or a pull request, or e-mail the author, Hao Tang ([email protected]).
