A TensorFlow/Keras implementation of StyleGAN to generate images of new Pokémon.

Last update: Jul 26, 2022

Overview

PokeGAN

Dataset

The model was trained on a dataset of 819 Pokémon images.
You can download the dataset from this Kaggle link.

Dependencies

I used the following versions for this code:

python==3.8.8
tensorflow==2.4.1
tensorflow-gpu==2.4.1
numpy==1.19.1
h5py==2.10.0
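
To check whether a local environment matches the versions listed above, something like the following could be used. This is only a sketch: the `PINNED` dictionary and the `find_mismatches` helper are names introduced here for illustration and are not part of the repository.

```python
from importlib import metadata

# Versions copied from the Dependencies list above.
PINNED = {
    "tensorflow": "2.4.1",
    "tensorflow-gpu": "2.4.1",
    "numpy": "1.19.1",
    "h5py": "2.10.0",
}


def find_mismatches(pinned, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `get_version` is injectable so the check can be exercised without
    the packages actually being installed.
    """
    mismatches = {}
    for name, expected in pinned.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            mismatches[name] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for name, (expected, installed) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {installed or 'not installed'}")
```

Other versions may well work; the check only reports where an environment differs from the one used here.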

Note

There are several difficulties in generating Pokémon with a GAN:

GAN training is notoriously difficult; changing a single hyperparameter can greatly change the results.
The dataset is too small: 819 different Pokémon images are not enough. For this reason, I applied data augmentation, using the following transformations:

img_transf = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.RandomContrast(factor=(0.05, 0.15)),
    image_aug.RandomBrightness(brightness_delta=(-0.15, 0.15)),
    image_aug.PowerLawTransform(gamma=(0.8, 1.2)),
    image_aug.RandomSaturation(sat=(0, 2)),
    image_aug.RandomHue(hue=(0, 0.15)),
    tf.keras.layers.experimental.preprocessing.RandomFlip("horizontal"),
    tf.keras.layers.experimental.preprocessing.RandomTranslation(height_factor=(-0.10, 0.10), width_factor=(-0.10, 0.10)),
    tf.keras.layers.experimental.preprocessing.RandomZoom(height_factor=(-0.10, 0.10), width_factor=(-0.10, 0.10)),
    tf.keras.layers.experimental.preprocessing.RandomRotation(factor=(-0.10, 0.10)),
])
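
The `image_aug` layers in the pipeline above are not shown in this README. As an illustration of what a power-law (gamma) transform usually does, here is a minimal NumPy sketch. It assumes float images scaled to [0, 1] and a gamma drawn uniformly from the given range; the function names are hypothetical and this is not the actual `image_aug.PowerLawTransform` layer.

```python
import numpy as np


def power_law_transform(image, gamma):
    """Apply out = image ** gamma to a float image with values in [0, 1]."""
    image = np.clip(np.asarray(image, dtype=np.float32), 0.0, 1.0)
    return np.power(image, gamma)


def random_power_law(image, gamma_range=(0.8, 1.2), rng=None):
    """Draw gamma uniformly from gamma_range and apply the transform.

    With inputs in [0, 1], gamma < 1 brightens mid-tones and
    gamma > 1 darkens them; pure black and pure white are unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    gamma = rng.uniform(*gamma_range)
    return power_law_transform(image, gamma)


# Usage on a tiny 2x2 grayscale image.
image = np.array([[0.0, 0.25], [0.5, 1.0]], dtype=np.float32)
augmented = random_power_law(image, rng=np.random.default_rng(0))
```

Because the values stay in [0, 1], the transform changes tonal balance without clipping, which is presumably why a narrow range around 1.0 is used here.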

StyleGAN training is very expensive. I trained the model starting from a 4x4 resolution up to the final resolution of 256x256. Training took 8 days on a Tesla V100 32GB SXM2.
To get better results, you need to use higher resolutions and train for longer.
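
To make the 4x4 to 256x256 schedule concrete, the sketch below lists the resolutions visited. It assumes the resolution doubles at each stage, as is usual for progressively grown GANs; the README does not state the exact schedule, and `resolution_schedule` is a hypothetical helper.

```python
def resolution_schedule(start=4, final=256):
    """List the square resolutions visited when doubling from start to final."""
    if start <= 0 or final < start:
        raise ValueError("need 0 < start <= final")
    resolutions = []
    res = start
    while res <= final:
        resolutions.append(res)
        res *= 2  # assumption: each stage doubles the resolution
    if resolutions[-1] != final:
        raise ValueError("final must be start times a power of two")
    return resolutions


print(resolution_schedule())  # [4, 8, 16, 32, 64, 128, 256]
```

Under this assumption there are seven stages, and each additional doubling (for example to 512x512) quadruples the number of pixels per image, which is part of why higher resolutions cost so much more to train.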

Results

These are some examples of new Pokémon generated by the model:

New Generated Pokémon

More results

You can see hundreds of new Pokémon here.
To repeat: getting better results (finer detail in the Pokémon) requires training for longer.

References

This implementation is inspired by the unofficial Keras implementation of StyleGAN.
