CharacterGAN: Few-Shot Keypoint Character Animation and Reposing

Overview

CharacterGAN

Implementation of the paper "CharacterGAN: Few-Shot Keypoint Character Animation and Reposing" by Tobias Hinz, Matthew Fisher, Oliver Wang, Eli Shechtman, and Stefan Wermter (open with Adobe Acrobat or similar to see visualizations).

Supplementary material can be found here.

Our model can be trained on only a few images (e.g. 10) of a given character labeled with user-chosen keypoints. The resulting model can be used to animate the character on which it was trained by interpolating between its poses specified by their keypoints. We can also repose characters by simply moving the keypoints into the desired positions. To train the model all we need are few images depicting the character in diverse poses from the same viewpoint, keypoints, a file that describes how the keypoints are connected (the characters skeleton) and which keypoints lie in the same layer.

Examples

Animation: For all examples the model was trained on 8-15 images (see first row) of the given character.

Training Images 12 15 9 12 15 15 8
Animation dog_animation maddy_animation ostrich_animation man_animation robot_animation man_animation cow_animation



Frame interpolation: Example of interpolations between two poses with the start and end keypoints highlighted.

man man man man man man man man man man man man man
dog dog dog dog dog dog dog dog dog dog dog dog dog



Reposing: You can use our interactive GUI to easily repose a given character based on keypoints.

Interactive dog_gui man_gui
Gui cow_gui man_gui

Installation

  • python 3.8
  • pytorch 1.7.1
pip install -r requirements.txt

Training

Training Data

All training data for a given character should be in a single folder. We used this website to label our images but there are of course other possibilities.

The folder should contain:

  • all training images (all in the same resolution),
  • a file called keypoints.csv (containing the keypoints for each image),
  • a file called keypoints_skeleton.csv (containing skeleton information, i.e. how keypoints are connected with each other), and
  • a file called keypoints_layers.csv (containing the information about which layer each keypoint resides in).

The structure of the keypoints.csv file is (no header): keypoint_label,x_coord,y_coord,file_name. The first column describes the keypoint label (e.g. head), the next two columns give the location of the keypoint, and the final column states which training image this keypoint belongs to.

The structure of the keypoints_skeleton.csv file is (no header): keypoint,connected_keypoint,connected_keypoint,.... The first column describes which keypoint we are describing in this line, the following columns describe which keypoints are connected to that keypoint (e.g. elbow, shoulder, hand would state that the elbow keypoint should be connected to the shoulder keypoint and the hand keypoint).

The structure of the keypoints_layers.csv file is (no header): keypoint,layer. "Keypoint" is the keypoint label (same as used in the previous two files) and "layer" is an integer value desribing which layer the keypoint resides in.

See our example training data in datasets for examples of both files.

We provide two examples (produced by Zuzana Studená) for training, located in datasets. Our other examples were trained on data from Adobe Stock or from Character Animator and I currently have no license to distribute them. You can purchase the Stock data here:

  • Man: we used all images
  • Dog: we used all images
  • Ostrich: we used the first nine images
  • Cow: we used the first eight images

There are also several websites where you can download Sprite sheets for free.

Train a Model

To train a model with the default parameters from our paper run:

python train.py --gpu_ids 0 --num_keypoints 14 --dataroot datasets/Watercolor-Man --fp16 --name Watercolor-Man

Training one model should take about 60 (FP16) to 90 (FP32) minutes on an NVIDIA GeForce GTX 2080Ti. You can usually use fewer iterations for training and still achieve good results (see next section).

Training Parameters

You can adjust several parameters at train time to possibly improve your results.

  • --name to change the name of the folder in which the results are stored (default is CharacterGAN-Timestamp)
  • --niter 4000 and --niter_decay 4000 to adjust the number of training steps (niter_decayis the number of training steps during which we reduce the learning rate linearly; default is 8000 for both, but you can get good results with fewer iterations)
  • --mask True --output_nc 4 to train with a mask
  • --skeleton False to train without skeleton information
  • --bkg_color 0 to set the background color of the training images to black (default is white, only important if you train with a mask)
  • --batch_size 10 to train with a different batch size (default is 5)

The file options/keypoints.py lets you modify/add/remove keypoints for your characters.

Results

The output is saved to checkpoints/ and we log the training process with Tensorboard. To monitor the progress go to the respective folder and run

 tensorboard --logdir .

Testing

At test time you can either use the model to animate the character or use our interactive GUI to change the position of individual keypoints.

Animate Character

To animate a character (or create interpolations between two images):

python animate_example.py --gpu_ids 0 --model_path checkpoints/Watercolor-Man-.../ --img_animation_list datasets/Watercolor-Man/animation_list.txt --dataroot datasets/Watercolor-Man

--img_animation_list points to a file that lists the images that should be used for animation. The file should contain one file name per line pointing to an image in dataroot. The model then generates an animation by interpolating between the images in the given order. See datasets/Watercolor-Man/animation_list.txt for an example.

You can add --draw_kps to visualize the keypoints in the animation. You can specifiy the gif parameters by setting --num_interpolations 10 and --fps 5. num_interpolations specifies how many images are generated between two real images (from img_animation_list), fps determines the frames per second of the generated gif.

Modify Individual Keypoints

To run the interactive GUI:

python visualizer.py --gpu_ids 0 --model_path checkpoints/Watercolor-Man-.../

Set --gpu_ids -1 to run the model on a CPU. You can also scale the images during visualization, e.g. use --scale 2.

Patch-based Refinement

We use this implementation to run the patch-based refinement step on our generated images. The easiest way to do this is to merge all your training images into a single large image file and use this image file as the style and source image.

Acknowledgements

Our implementation uses code from Pix2PixHD, the TPS augmentation from DeepSIM, and the patch-based refinement code from https://ebsynth.com/ (GitHub).

We would also like to thank Zuzana Studená who produced some of the artwork used in this work.

Citation

If you found this code useful please consider citing:

@article{hinz2021character,
    author    = {Hinz, Tobias and Fisher, Matthew and Wang, Oliver and Shechtman, Eli and Wermter, Stefan},
    title     = {CharacterGAN: Few-Shot Keypoint Character Animation and Reposing},
    journal = {arXiv preprint arXiv:2102.03141},
    year      = {2021}
}
Owner
Tobias Hinz
Research Associate at University of Hamburg
Tobias Hinz
A keras implementation of ENet (abandoned for the foreseeable future)

ENet-keras This is an implementation of ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation, ported from ENet-training (lua-t

Pavlos 115 Nov 23, 2021
My usage of Real-ESRGAN to upscale anime, some test and results in the test_img folder

anime upscaler My usage of Real-ESRGAN to upscale anime, I hope to use this on a proper GPU cuz doing this on CPU is completely shit 😂 , I even tried

Shangar Muhunthan 29 Jan 07, 2023
Official codebase for Decision Transformer: Reinforcement Learning via Sequence Modeling.

Decision Transformer Lili Chen*, Kevin Lu*, Aravind Rajeswaran, Kimin Lee, Aditya Grover, Michael Laskin, Pieter Abbeel, Aravind Srinivas†, and Igor M

Kevin Lu 1.4k Jan 07, 2023
[ICCV 2021] Official PyTorch implementation for Deep Relational Metric Learning.

Ranking Models in Unlabeled New Environments Prerequisites This code uses the following libraries Python 3.7 NumPy PyTorch 1.7.0 + torchivision 0.8.1

Borui Zhang 39 Dec 10, 2022
Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Leaderboard, taxonomy, and curated list of few-shot object detection papers.

Gabriel Huang 70 Jan 07, 2023
PyTorch reimplementation of Diffusion Models

PyTorch pretrained Diffusion Models A PyTorch reimplementation of Denoising Diffusion Probabilistic Models with checkpoints converted from the author'

Patrick Esser 265 Jan 01, 2023
source code of “Visual Saliency Transformer” (ICCV2021)

Visual Saliency Transformer (VST) source code for our ICCV 2021 paper “Visual Saliency Transformer” by Nian Liu, Ni Zhang, Kaiyuan Wan, Junwei Han, an

89 Dec 21, 2022
SEJE Pytorch implementation

SEJE is a prototype for the paper Learning Text-Image Joint Embedding for Efficient Cross-Modal Retrieval with Deep Feature Engineering. Contents Inst

0 Oct 21, 2021
Privacy-Preserving Portrait Matting [ACM MM-21]

Privacy-Preserving Portrait Matting [ACM MM-21] This is the official repository of the paper Privacy-Preserving Portrait Matting. Jizhizi Li∗, Sihan M

Jizhizi_Li 212 Dec 27, 2022
Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes

Naive-Bayes Predict Breast Cancer Wisconsin (Diagnostic) using Naive Bayes Downloading Data Set Use our Breast Cancer Wisconsin Data Set Also you can

Faeze Habibi 0 Apr 06, 2022
Official code for the CVPR 2021 paper "How Well Do Self-Supervised Models Transfer?"

How Well Do Self-Supervised Models Transfer? This repository hosts the code for the experiments in the CVPR 2021 paper How Well Do Self-Supervised Mod

Linus Ericsson 157 Dec 16, 2022
[NeurIPS-2021] Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

MosaicKD Code for NeurIPS-21 paper "Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data" 1. Motivation Natural images share common l

ZJU-VIPA 37 Nov 10, 2022
基于Paddle框架的fcanet复现

fcanet-Paddle 基于Paddle框架的fcanet复现 fcanet 本项目基于paddlepaddle框架复现fcanet,并参加百度第三届论文复现赛,将在2021年5月15日比赛完后提供AIStudio链接~敬请期待 参考项目: frazerlin-fcanet 数据准备 本项目已挂

QuanHao Guo 7 Mar 07, 2022
Keras attention models including botnet,CoaT,CoAtNet,CMT,cotnet,halonet,resnest,resnext,resnetd,volo,mlp-mixer,resmlp,gmlp,levit

Keras_cv_attention_models Keras_cv_attention_models Usage Basic Usage Layers Model surgery AotNet ResNetD ResNeXt ResNetQ BotNet VOLO ResNeSt HaloNet

319 Dec 28, 2022
A Tensorflow implementation of BicycleGAN.

BicycleGAN implementation in Tensorflow As part of the implementation series of Joseph Lim's group at USC, our motivation is to accelerate (or sometim

Cognitive Learning for Vision and Robotics (CLVR) lab @ USC 97 Dec 02, 2022
Official repo for our 3DV 2021 paper "Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements".

Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements Yu Rong, Jingbo Wang, Ziwei Liu, Chen Change Loy Paper. Pr

Yu Rong 41 Dec 13, 2022
PyTorch implementation of MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

MoCo: Momentum Contrast for Unsupervised Visual Representation Learning This is a PyTorch implementation of the MoCo paper: @Article{he2019moco, aut

Meta Research 3.7k Jan 02, 2023
The official re-implementation of the Neurips 2021 paper, "Targeted Neural Dynamical Modeling".

Targeted Neural Dynamical Modeling Note: This is a re-implementation (in Tensorflow2) of the original TNDM model. We do not plan to further update the

6 Oct 05, 2022
Generative Adversarial Networks(GANs)

Generative Adversarial Networks(GANs) Vanilla GAN ClusterGAN Vanilla GAN Model Structure Final Generator Structure A MLP with 2 hidden layers of hidde

Zhenbang Feng 2 Nov 05, 2021