Anime Face Detector using mmdet and mmpose

Last update: Jan 07, 2023

Overview

Anime Face Detector

This is an anime face detector using mmdetection and mmpose.

(To avoid copyright issues, I use images generated by the TADNE model here.)

The model detects near-frontal anime faces and predicts 28 landmark points.

The result of k-means clustering of landmarks detected in real images:

The mean images of real images belonging to each cluster:
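The clustering procedure itself is not described here. Purely as an illustration, the sketch below shows one way such a clustering could be set up: each face's landmarks are normalized by its bounding box and flattened into a vector, and a plain k-means loop groups the vectors. The normalization, the helper names (normalize_landmarks, kmeans), and the random initialization are assumptions made for this example and are not the author's method. The input is assumed to use the 'bbox' / 'keypoints' layout that the detector returns.

```python
import random


def normalize_landmarks(pred):
    """Map landmarks into the unit square of the face box and flatten them.

    Assumes pred['bbox'] starts with [x0, y0, x1, y1] and each keypoint row
    is [x, y, score]; this layout is an assumption based on the sample output.
    """
    x0, y0, x1, y1 = (float(v) for v in pred['bbox'][:4])
    width = max(x1 - x0, 1e-6)   # guard against degenerate boxes
    height = max(y1 - y0, 1e-6)
    vector = []
    for x, y, _score in pred['keypoints']:
        vector.extend(((float(x) - x0) / width, (float(y) - y0) / height))
    return vector


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd-style k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    # Initialize centers from k distinct input vectors.
    centers = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest center for every vector.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

With a list of detector outputs, a call such as kmeans([normalize_landmarks(p) for p in preds], k=8) would return one cluster label per face; the value of k here is arbitrary.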

Installation

pip install openmim
mim install mmcv-full
mim install mmdet
mim install mmpose

pip install anime-face-detector

This package is tested only on Ubuntu.
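Because the detector depends on several separately installed packages, it can help to confirm which versions ended up in the environment. A minimal sketch using the standard-library importlib.metadata (Python 3.8 or later); the helper name installed_versions is hypothetical, and the package list simply mirrors the install commands above.

```python
from importlib import metadata

# Distribution names taken from the install commands above.
PACKAGES = ('openmim', 'mmcv-full', 'mmdet', 'mmpose', 'anime-face-detector')


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == '__main__':
    for name, version in installed_versions().items():
        print(f'{name}: {version or "not installed"}')
```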

Usage

import cv2

from anime_face_detector import create_detector

detector = create_detector('yolov3')
image = cv2.imread('assets/input.jpg')
preds = detector(image)
print(preds[0])

{'bbox': array([2.2450244e+03, 1.5940223e+03, 2.4116030e+03, 1.7458063e+03,
        9.9987185e-01], dtype=float32),
 'keypoints': array([[2.2593938e+03, 1.6680436e+03, 9.3236601e-01],
        [2.2825300e+03, 1.7051841e+03, 8.7208068e-01],
        [2.3412151e+03, 1.7281011e+03, 1.0052248e+00],
        [2.3941377e+03, 1.6825046e+03, 5.9705663e-01],
        [2.4039426e+03, 1.6541921e+03, 8.7139702e-01],
        [2.2625220e+03, 1.6330233e+03, 9.7608268e-01],
        [2.2804077e+03, 1.6408495e+03, 1.0021354e+00],
        [2.2969380e+03, 1.6494972e+03, 9.7812974e-01],
        [2.3357908e+03, 1.6453258e+03, 9.8418534e-01],
        [2.3475276e+03, 1.6355408e+03, 9.5060223e-01],
        [2.3612463e+03, 1.6262626e+03, 9.0553057e-01],
        [2.2682278e+03, 1.6631940e+03, 9.5465249e-01],
        [2.2814783e+03, 1.6616484e+03, 9.0782022e-01],
        [2.2987590e+03, 1.6692812e+03, 9.0256405e-01],
        [2.2833625e+03, 1.6879142e+03, 8.0303693e-01],
        [2.2934949e+03, 1.6909009e+03, 8.9718056e-01],
        [2.3021218e+03, 1.6863715e+03, 9.3882143e-01],
        [2.3471826e+03, 1.6636573e+03, 9.5727938e-01],
        [2.3677822e+03, 1.6540554e+03, 9.4890594e-01],
        [2.3889211e+03, 1.6611255e+03, 9.5125675e-01],
        [2.3575544e+03, 1.6800433e+03, 8.5919142e-01],
        [2.3688926e+03, 1.6800665e+03, 8.3275074e-01],
        [2.3804905e+03, 1.6761322e+03, 8.4160626e-01],
        [2.3165366e+03, 1.6947096e+03, 9.1840971e-01],
        [2.3282458e+03, 1.7104808e+03, 8.8045174e-01],
        [2.3380054e+03, 1.7114034e+03, 8.8357794e-01],
        [2.3485500e+03, 1.7080273e+03, 8.6284375e-01],
        [2.3378748e+03, 1.7118135e+03, 9.7880816e-01]], dtype=float32)}
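
Judging from the printed result, each prediction appears to hold a 'bbox' array of the form [x0, y0, x1, y1, score] and a 'keypoints' array with one [x, y, score] row per landmark. Under that assumption, the following sketch unpacks one prediction and keeps only the landmarks above a confidence threshold. The helper name summarize_prediction and the default threshold of 0.3 are illustrative choices, not part of the package.

```python
def summarize_prediction(pred, keypoint_threshold=0.3):
    """Unpack one prediction dict into plain Python values.

    Assumes 'bbox' is [x0, y0, x1, y1, score] and every row of 'keypoints'
    is [x, y, score]; works with NumPy arrays or nested lists.
    """
    x0, y0, x1, y1, box_score = (float(v) for v in pred['bbox'])
    # Keep only landmarks whose confidence reaches the threshold.
    landmarks = [
        (float(x), float(y))
        for x, y, score in pred['keypoints']
        if float(score) >= keypoint_threshold
    ]
    return {
        'box': (x0, y0, x1, y1),
        'box_score': box_score,
        'size': (x1 - x0, y1 - y0),  # box width and height in pixels
        'landmarks': landmarks,
    }
```

For example, summarize_prediction(preds[0])['landmarks'] would give a list of (x, y) tuples that can be passed to a drawing routine.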

Pretrained models

Here are the pretrained models. (They will be automatically downloaded when you use them.)

Demo (using Gradio)

Run locally

pip install gradio
git clone https://github.com/hysts/anime-face-detector
cd anime-face-detector

python demo_gradio.py

Links

General

Anime face detection

Anime face landmark detection

https://github.com/kanosawa/anime_face_landmark_detection

Others

Comments

How do you implement clustering of face landmarks?

Thank you for sharing this wonderful project. I am curious about how you implement the clustering of face landmarks. Can you describe that in detail? Or can you share some related papers or projects? Thanks in advance.

opened by Adenialzz 8

Citation Issue

Hi, @hysts

First of all, thank you so much for the great work!

I'm a graduate student and have used your pretrained model to generate landmark points as ground truth. I'm currently finishing up my thesis and want to cite your GitHub repo.

I don't know if I overlooked something, but I couldn't find citation information on the README page. Is there any way to cite this repo?

Thank you.

opened by zeachkstar 2

Colab notebook encounters a problem while installing dependencies

Hi, the Colab notebook looks broken. I used it about 2 weeks ago without any problem. In the dependency installation phase, when executing "mim install mmcv-full", Colab asks whether I want to replace the preinstalled newer version with an older version. I had to choose to install the older version to make the detector work.

I retried the Colab notebook yesterday. This time, when I again chose to replace the preinstalled v1.5.0 with v1.4.2, it got stuck at "building wheel for mmcv-full" for 20 minutes and then failed. When I chose not to replace the preinstalled version and skipped mmcv-full, the dependency installation phase completed without error. But when I ran the detector, I got the error "KeyError: 'center'".

Please help.

KeyError                                  Traceback (most recent call last)
<ipython-input-8-2cb6d21c10b9> in <module>()
     12 image = cv2.imread(input)
     13
---> 14 preds = detector(image)

6 frames
/content/anime-face-detector/anime_face_detector/detector.py in __call__(self, image_or_path, boxes)
    145                 boxes = [np.array([0, 0, w - 1, h - 1, 1])]
    146         box_list = [{'bbox': box} for box in boxes]
--> 147         return self._detect_landmarks(image, box_list)

/content/anime-face-detector/anime_face_detector/detector.py in _detect_landmarks(self, image, boxes)
    101             format='xyxy',
    102             dataset_info=self.dataset_info,
--> 103             return_heatmap=False)
    104         return preds
    105

/usr/local/lib/python3.7/dist-packages/mmcv/utils/misc.py in new_func(*args, **kwargs)
    338
    339             # apply converted arguments to the decorated method
--> 340             output = old_func(*args, **kwargs)
    341             return output
    342

/usr/local/lib/python3.7/dist-packages/mmpose/apis/inference.py in inference_top_down_pose_model(model, imgs_or_paths, person_results, bbox_thr, format, dataset, dataset_info, return_heatmap, outputs)
    385             dataset_info=dataset_info,
    386             return_heatmap=return_heatmap,
--> 387             use_multi_frames=use_multi_frames)
    388
    389         if return_heatmap:

/usr/local/lib/python3.7/dist-packages/mmpose/apis/inference.py in _inference_single_pose_model(model, imgs_or_paths, bboxes, dataset, dataset_info, return_heatmap, use_multi_frames)
    245                 data['image_file'] = imgs_or_paths
    246
--> 247         data = test_pipeline(data)
    248         batch_data.append(data)
    249

/usr/local/lib/python3.7/dist-packages/mmpose/datasets/pipelines/shared_transform.py in __call__(self, data)
    105         """
    106         for t in self.transforms:
--> 107             data = t(data)
    108             if data is None:
    109                 return None

/usr/local/lib/python3.7/dist-packages/mmpose/datasets/pipelines/top_down_transform.py in __call__(self, results)
    287         joints_3d = results['joints_3d']
    288         joints_3d_visible = results['joints_3d_visible']
--> 289         c = results['center']
    290         s = results['scale']
    291         r = results['rotation']

KeyError: 'center'

opened by zhongzishi 2

Question about the annotation tool for landmark

Thanks for your great work! May I ask which tool you used to annotate the landmarks? I find that the detector does not perform so well on manga images, so I want to manually annotate some manga images. Besides, when you trained the landmark detector, did you train the model from scratch or fine-tune a pretrained mmpose model?

opened by mrbulb 2

Question About Training Dataset

Thanks for your work! It’s very interesting!! May I ask you some questions? Did you manually annotate landmarks for the images generated by the TADNE model? And how many images does your training dataset include?

opened by GrayNiwako 2

How to implement anime face identification with this detector

Thanks for sharing such nice work! I was wondering whether it is possible to implement anime face identification based on this detector. Do you have any plans for this? Would we get good identification accuracy using this detector? Many thanks!

opened by rsindper 1

There is an error in demo.ipynb

First of all, thank you for sharing your program.

Today I tried to run the program in Google Colab and got the following error at the import anime_face_detector step. Do you know of any solutions?

Thank you.

ImportError                               Traceback (most recent call last)
in ()
      5 import numpy as np
      6
----> 7 import anime_face_detector

7 frames
/usr/lib/python3.7/importlib/__init__.py in import_module(name, package)
    125 break
    126 level += 1
--> 127 return _bootstrap._gcd_import(name[level:], package, level)
    128
    129

ImportError: /usr/local/lib/python3.7/dist-packages/mmcv/_ext.cpython-37m-x86_64-linux-gnu.so: undefined symbol: _ZNK3c1010TensorImpl36is_contiguous_nondefault_policy_implENS_12MemoryFormatE

opened by 283pm 1

Gradio demo on blocks organization

Hi, thanks for making a Gradio demo for this on Hugging Face (https://huggingface.co/spaces/hysts/anime-face-detector). It looks great with the new 3.0 design as well. Gradio has an event for the new Blocks API (https://huggingface.co/Gradio-Blocks). It would be great if you could join and make a Blocks version of this demo, or another demo. Thanks!

opened by AK391 1

Re-thinking anime (illustration/drawing/manga) character face detection

awesome work!

especially face clustering very neat

this work reminds me of

How can illustrations be aligned, and what can I do with these 2D landmarks?

Scaling and rotating images and crop: FFHQ aligned code and webtoon result

Artstation-Artistic-face-HQ which counts as Illustration Use FFHQ aligned

and new FFHQ aligned https://arxiv.org/abs/2109.09378

But anime illustrations are not the same as the real faces in FFHQ: perspective (pose) shifts the centre, and exaggerated local parts distort the global structure.

[DO.1] directly k-mean dictionary (run a dataset) proximity aligned

Mention this analysis

[DO.2] Because there are not many features to use, add continuous 2D spatial features (pred), more points, and even beyond

This needs a hacked model (might be proposed)

[DO.3] Should be used directly as a filter to assist with edge extraction (maximum reserve features)

guide VAE, SGF generation, or anime cross image Synthesis

If the purpose is not to train a generation model, the probable use is to extend the dataset. If training generation models, it will greatly affect the generated eye+chin centre-aligned visual lines, which don't keep real image features, just polylines.

Or need more key points in clustering, det box pts (easy [DO.4]), and beyond to the whole image

Thanks for reading this.

opened by koke2c95 1

Add polylines visualization and video test to the Colab demo.ipynb

result

MPEG-encoded video that can't play properly (transcoded)

https://user-images.githubusercontent.com/26929386/141799892-0b496ada-66b4-4349-ab72-49aae2317ce4.mp4

comments

not yet tested on gpu

cleared all output

didn't remove the function detect; just copied it from demo_gradio.py

the polylines visualization function can be simplified

the polylines visualization function can be customized (color, thickness, groups)

opened by koke2c95 1

Releases (v0.0.5)

v0.0.5 (Nov 15, 2021)

v0.0.4 (Nov 8, 2021)

v0.0.1 (Nov 3, 2021)

mean_pts.npy (576 bytes)
mmdet_anime-face_faster-rcnn.pth (158.04 MB)
mmdet_anime-face_yolov3.pth (235.04 MB)
mmpose_anime-face_hrnetv2.pth (37.51 MB)

