A light and fast one-class detection framework for edge devices. We provide a face detector, head detector, pedestrian detector, vehicle detector, and more.

Overview

A Light and Fast Face Detector for Edge Devices

Big News: LFD, a major update of LFFD, has been released (2021.03.09). We strongly recommend using LFD instead. Visit the LFD repo here. This repo will no longer be maintained.

Recent Update

  • 2019.07.25 This repo first went online. Face detection code and trained models are released.
  • 2019.08.15 This repo is formally released. Any advice and error reports are sincerely welcome.
  • 2019.08.22 face_detection: latency evaluation on TX2 is added.
  • 2019.08.25 face_detection: RetinaFace-MobileNet-0.25 is added for comparison (both accuracy and latency).
  • 2019.09.09 LFFD is ported to NCNN (link) and MNN (link) by SyGoing, great thanks to SyGoing.
  • 2019.09.10 face_detection: important bug fix: vibration offset should be subtracted by shift in the data iterator. This bug may result in lower accuracy, inaccurate bbox prediction and bbox vibration in the test phase. We will upgrade v1 and v2 as soon as possible (they should have higher accuracy and be more stable).
  • 2019.09.17 face_detection: model v2 is upgraded! After fixing the bug, we have fine-tuned the old v2 model. The accuracy on WIDER FACE is improved significantly! Please try new v2.
  • 2019.09.18 pedestrian_detection: preview version of model v1 for Caltech Pedestrian Dataset is released.
  • 2019.09.23 head_detection: model v1 for brainwash dataset is released.
  • 2019.10.02 license_plate_detection: model v1 for CCPD dataset is released. (The accuracy is very high and the latency is very low! Have a try.)
  • 2019.10.02 Currently, we have provided some application-oriented detectors. From now on, we will devote most of our effort to the next-generation framework for single-class detection. Any feedback is welcome.
  • 2019.10.16 face_detection: the preview of PyTorch version is ready (link). Any feedback is welcome.
  • 2019.10.16 Tips: data preparation is important; invalid values of (x,y,w,h) may introduce NaN in training. We also trained models with convs followed by BNs, but found that convergence was not stable and could not reach a good point.
  • 2019.11.08 face_detection: Caffe version of LFFD is provided by vicwer (great thanks). Those who are familiar with Caffe can navigate to /face_detection/caffemodel for details.
  • 2020.03.27 license_plate_detection: model v1_small for CCPD dataset is released. v1_small has far fewer parameters than v1, hence it is much faster. The AP of v1_small is 0.982 (vs. 0.989 for v1). Please check README.md. Besides, a commercial-ready license plate recognition repo which adopted LFFD as the detector is highly recommended!
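
The 2019.10.16 tip about (x,y,w,h) values can be turned into a pre-training sanity check. The sketch below is only an illustration and is not part of this repo: the function names, the `min_size` threshold and the rule that a box must overlap the image are all assumptions.

```python
import math


def _is_finite_number(value):
    """True if value can be read as a finite float (rejects NaN, inf, None, text)."""
    try:
        return math.isfinite(float(value))
    except (TypeError, ValueError):
        return False


def is_valid_bbox(x, y, w, h, image_width, image_height, min_size=1.0):
    """Hypothetical check for one (x, y, w, h) annotation in pixel units."""
    if not all(_is_finite_number(v) for v in (x, y, w, h)):
        return False
    # Zero or negative sizes are a typical source of NaN in loss computation.
    if w < min_size or h < min_size:
        return False
    # Assumed rule: the box has to overlap the image at least partially.
    if x >= image_width or y >= image_height:
        return False
    if x + w <= 0 or y + h <= 0:
        return False
    return True


def filter_bboxes(bboxes, image_width, image_height):
    """Return (kept_boxes, number_of_dropped_boxes) for one image."""
    kept = [b for b in bboxes if is_valid_bbox(*b, image_width, image_height)]
    return kept, len(bboxes) - len(kept)
```

Running `filter_bboxes` over every image before building the training set and logging the dropped count is usually enough to spot a broken annotation file early.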

Introduction

This repo releases the source code of the paper "LFFD: A Light and Fast Face Detector for Edge Devices". Our paper presents a light and fast face detector (LFFD) for edge devices. LFFD balances accuracy and latency well, yielding a small model size and fast inference speed while achieving excellent accuracy. Understanding the essence of the receptive field makes detection networks interpretable.

In practice, we have deployed it in the cloud and on edge devices (such as the NVIDIA Jetson series and ARM-based embedded systems). The overall performance of LFFD is robust enough to support our applications.

In fact, our method is a general detection framework that is applicable to one-class detection, such as face detection, pedestrian detection, head detection, vehicle detection and so on. In general, our framework suits any object class whose average ratio of longer side to shorter side is less than 5.
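
The side-ratio rule of thumb can be checked on a dataset before training. The snippet below is a minimal sketch; the function names and the `(w, h)` input format are assumptions made for illustration and do not come from this repo.

```python
def side_ratio(w, h):
    """Ratio of the longer side to the shorter side of one box."""
    longer, shorter = max(w, h), min(w, h)
    if shorter <= 0:
        raise ValueError("box sides must be positive")
    return longer / shorter


def average_side_ratio(boxes):
    """Mean longer/shorter ratio over an iterable of (w, h) pairs."""
    ratios = [side_ratio(w, h) for (w, h) in boxes]
    if not ratios:
        raise ValueError("no boxes given")
    return sum(ratios) / len(ratios)


def fits_framework(boxes, max_ratio=5.0):
    """Apply the 'average ratio less than 5' rule stated above."""
    return average_side_ratio(boxes) < max_ratio
```

For example, face-like boxes such as `[(40, 50), (30, 30)]` average 1.125 and pass, while pole-like boxes such as `[(10, 100), (8, 64)]` average 9.0 and do not.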

Several practical advantages:

  1. Large scale coverage, easily extended to larger scales by adding more layers without much added latency.
  2. Detects small objects (as small as 10 pixels) in images of extremely large resolution (8K or even larger) in a single inference.
  3. A simple backbone built from very common operators makes it easy to deploy anywhere.
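
The first advantage follows from how the receptive field grows with depth. The sketch below uses the standard recurrence for a plain stack of convolutions (no dilation); the layer lists in the usage note are hypothetical and are not the LFFD backbone, whose actual configuration is described in the paper.

```python
def receptive_field(layers):
    """Receptive field and cumulative stride of a stack of conv layers.

    layers: sequence of (kernel_size, stride) pairs, first layer first.
    Returns (receptive_field, cumulative_stride), both in input pixels.
    """
    rf = 1    # one input pixel sees itself
    jump = 1  # distance in input pixels between neighbouring outputs
    for kernel_size, stride in layers:
        # Each layer widens the field by (k - 1) steps of the current jump.
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf, jump
```

Three 3×3 stride-1 layers give `(7, 1)`, while `[(3, 2), (3, 1), (3, 1)]` gives `(11, 2)`. Appending one more `(3, 2)` layer to the latter raises the receptive field to 15, which illustrates how extra layers extend the range of object scales a branch can cover.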

Accuracy and Latency

We train LFFD on the train set of the WIDER FACE benchmark. All methods are evaluated on the val/test sets under the SIO schema (please refer to the paper for details).

  • Accuracy on the val set of WIDER FACE (values in parentheses are the results reported in the original papers):

| Method | Easy Set | Medium Set | Hard Set |
|---|---|---|---|
| DSFD | 0.949 (0.966) | 0.936 (0.957) | 0.850 (0.904) |
| PyramidBox | 0.937 (0.961) | 0.927 (0.950) | 0.867 (0.889) |
| S3FD | 0.923 (0.937) | 0.907 (0.924) | 0.822 (0.852) |
| SSH | 0.921 (0.931) | 0.907 (0.921) | 0.702 (0.845) |
| FaceBoxes | 0.840 | 0.766 | 0.395 |
| FaceBoxes3.2× | 0.798 | 0.802 | 0.715 |
| LFFD | 0.910 | 0.881 | 0.780 |

  • Accuracy on the test set of WIDER FACE (values in parentheses are the results reported in the original papers):

| Method | Easy Set | Medium Set | Hard Set |
|---|---|---|---|
| DSFD | 0.947 (0.960) | 0.934 (0.953) | 0.845 (0.900) |
| PyramidBox | 0.926 (0.956) | 0.920 (0.946) | 0.862 (0.887) |
| S3FD | 0.917 (0.928) | 0.904 (0.913) | 0.821 (0.840) |
| SSH | 0.919 (0.927) | 0.903 (0.915) | 0.705 (0.844) |
| FaceBoxes | 0.839 | 0.763 | 0.396 |
| FaceBoxes3.2× | 0.791 | 0.794 | 0.715 |
| LFFD | 0.896 | 0.865 | 0.770 |

  • Accuracy on FDDB:

| Method | Disc ROC curves score |
|---|---|
| DSFD | 0.984 |
| PyramidBox | 0.982 |
| S3FD | 0.981 |
| SSH | 0.977 |
| FaceBoxes3.2× | 0.905 |
| FaceBoxes | 0.960 |
| LFFD | 0.973 |

In the paper, three hardware platforms are used for latency evaluation: NVIDIA GTX TITAN Xp, NVIDIA TX2 and Raspberry Pi 3 Model B+ (ARM A53).

We report the latency of inference only, excluding pre-processing and post-processing (for NVIDIA hardware, data transfer is included). The batch size is set to 1 for all evaluations.
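
A measurement of this kind can be sketched as follows. This is not the benchmarking script used for the paper: `measure_latency_ms`, its `warmup` and `runs` parameters, and the `infer_once` callable are hypothetical names introduced here.

```python
import time


def latency_to_fps(latency_ms):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def measure_latency_ms(infer_once, warmup=5, runs=20):
    """Average wall-clock time of infer_once() in milliseconds.

    infer_once is assumed to run exactly one forward pass at batch size 1,
    with pre-processing and post-processing done outside of it. With an
    asynchronous framework such as MXNet it should block until the output
    is ready (for example by calling wait_to_read() or asnumpy() on it),
    otherwise only the time to enqueue the work is measured.
    """
    for _ in range(warmup):
        infer_once()  # warm-up runs are not timed
    start = time.perf_counter()
    for _ in range(runs):
        infer_once()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs
```

The FPS values reported alongside each latency are its reciprocal: a latency of 87.79 ms corresponds to `latency_to_fps(87.79)`, about 11.39 FPS.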

  • Latency on NVIDIA GTX TITAN Xp (MXNet + CUDA 9.0 + cuDNN 7.1):

| Method / Resolution | 640×480 | 1280×720 | 1920×1080 | 3840×2160 |
|---|---|---|---|---|
| DSFD | 78.08ms (12.81 FPS) | 187.78ms (5.33 FPS) | 392.82ms (2.55 FPS) | 1562.50ms (0.64 FPS) |
| PyramidBox | 50.51ms (19.08 FPS) | 143.34ms (6.98 FPS) | 331.93ms (3.01 FPS) | 1344.07ms (0.74 FPS) |
| S3FD | 21.75ms (45.95 FPS) | 55.73ms (17.94 FPS) | 119.53ms (8.37 FPS) | 471.31ms (2.21 FPS) |
| SSH | 22.44ms (44.47 FPS) | 55.29ms (18.09 FPS) | 118.43ms (8.44 FPS) | 463.10ms (2.16 FPS) |
| FaceBoxes3.2× | 6.80ms (147.00 FPS) | 12.96ms (77.19 FPS) | 25.37ms (39.41 FPS) | 111.98ms (8.93 FPS) |
| LFFD | 7.60ms (131.40 FPS) | 16.37ms (61.07 FPS) | 31.27ms (31.98 FPS) | 87.79ms (11.39 FPS) |

  • Latency on NVIDIA TX2 (MXNet + CUDA 9.0 + cuDNN 7.1) presented in the paper:

| Method / Resolution | 160×120 | 320×240 | 640×480 |
|---|---|---|---|
| FaceBoxes3.2× | 11.20ms (89.29 FPS) | 19.62ms (50.97 FPS) | 72.74ms (13.75 FPS) |
| LFFD | 7.30ms (136.99 FPS) | 19.64ms (50.92 FPS) | 64.70ms (15.46 FPS) |

  • Latency on Raspberry Pi 3 Model B+ (ncnn) presented in the paper:

| Method / Resolution | 160×120 | 320×240 | 640×480 |
|---|---|---|---|
| FaceBoxes3.2× | 167.20ms (5.98 FPS) | 686.19ms (1.46 FPS) | 3232.26ms (0.31 FPS) |
| LFFD | 118.45ms (8.44 FPS) | 409.19ms (2.44 FPS) | 4114.15ms (0.24 FPS) |

On NVIDIA platforms, TensorRT is the best choice for inference, so we conduct additional latency evaluations using TensorRT (the latency decreases dramatically). For ARM-based platforms, we plan to use MNN and Tengine for latency evaluation. Details can be found in the sub-project face_detection.

Getting Started

We implement the proposed method using the MXNet Module API.

Prerequisites (global)

  • Python>=3.5
  • numpy>=1.16 (lower versions should work as well, but are not tested)
  • MXNet>=1.4.1 (install guide)
  • cv2=3.x (pip3 install opencv-python==3.4.5.20; other versions should work as well, but are not tested)
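
The versions listed above can be checked with a short script before training. The helper below is a sketch, not a tool shipped with this repo: the names `MIN_VERSIONS`, `parse_version` and `check_environment` are made up here, and treating cv2 3.x as a minimum rather than an exact requirement is an assumption.

```python
import importlib
import sys

# Minimum versions taken from the list above (cv2 3.x treated as a lower bound).
MIN_VERSIONS = {"numpy": (1, 16), "mxnet": (1, 4, 1), "cv2": (3,)}


def parse_version(text):
    """Turn '1.16.0rc1' into (1, 16, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def check_environment(min_versions=MIN_VERSIONS):
    """Return {name: bool} telling whether each requirement looks satisfied."""
    report = {"python": sys.version_info[:2] >= (3, 5)}
    for name, minimum in min_versions.items():
        try:
            module = importlib.import_module(name)
        except Exception:
            # Any import failure (missing or broken install) counts as not satisfied.
            report[name] = False
            continue
        found = parse_version(getattr(module, "__version__", ""))
        report[name] = found >= minimum
    return report
```

Calling `check_environment()` and printing the result gives a quick overview of which packages are missing or older than the listed versions.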

Tips:

  • Use MXNet with cuDNN.
  • Build numpy from source with OpenBLAS. This will improve the training efficiency.
  • Make sure cv2 links to libjpeg-turbo, not libjpeg. This will improve the JPEG decoding efficiency.
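
One way to inspect the JPEG backend is to read the text returned by `cv2.getBuildInformation()`. The sketch below is a heuristic under stated assumptions: it assumes the dump contains a line starting with `JPEG:`, and the helper names are invented for this example. A build linked against a system libjpeg-turbo may show only a `libjpeg.so` path, so a negative result is not conclusive.

```python
def jpeg_backend_line(build_info):
    """Return the text after 'JPEG:' in an OpenCV build-information dump, or None."""
    for line in build_info.splitlines():
        stripped = line.strip()
        # 'JPEG 2000:' is a different entry and does not match this prefix.
        if stripped.startswith("JPEG:"):
            return stripped[len("JPEG:"):].strip()
    return None


def mentions_libjpeg_turbo(build_info):
    """Heuristic: True if the JPEG entry names libjpeg-turbo explicitly."""
    backend = jpeg_backend_line(build_info)
    return backend is not None and "turbo" in backend.lower()


def report_cv2_jpeg_backend():
    """Return the JPEG entry of the installed cv2, or None if cv2 is unavailable."""
    try:
        import cv2
    except ImportError:
        return None
    return jpeg_backend_line(cv2.getBuildInformation())
```

Printing `report_cv2_jpeg_backend()` on the target machine shows which library the installed build reports for JPEG.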

Sub-directory description

  • face_detection contains the code of training, evaluation and inference for LFFD, the main content of this repo. The trained models of different versions are provided for off-the-shelf deployment.
  • head_detection contains the trained models for head detection. The models are obtained by the proposed general one class detection framework.
  • pedestrian_detection contains the trained models for pedestrian detection. The models are obtained by the proposed general one class detection framework.
  • vehicle_detection contains the trained models for vehicle detection. The models are obtained by the proposed general one class detection framework.
  • ChasingTrainFramework_GeneralOneClassDetection is a simple wrapper based on MXNet Module API for general one class detection.

Installation

  1. Download the repo:
git clone https://github.com/YonghaoHe/A-Light-and-Fast-Face-Detector-for-Edge-Devices.git
  2. Refer to the corresponding sub-project for detailed usage.

Citation

If you benefit from our work in your research or product, please cite the paper:

@inproceedings{LFFD,
title={LFFD: A Light and Fast Face Detector for Edge Devices},
author={He, Yonghao and Xu, Dezhong and Wu, Lifang and Jian, Meng and Xiang, Shiming and Pan, Chunhong},
booktitle={arXiv:1904.10633},
year={2019}
}

To Do List

Contact

Yonghao He

E-mails: [email protected] / [email protected]

If you are interested in this work, any innovative contributions are welcome!!!

Internship is open at NLPR, CASIA all the time. Send me your resumes!
