Face detection using deep learning.

Overview

Face Detection Docker Solution Using Faster R-CNN



Dockerface is a deep learning face detector. It deploys a trained Faster R-CNN network on Caffe through an easy to use docker image. Bring your videos and images, run dockerface and obtain videos and images with bounding boxes of face detections and an easy to use face detection annotation text file.

The docker image is large for now because OpenCV has to be compiled and stored in the image to be able to use video and it takes up a lot of space.

Technical details and some experiments are described in the Arxiv Tech Report.

Citing Dockerface

If you find Dockerface useful in your research please consider citing:

@ARTICLE{2017arXiv170804370R,
   author = {{Ruiz}, N. and {Rehg}, J.~M.},
    title = "{Dockerface: an easy to install and use Faster R-CNN face detector in a Docker container}",
  journal = {ArXiv e-prints},
archivePrefix = "arXiv",
   eprint = {1708.04370},
 primaryClass = "cs.CV",
 keywords = {Computer Science - Computer Vision and Pattern Recognition},
     year = 2017,
    month = aug,
   adsurl = {http://adsabs.harvard.edu/abs/2017arXiv170804370R},
  adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}

Instructions

Install NVIDIA CUDA (8 - preferably) and cuDNN (v5 - preferably)

https://developer.nvidia.com/cuda-downloads
https://developer.nvidia.com/cudnn

Install docker

https://docs.docker.com/engine/installation/

Install nvidia-docker

wget -P /tmp https://github.com/NVIDIA/nvidia-docker/releases/download/v1.0.1/nvidia-docker_1.0.1-1_amd64.deb
sudo dpkg -i /tmp/nvidia-docker*.deb && rm /tmp/nvidia-docker*.deb

Go to your working folder and create a directory called data, your videos and images should go here. Also create a folder called output.

cd $WORKING_DIR
mkdir data
mkdir output

Run the docker container

sudo nvidia-docker run -it -v $PWD/data:/opt/py-faster-rcnn/edata -v $PWD/output/video:/opt/py-faster-rcnn/output/video -v $PWD/output/images:/opt/py-faster-rcnn/output/images natanielruiz/dockerface:latest

Now we have to recompile Caffe for it to work on your own machine.

cd caffe-fast-rcnn
rm -rf build
mkdir build
cd build
cmake -DUSE_CUDNN=1 ..
make -j20 && make pycaffe
cd ../..

Finally use this command to process a video

python tools/run_face_detection_on_video.py --gpu 0 --video edata/YOUR_VIDEO_FILENAME --output_string STRING_TO_BE_APPENDED_TO_OUTPUTFILE_NAME --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

Use this command to process an image

python tools/run_face_detection_on_image.py --gpu 0 --image edata/YOUR_IMAGE_FILENAME --output_string STRING_TO_BE_APPENDED_TO_OUTPUTFILE_NAME --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

Also if you are looking to conveniently process all images in one folder use this command

python tools/facedetection_images.py --gpu 0 --image_folder edata/IMAGE_FOLDER_NAME --output_folder OUTPUT_FOLDER_PATH --conf_thresh CONFIDENCE_THRESHOLD_FOR_DETECTIONS

The default confidence threshold is 0.85 which works for high quality videos or images where the faces are clearly visible. You can play around with this value.

The columns contained in the output text files are:

For videos:

frame_number x_min y_min x_max y_max confidence_score

For images:

image_path x_min y_min x_max y_max confidence_score

Where (x_min,y_min) denote the coordinates of the upper-left corner of the bounding box in image intrinsic coordinates and (x_max, y_max) denote the coordinates of the lower-right corner of the bounding box in image intrinsic coordinates. (ref. https://www.mathworks.com/help/images/image-coordinate-systems.html) confidence_score denotes the probability output of the model that the detection is correct (it is a number included in [0,1])

Voila, that easy!

After you're done with the docker container you can exit.

exit

You want to restart and re-attach to this same docker container so as to avoid compiling Caffe again. To do this first get the id for that container.

sudo docker ps -a

It should be the last one that was launched. Take note of CONTAINER ID. Then start and attach to that container.

sudo docker start CONTAINER_ID
sudo docker attach CONTAINER_ID

You can now continue processing videos.

Nataniel Ruiz and James M. Rehg
Georgia Institute of Technology

Credits: Original dockerface logo made by Freepik from Flaticon is licensed by Creative Commons BY 3.0, modified by Nataniel Ruiz.

Owner
Nataniel Ruiz
PhD candidate at Boston University doing Computer Vision and ML. M.S. from Georgia Tech, BA/M.S. from Ecole Polytechnique
Nataniel Ruiz
Statistical and Algorithmic Investing Strategies for Everyone

Eiten - Algorithmic Investing Strategies for Everyone Eiten is an open source toolkit by Tradytics that implements various statistical and algorithmic

Tradytics 2.5k Jan 02, 2023
DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe.

DeepLab Introduction DeepLab is a state-of-art deep learning system for semantic image segmentation built on top of Caffe. It combines densely-compute

Ali 234 Nov 14, 2022
Implementation of a memory efficient multi-head attention as proposed in the paper, "Self-attention Does Not Need O(n²) Memory"

Memory Efficient Attention Pytorch Implementation of a memory efficient multi-head attention as proposed in the paper, Self-attention Does Not Need O(

Phil Wang 180 Jan 05, 2023
Arquitetura e Desenho de Software.

S203 Este é um repositório dedicado às aulas de Arquitetura e Desenho de Software, cuja sigla é "S203". E agora, José? Como não tenho muito a falar aq

Fabio 7 Oct 23, 2021
Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Ceph.

Project Aquarium Project Aquarium is a SUSE-sponsored open source project aiming at becoming an easy to use, rock solid storage appliance based on Cep

Aquarist Labs 73 Jul 21, 2022
Metadata-Extractor - Metadata Extractor Script can be used to read in exif metadata

Metadata Extractor The exifextract script can be used to read in exif metadata f

1 Feb 16, 2022
Code for A Volumetric Transformer for Accurate 3D Tumor Segmentation

VT-UNet This repo contains the supported pytorch code and configuration files to reproduce 3D medical image segmentaion results of VT-UNet. Environmen

Himashi Amanda Peiris 114 Dec 20, 2022
Repo for code associated with Modeling the Mitral Valve.

Project Title Mitral Valve Getting Started Repo for code associated with Modeling the Mitral Valve. See https://arxiv.org/abs/1902.00018 for preprint,

Alex Kaiser 1 May 17, 2022
Nested Graph Neural Network (NGNN) is a general framework to improve a base GNN's expressive power and performance

Nested Graph Neural Networks About Nested Graph Neural Network (NGNN) is a general framework to improve a base GNN's expressive power and performance.

Muhan Zhang 38 Jan 05, 2023
Adversarial Texture Optimization from RGB-D Scans (CVPR 2020).

AdversarialTexture Adversarial Texture Optimization from RGB-D Scans (CVPR 2020). Scanning Data Download Please refer to data directory for details. B

Jingwei Huang 153 Nov 28, 2022
Apollo optimizer in tensorflow

Apollo Optimizer in Tensorflow 2.x Notes: Warmup is important with Apollo optimizer, so be sure to pass in a learning rate schedule vs. a constant lea

Evan Walters 1 Nov 09, 2021
A Keras implementation of CapsNet in the paper: Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules

NOTE This implementation is fork of https://github.com/XifengGuo/CapsNet-Keras , applied to IMDB texts reviews dataset. CapsNet-Keras A Keras implemen

Lauro Moraes 5 Oct 23, 2022
Continuous Time LiDAR odometry

CT-ICP: Elastic SLAM for LiDAR sensors This repository implements the SLAM CT-ICP (see our article), a lightweight, precise and versatile pure LiDAR o

385 Dec 29, 2022
torchlm is aims to build a high level pipeline for face landmarks detection, it supports training, evaluating, exporting, inference(Python/C++) and 100+ data augmentations

💎A high level pipeline for face landmarks detection, supports training, evaluating, exporting, inference and 100+ data augmentations, compatible with torchvision and albumentations, can easily instal

DefTruth 142 Dec 25, 2022
VSR-Transformer - This paper proposes a new Transformer for video super-resolution (called VSR-Transformer).

VSR-Transformer By Jiezhang Cao, Yawei Li, Kai Zhang, Luc Van Gool This paper proposes a new Transformer for video super-resolution (called VSR-Transf

Jiezhang Cao 225 Nov 13, 2022
Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit

STORM Stochastic Tensor Optimization for Robot Motion - A GPU Robot Motion Toolkit [Install Instructions] [Paper] [Website] This package contains code

NVIDIA Research Projects 101 Dec 12, 2022
An atmospheric growth and evolution model based on the EVo degassing model and FastChem 2.0

EVolve Linking planetary mantles to atmospheric chemistry through volcanism using EVo and FastChem. Overview EVolve is a linked mantle degassing and a

Pip Liggins 2 Jan 17, 2022
SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches

SketchEdit: Mask-Free Local Image Manipulation with Partial Sketches [Paper]  [Project Page]  [Interactive Demo]  [Supplementary Material]        Usag

215 Dec 25, 2022
Code for "Offline Meta-Reinforcement Learning with Advantage Weighting" [ICML 2021]

Offline Meta-Reinforcement Learning with Advantage Weighting (MACAW) MACAW code used for the experiments in the ICML 2021 paper. Installing the enviro

Eric Mitchell 28 Jan 01, 2023
PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)

Asym-Siam: On the Importance of Asymmetry for Siamese Representation Learning This is a PyTorch implementation of the Asym-Siam paper, CVPR 2022: @inp

Meta Research 89 Dec 18, 2022