Unified learning approach for egocentric hand gesture recognition and fingertip detection

Overview

Unified Gesture Recognition and Fingertip Detection

A unified convolutional neural network (CNN) algorithm for both hand gesture recognition and fingertip detection at the same time. The proposed algorithm uses a single network to predict both finger class probabilities for classification and fingertips positional output for regression in one evaluation. From the finger class probabilities, the gesture is recognized, and using both of the information fingertips are localized. Instead of directly regressing the fingertips position from the fully connected (FC) layer of the CNN, we regress the ensemble of fingertips position from a fully convolutional network (FCN) and subsequently take ensemble average to regress the final fingertips positional output.

Update

Included robust real-time hand detection using yolo for better smooth performance in the first stage of the detection system and most of the code has been cleaned and restructured for ease of use. To get the previous versions, please visit the release section.

GitHub stars GitHub forks GitHub issues Version GitHub license

Requirements

  • TensorFlow-GPU==2.2.0
  • OpenCV==4.2.0
  • ImgAug==0.2.6
  • Weights: Download the pre-trained weights files of the unified gesture recognition and fingertip detection model and put the weights folder in the working directory.

Downloads Downloads

The weights folder contains three weights files. The fingertip.h5 is for unified gesture recignition and finertiop detection. yolo.h5 and solo.h5 are for the yolo and solo method of hand detection. (what is solo?)

Paper

Paper Paper

To get more information about the proposed method and experiments, please go through the paper. Cite the paper as:

@article{alam2021unified,
title = {Unified learning approach for egocentric hand gesture recognition and fingertip detection},
author={Alam, Mohammad Mahmudul and Islam, Mohammad Tariqul and Rahman, SM Mahbubur},
journal = {Pattern Recognition},
volume = {121},
pages = {108200},
year = {2021},
publisher={Elsevier},
}

Dataset

The proposed gesture recognition and fingertip detection model is trained by employing Scut-Ego-Gesture Dataset which has a total of eleven different single hand gesture datasets. Among the eleven different gesture datasets, eight of them are considered for experimentation. A detailed explanation about the partition of the dataset along with the list of the images used in the training, validation, and the test set is provided in the dataset/ folder.

Network Architecture

To implement the algorithm, the following network architecture is proposed where a single CNN is utilized for both hand gesture recognition and fingertip detection.

Prediction

To get the prediction on a single image run the predict.py file. It will run the prediction in the sample image stored in the data/ folder. Here is the output for the sample.jpg image.

Real-Time!

To run in real-time simply clone the repository and download the weights file and then run the real-time.py file.

directory > python real-time.py

In real-time execution, there are two stages. In the first stage, the hand can be detected by using either you only look once (yolo) or single object localization (solo) algorithm. By default, yolo will be used here. The detected hand portion is then cropped and fed to the second stage for gesture recognition and fingertip detection.

Output

Here is the output of the unified gesture recognition and fingertip detection model for all of the 8 classes of the dataset where not only each fingertip is detected but also each finger is classified.

Comments
  • Datasets

    Datasets

    Hello, I have a question about the dataset from your readme, I can't download the Scut-Ego-Gesture Dataset ,Because in China, this website has been banned. Can you share it with me in other ways? For example, Google or QQ email: [email protected]

    opened by CVUsers 10
  • how to download the weights, code not contain?

    how to download the weights, code not contain?

    The weights folder contains three weights files. The comparison.h5 is for first five classes and performance.h5 is for first eight classes. solo.h5 is for hand detection. but no link

    opened by mmxuan18 6
  • OSError: Unable to open file (unable to open file: name = 'yolo.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    OSError: Unable to open file (unable to open file: name = 'yolo.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    I use the Mac Os to run thereal-time.py file, and get the OSError, I also search on Google to find others' the same problem. It is probably the Keras problem. But I do not how to solve it

    opened by Hanswanglin 4
  • OSError: Unable to open file (unable to open file: name = 'weights/performance.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    OSError: Unable to open file (unable to open file: name = 'weights/performance.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper File "h5py/h5f.pyx", line 88, in h5py.h5f.open OSError: Unable to open file (unable to open file: name = 'weights/performance.h5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)

    opened by Jasonmes 2
  • left hand?

    left hand?

    Hi, first it's really cool work!

    Is the left hand included in the training images? I have been playing around with some of my own images and it seems that it doesn't really recognize the left hand in a palm-down position...

    If I want to include the left hand, do you think it would be possible if I train the network with the image flipped?

    opened by myhjiang 1
  • why are there two hand detection provided?

    why are there two hand detection provided?

    A wonderful work!!As mentioned above, the Yolo and Solo detection models are provided. I wonder what is the advatange of each model comparing to the other and what is the dataset to train the detect.

    opened by DanielMao2015 1
  • Difference of classes5.h5 and classes8.h5

    Difference of classes5.h5 and classes8.h5

    Hi, May i know the difference when training classes5 and classes8? are the difference from the dataset used for training by excluding SingleSix, SingleSeven, SingleEight or there are other modification such as changing the model structure or parameters?

    Thanks

    opened by danieltanimanuel 1
  • Using old versions of tensorflow, can't install the dependencies on my macbook and with newer versions it's constatly failing.

    Using old versions of tensorflow, can't install the dependencies on my macbook and with newer versions it's constatly failing.

    When trying to install the required version of tensorflow:

    pip3 install tensorflow==1.15.0
    ERROR: Could not find a version that satisfies the requirement tensorflow==1.15.0 (from versions: 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.2.2, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1, 2.3.2, 2.4.0rc0, 2.4.0rc1, 2.4.0rc2, 2.4.0rc3, 2.4.0rc4, 2.4.0, 2.4.1)
    ERROR: No matching distribution found for tensorflow==1.15.0
    

    I even tried downloading the .whl file from the pypi and try manually installing it, but that didn't work too:

    pip3 install ~/Downloads/tensorflow-1.15.0-cp37-cp37m-macosx_10_11_x86_64.whl
    ERROR: tensorflow-1.15.0-cp37-cp37m-macosx_10_11_x86_64.whl is not a supported wheel on this platform.
    

    Tried with both python3.6 and python3.8

    So it would be great to update the dependencies :)

    opened by KoStard 1
  • Custom Model keyword arguments Error

    Custom Model keyword arguments Error

    Change model = Model(input=model.input, outputs=[probability, position]) to model = Model(inputs=model.input, outputs=[probability, position]) on line 22 of net/network.py

    opened by Rohit-Jain-2801 1
  • Problem of weights

    Problem of weights

    Hi,when load the solo.h5(In solo.py line 14:"self.model.load_weights(weights)") it will report errors: Process finished with exit code -1073741819 (0xC0000005) keras2.2.5+tensorflow1.14.0+cuda10.0

    opened by MC-E 1
Releases(v2.0)
Owner
Mohammad
Machine Learning | Graduate Research Assistant at CORAL Lab
Mohammad
Split Variational AutoEncoder

Split-VAE Split Variational AutoEncoder Introduction This repository contains and implemementation of a Split Variational AutoEncoder (SVAE). In a SVA

Andrea Asperti 2 Sep 02, 2022
PyTorch implementation of: Michieli U. and Zanuttigh P., "Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations", CVPR 2021.

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations This is the official PyTorch implementation

Multimedia Technology and Telecommunication Lab 42 Nov 09, 2022
The code release of paper 'Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization' NIPS 2020.

Domain Generalization for Medical Imaging Classification with Linear Dependency Regularization The code release of paper 'Domain Generalization for Me

Yufei Wang 56 Dec 28, 2022
Faster Convex Lipschitz Regression

Faster Convex Lipschitz Regression This reepository provides a python implementation of our Faster Convex Lipschitz Regression algorithm with GPU and

Ali Siahkamari 0 Nov 19, 2021
The Official PyTorch Implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 spotlight paper)

Official PyTorch implementation of "VAEBM: A Symbiosis between Variational Autoencoders and Energy-based Models" (ICLR 2021 Spotlight Paper) Zhisheng

NVIDIA Research Projects 45 Dec 26, 2022
Learning to Self-Train for Semi-Supervised Few-Shot

Learning to Self-Train for Semi-Supervised Few-Shot Classification This repository contains the TensorFlow implementation for NeurIPS 2019 Paper "Lear

86 Dec 29, 2022
Offline Reinforcement Learning with Implicit Q-Learning

Offline Reinforcement Learning with Implicit Q-Learning This repository contains the official implementation of Offline Reinforcement Learning with Im

Ilya Kostrikov 126 Jan 06, 2023
Public implementation of the Convolutional Motif Kernel Network (CMKN) architecture

CMKN Implementation of the convolutional motif kernel network (CMKN) introduced in Ditz et al., "Convolutional Motif Kernel Network", 2021. Testing Yo

1 Nov 17, 2021
NeuralTalk is a Python+numpy project for learning Multimodal Recurrent Neural Networks that describe images with sentences.

#NeuralTalk Warning: Deprecated. Hi there, this code is now quite old and inefficient, and now deprecated. I am leaving it on Github for educational p

Andrej 5.3k Jan 07, 2023
Code for the paper "A Study of Face Obfuscation in ImageNet"

A Study of Face Obfuscation in ImageNet Code for the paper: A Study of Face Obfuscation in ImageNet Kaiyu Yang, Jacqueline Yau, Li Fei-Fei, Jia Deng,

35 Oct 04, 2022
TAUFE: Task-Agnostic Undesirable Feature DeactivationUsing Out-of-Distribution Data

A deep neural network (DNN) has achieved great success in many machine learning tasks by virtue of its high expressive power. However, its prediction can be easily biased to undesirable features, whi

KAIST Data Mining Lab 8 Dec 07, 2022
Autolfads-tf2 - A TensorFlow 2.0 implementation of Latent Factor Analysis via Dynamical Systems (LFADS) and AutoLFADS

autolfads-tf2 A TensorFlow 2.0 implementation of LFADS and AutoLFADS. Installati

Systems Neural Engineering Lab 11 Oct 29, 2022
Image-to-Image Translation with Conditional Adversarial Networks (Pix2pix) implementation in keras

pix2pix-keras Pix2pix implementation in keras. Original paper: Image-to-Image Translation with Conditional Adversarial Networks (pix2pix) Paper Author

William Falcon 141 Dec 30, 2022
Keras-tensorflow implementation of Fully Convolutional Networks for Semantic Segmentation(Unfinished)

Keras-FCN Fully convolutional networks and semantic segmentation with Keras. Models Models are found in models.py, and include ResNet and DenseNet bas

645 Dec 29, 2022
Code needed to reproduce the examples found in "The Temporal Robustness of Stochastic Signals"

The Temporal Robustness of Stochastic Signals Code needed to reproduce the examples found in "The Temporal Robustness of Stochastic Signals" Case stud

0 Oct 28, 2021
Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2

Mask2Former: Masked-attention Mask Transformer for Universal Image Segmentation in TensorFlow 2 Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexan

Phan Nguyen 1 Dec 16, 2021
Source code of CIKM2021 Long Paper "PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling".

PSSL Source code of CIKM2021 Long Paper "PSSL: Self-supervised Learning for Personalized Search with Contrastive Sampling". It consists of the pre-tra

2 Dec 21, 2021
Learning Continuous Signed Distance Functions for Shape Representation

DeepSDF This is an implementation of the CVPR '19 paper "DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation" by Park et a

Meta Research 1.1k Jan 01, 2023
Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive Learning".

ERICA Source code and dataset for ACL2021 paper: "ERICA: Improving Entity and Relation Understanding for Pre-trained Language Models via Contrastive L

THUNLP 75 Nov 02, 2022
On the Analysis of French Phonetic Idiosyncrasies for Accent Recognition

On the Analysis of French Phonetic Idiosyncrasies for Accent Recognition With the spirit of reproducible research, this repository contains codes requ

0 Feb 24, 2022