
Overview

Sign Language Recognition Service

This is a Sign Language Recognition service that uses a deep learning model with Long Short-Term Memory (LSTM) layers to recognize signs from sequences of video frames. The service was developed as part of a bachelor project at Aalborg University.


Requirements

  • Python 3.7
  • OpenPose 1.6.0
  • CUDA 10.0
  • cuDNN 7.5.0
  • NumPy 1.18.5
  • OpenCV 4.5.1.48
  • Flask 1.1.2
  • TensorFlow 2.0.0
  • Pandas 1.1.5
  • TensorBoard
  • Matplotlib
  • Seaborn
  • scikit-learn

How to use

Installing OpenPose

  1. Install OpenPose 1.6.0 for Python by following the official guide. Note that the newest release on the OpenPose GitHub is 1.7.0; for this service to work, 1.6.0 must be used.

    A few things to note when installing OpenPose:

    • When cloning the OpenPose repository, use the following git command to get version 1.6.0:
      git clone --depth 1 --branch v1.6.0 https://github.com/CMU-Perceptual-Computing-Lab/openpose
      
    • Remember to run the following command on the newly cloned repository:
      git submodule update --init --recursive --remote
      
    • Use Visual Studio Enterprise 2017 to build the required files. Install this first if you do not already have it.
    • Install CUDA 10.0 and cuDNN 7.5.0 for CUDA 10.0 after installing Visual Studio Enterprise 2017.
    • When generating the files using CMake, make sure that the BUILD_PYTHON flag is enabled, and that the Python version is set to 3.7. Also make sure that the detected CUDA version is 10.0.
    • After building with Visual Studio Enterprise 2017, make sure that all necessary files have been generated:
      • There should be an openpose.dll in /x64/Release/
      • There should be an openpose.exp and an openpose.lib in /src/openpose/Release/
      • There should be a pyopenpose.cp37-win_amd64.pyd in /python/openpose/Release/
  2. Install requirements from requirements.txt

  3. Change the path in main/openpose/paths.py to the path of your OpenPose installation:

    # Change this path so it points to your OpenPose path relative to this file
    OPEN_POSE_PATH = get_relative_path(__file__, '../../../../openpose')
    
  4. If you get any errors related to OpenPose when running the service, go back and make sure that all instructions have been followed. In particular, check that the correct CUDA/cuDNN versions are installed, that the BUILD_PYTHON flag was enabled, and that Python 3.7 was used when generating the files. A quick way to verify the Python bindings is shown in the sketch below.
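Once the build has finished, a simple way to check that the Python bindings work is to try importing pyopenpose from the build output. This is only a sanity-check sketch, assuming the standard Windows build layout described above; adjust the path to your own OpenPose build directory.

# Sanity check for the OpenPose Python bindings (Windows build layout assumed).
import os
import sys

OPENPOSE_BUILD = r'C:\path\to\openpose\build'  # placeholder: your own build directory

# pyopenpose.cp37-win_amd64.pyd is generated here after a successful build
sys.path.append(os.path.join(OPENPOSE_BUILD, 'python', 'openpose', 'Release'))
# openpose.dll and the third-party DLLs must be on PATH
os.environ['PATH'] += ';' + os.path.join(OPENPOSE_BUILD, 'x64', 'Release')
os.environ['PATH'] += ';' + os.path.join(OPENPOSE_BUILD, 'bin')

import pyopenpose as op  # raises ImportError if the build is incomplete
print('OpenPose Python API loaded successfully')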

Once OpenPose is successfully installed, you can either use the existing model trained on our dataset, or create your own dataset and train a model on it instead.


Using the service

A single endpoint, '/recognize', is exposed for performing recognition and accepts POST requests. The endpoint expects a sequence of base64-encoded images, which are converted into a format the classifier can recognize.
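As an illustration, a request to the endpoint could look like the sketch below. The JSON key 'frames', the folder of example images, and the default Flask port 5000 are assumptions; adjust them to match the actual request schema of the running service.

# Hypothetical client for the /recognize endpoint (the payload key 'frames' is an assumption).
import base64
import glob
import requests

frames = []
for path in sorted(glob.glob('sign_frames/*.jpg')):  # frames extracted from a sign video
    with open(path, 'rb') as f:
        frames.append(base64.b64encode(f.read()).decode('utf-8'))

response = requests.post('http://localhost:5000/recognize', json={'frames': frames})
print(response.status_code, response.json())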



Creating a custom dataset

To create a custom dataset, open the file create_dataset.py and change the following constant:

DATASET_NAME = 'dsl_dataset'

Set it so that the path in the constant DATASET_DIR points to the folder where your dataset is located. This folder should contain a subfolder called 'src', which in turn contains one folder per label in the dataset. Each label folder should contain videos of the corresponding sign.
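For reference, a layout matching the description above could be scaffolded like this (the label names are purely illustrative):

# Scaffold the expected dataset layout: <dataset folder>/src/<label>/<videos of that sign>
import os

dataset_dir = 'main/algorithm/datasets/dsl_dataset'  # matches DATASET_NAME above
labels = ['hello', 'thanks', 'sorry']                # illustrative label names

for label in labels:
    os.makedirs(os.path.join(dataset_dir, 'src', label), exist_ok=True)
# Place the recorded videos of each sign inside its label folder.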

Before running the script, the following constants can be tweaked based on the desired settings:

WINDOW_LENGTH = 60
STRIDE = 5
BATCH_SIZE = 512
VAL_SPLIT = 0.2
TEST_SPLIT = 0.1
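As a rough illustration of what WINDOW_LENGTH and STRIDE presumably control, frame sequences are typically cut into overlapping fixed-length windows; the snippet below is only a sketch of that idea, not the actual create_dataset.py logic.

# Illustrative sliding-window segmentation of a sequence of per-frame features.
def sliding_windows(frames, window_length=60, stride=5):
    windows = []
    for start in range(0, len(frames) - window_length + 1, stride):
        windows.append(frames[start:start + window_length])
    return windows

# e.g. a 120-frame video yields (120 - 60) / 5 + 1 = 13 overlapping samples
print(len(sliding_windows(list(range(120)))))  # 13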

Finally, the following constant can be changed:

CREATE_RAW_DATA = True

The initial feature extraction by OpenPose can be a fairly lengthy process. Setting this constant to False lets you tweak the dataset after the features have been extracted, without running OpenPose again. Note that the raw OpenPose data must be created before the actual dataset can be built, so the script has to be run with CREATE_RAW_DATA = True at least once.
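The behaviour is roughly the caching pattern sketched below, assuming a hypothetical cache file; create_dataset.py implements this with its own functions and file names.

# Sketch of the caching behaviour behind CREATE_RAW_DATA (illustrative only).
import os
import pickle

RAW_DATA_PATH = 'raw_openpose_data.pickle'  # hypothetical cache file
CREATE_RAW_DATA = True

def extract_raw_data():
    # placeholder for the slow per-frame OpenPose feature extraction
    return {}

if CREATE_RAW_DATA or not os.path.exists(RAW_DATA_PATH):
    raw_data = extract_raw_data()
    with open(RAW_DATA_PATH, 'wb') as f:
        pickle.dump(raw_data, f)          # cache for later runs
else:
    with open(RAW_DATA_PATH, 'rb') as f:
        raw_data = pickle.load(f)         # reuse the cached OpenPose output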

Training a custom model

To train a custom model, use the train_models.py file. Here, the constant DATASET_NAME can be changed to the name of the dataset you wish to use, so that DATASET_DIR points to the correct folder. You can also specify a TensorBoard directory:

DATASET_NAME = 'dsl_dataset'
DATASET_DIR = f'.\\main\\algorithm\\datasets\\{DATASET_NAME}'
MODELS_DIR = f'.\\main\\algorithm\\models\\{DATASET_NAME}'
TENSORBOARD_DIR = f'{MODELS_DIR}\\logs'

Before running the script, you can tweak various training settings as well as the hyperparameters of the model by changing the following constants:

MODEL_NAME = "model"
EPOCHS = 25
LAYER_SIZES = [64]
DENSE_LAYERS = [0]
DENSE_ACTIVATION = "relu"
LSTM_LAYERS = [2]
LSTM_ACTIVATION = "tanh"
OUTPUT_ACTIVATION = "softmax"

Note that the trainer can train multiple models depending on these settings. Changing the LAYER_SIZES, DENSE_LAYERS and LSTM_LAYERS to contain several values will result in a model being trained for each possible combination.
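A hedged sketch of how such a grid could expand into stacked-LSTM Keras models is shown below; the actual train_models.py may differ in details such as the optimizer, dropout, callbacks, or the feature dimensions.

# Illustrative hyper-parameter grid expansion into stacked LSTM models (not the project's code).
import itertools
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

WINDOW_LENGTH = 60   # frames per sample, as in create_dataset.py
N_FEATURES = 274     # hypothetical number of keypoint values per frame
N_CLASSES = 10       # hypothetical number of sign labels

LAYER_SIZES = [64]
DENSE_LAYERS = [0]
LSTM_LAYERS = [2]

def build_model(layer_size, n_lstm, n_dense):
    model = Sequential()
    for i in range(n_lstm):
        kwargs = {'input_shape': (WINDOW_LENGTH, N_FEATURES)} if i == 0 else {}
        # every LSTM layer except the last one returns the full sequence
        model.add(LSTM(layer_size, activation='tanh',
                       return_sequences=(i < n_lstm - 1), **kwargs))
    for _ in range(n_dense):
        model.add(Dense(layer_size, activation='relu'))
    model.add(Dense(N_CLASSES, activation='softmax'))
    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
    return model

# one model per combination of the three lists
for size, n_dense, n_lstm in itertools.product(LAYER_SIZES, DENSE_LAYERS, LSTM_LAYERS):
    model = build_model(size, n_lstm, n_dense)
    model.summary()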

After training your model, update paths.py in main/core/ so that it points to the new model, by changing the constant MODEL_NAME to the name of your model:

MODEL_NAME = 'dsl_lstm.model'

Finally, it is also possible to generate a confusion matrix for your model by using the generate_confusion_matrix.py script. Simply change the constants DATASET_NAME and MODEL_NAME so that DATASET_DIR points to your dataset file and MODEL_DIR points to your model directory:

DATASET_NAME = "dsl_dataset"
MODEL_NAME = "dsl_lstm"
DATASET_DIR = f"./main/algorithm/datasets/{DATASET_NAME}/{DATASET_NAME}.pickle"
MODEL_DIR = f"./main/algorithm/models/{DATASET_NAME}/{MODEL_NAME}"
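What the script presumably does is sketched below, assuming the pickled dataset exposes a held-out test split with one-hot encoded labels; the actual generate_confusion_matrix.py may differ.

# Illustrative confusion-matrix generation; the keys 'X_test'/'y_test' are assumptions.
import pickle
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
from tensorflow.keras.models import load_model

with open(DATASET_DIR, 'rb') as f:                  # the .pickle file referenced above
    data = pickle.load(f)
X_test, y_test = data['X_test'], data['y_test']     # assumed structure

model = load_model(MODEL_DIR)
y_pred = np.argmax(model.predict(X_test), axis=1)
y_true = np.argmax(y_test, axis=1)                  # assuming one-hot encoded labels

cm = confusion_matrix(y_true, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.xlabel('Predicted label')
plt.ylabel('True label')
plt.show()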

Happy signing :O)

Authors

  • Adil Cemalovic
  • Martin Lønne
  • Magnus Helleshøj Lund