This tool uses Deep Learning to help you draw and write with your hand and webcam.

Last update: Dec 10, 2022

air-drawing 👆

Draw and write in the air: the tool tracks your hand through the webcam, and a deep learning model predicts from the finger's motion whether you want 'pencil up' or 'pencil down'.

Try it online: loicmagne.github.io/air-drawing

Technical Details

The pipeline has two steps: detecting the hand, and predicting the drawing state. Both steps use deep learning.
Handpose detection is performed with the MediaPipe toolbox.
The drawing prediction uses only the finger position, not the image. The input is a sequence of 2D points (in practice I use the finger's speed and acceleration rather than its position, which makes the prediction translation-invariant), and the output is a binary classification: 'pencil up' or 'pencil down'. I used a simple bidirectional LSTM architecture. I built a small dataset myself (~50 samples) and annotated it with the tools provided in python-stuff/data-wrangling/. At first I wanted to predict 'pencil up'/'pencil down' in real time, i.e. while the user is drawing. That task proved too difficult and gave poor results, which is why I now use a bidirectional LSTM: it reads the whole recorded sequence in both directions, so each prediction can use the points that come after it as well as those before. Details of the deep learning pipeline are in the Jupyter notebook in python-stuff/deep-learning/.
The application is entirely client-side. I deployed the deep learning model by converting the PyTorch model to .onnx and running it with ONNX Runtime, which is very convenient and supports a wide range of layers.
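
To illustrate why speed and acceleration make the input translation-invariant, here is a minimal pure-Python sketch. It is not the code from the notebook: the function names `finite_difference` and `motion_features` are hypothetical, and it assumes a constant interval between frames, so plain differences stand in for derivatives.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def finite_difference(seq: List[Point]) -> List[Point]:
    """Difference between consecutive 2D samples (one item shorter than seq)."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(seq, seq[1:])]


def motion_features(positions: List[Point]) -> List[Tuple[float, float, float, float]]:
    """Per-frame (vx, vy, ax, ay), assuming a constant frame interval of 1."""
    velocity = finite_difference(positions)
    acceleration = finite_difference(velocity)
    # Drop the first velocity so both lists describe the same frames.
    return [v + a for v, a in zip(velocity[1:], acceleration)]


# Hypothetical fingertip track, and the same track moved elsewhere in the frame.
stroke = [(0.10, 0.20), (0.12, 0.25), (0.16, 0.27), (0.22, 0.26)]
shifted = [(x + 0.5, y - 0.3) for x, y in stroke]

features = motion_features(stroke)
shifted_features = motion_features(shifted)
```

Up to floating-point rounding, `features` and `shifted_features` are identical, because a constant offset cancels in every difference. A model fed these features cannot tell where in the frame the gesture was drawn.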

Going Forward

Overall the pipeline still struggles and needs improvement. Ideas for improvement include:

Building a bigger dataset, with more diverse user data.
Processing and smoothing the finger signal, to depend less on camera quality and to improve model generalization.
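
As one possible reading of the smoothing idea, the sketch below applies an exponential moving average to the fingertip track. This is an assumption for illustration only: the project does not say which filter it would use, and the name `ema_smooth` and the `alpha` parameter are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def ema_smooth(points: List[Point], alpha: float = 0.5) -> List[Point]:
    """Exponential moving average over a 2D track.

    alpha close to 1 follows the raw signal; alpha close to 0 smooths harder
    but lags further behind the finger.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[Point] = []
    for x, y in points:
        if not smoothed:
            # The first sample has no history, so it is kept unchanged.
            smoothed.append((x, y))
        else:
            px, py = smoothed[-1]
            smoothed.append((alpha * x + (1 - alpha) * px,
                             alpha * y + (1 - alpha) * py))
    return smoothed


# Hypothetical jittery track: x flickers between 0 and 1 on every frame.
noisy = [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0), (1.0, 0.0)]
smooth = ema_smooth(noisy, alpha=0.5)
```

The filter only looks at past samples, so it could run frame by frame in the browser. The cost is lag, and since the model consumes speed and acceleration, any smoothing strength would need to be checked against prediction quality.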

Owner

lmagne
