In this project we will be using the live feed coming from the webcam to create a virtual mouse with complete functionalities.

Overview

Virtual Mouse Using OpenCV

In this project we will be using the live feed coming from the webcam to create a virtual mouse using hand tracking.

Project Description:

In this project, I am using my hand as a virtual mouse than can do everything that a mouse does without even touching your system. I am using the webcam of my system to detect my hands. It will then create a bounding box around my hand and focus on two fingers: The fore finger and the middle finger. The fore finger will act as a cursor and moving it around, we will be moving the cursor around. Now, inorder to successfully click using hand tracking, it is detecting the distance between the fore finger and the middle finger. If they are joined together, then it will perform a click.

Furthermore, a smoothness factor was added as the movement was really shaky.

Alt Text

Requirements:

Following modules need to be installed for it to work properly:

  • OpenCV
  • Mediapipe
  • Autopy

OpenCV:

OpenCV is a huge open-source library for computer vision, machine learning, and image processing. OpenCV supports a wide variety of programming languages like Python, C++, Java, etc. It can process images and videos to identify objects, faces, or even the handwriting of a human.

It can be installed using "pip install opencv-python"

Mediapipe:

MediaPipe is a framework for building multimodal (eg. video, audio, any time series data), cross platform (i.e Android, iOS, web, edge devices) applied ML pipelines.

It can be installed using "pip install mediapipe"

Autopy:

AutoPy is a simple, cross-platform GUI automation library for Python. It includes functions for controlling the keyboard and mouse, finding colors and bitmaps on-screen, and displaying alerts.

It can be installed using "pip install autopy"

Important Note:

I faced alot of dependency issues throughout this project. Some of the issues and their solutions are as follows:

  • autopy not installing: This is because autopy currently doesn't support Python versions above 3.8
  • webcam not opening: It was a bug in mediapipe and was fixed in latest python versions

Hence, inorder for the project to run smoothly, you need to degrade the Python version to 3.8

How to Degrade Python Version:

Follow the following steps:

  • Uninstall Python from add/remove programs
  • Go to AppData and remove any python folder you see.
  • Download Python 3.8 from this link : Python 3.8
  • Install it.
  • Open command promt and run "pip" inorder to confirm installation.
  • Your Python version has been degraded :)

Contact Information:

For any further queries, feel free to contact me at:

Email: [email protected]

LinkedIn : Hassan Shahzad

Owner
Hassan Shahzad
I am a passionate Software Developer who is always aspiring to learn more. Practice makes perfect and I am always working on improving my skills and myself.
Hassan Shahzad
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra

Ed Medvedev 933 Dec 29, 2022
Scan the MRZ code of a passport and extract the firstname, lastname, passport number, nationality, date of birth, expiration date and personal numer.

PassportScanner Works with 2 and 3 line identity documents. What is this With PassportScanner you can use your camera to scan the MRZ code of a passpo

Edwin Vermeer 441 Dec 24, 2022
Indonesian ID Card OCR using tesseract OCR

KTP OCR Indonesian ID Card OCR using tesseract OCR KTP OCR is python-flask with tesseract web application to convert Indonesian ID Card to text / JSON

Revan Muhammad Dafa 5 Dec 06, 2021
OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched

OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched or copy-pasted. ocrmypdf # it's a scriptable c

jbarlow83 7.9k Jan 03, 2023
An Optical Character Recognition system using Pytesseract/Extracting data from Blood Pressure Reports.

Optical_Character_Recognition An Optical Character Recognition system using Pytesseract/Extracting data from Blood Pressure Reports. As an IOT/Compute

Ramsis Hammadi 1 Feb 12, 2022
Links to awesome OCR projects

Awesome OCR This list contains links to great software tools and libraries and literature related to Optical Character Recognition (OCR). Contribution

Konstantin Baierer 2.2k Jan 02, 2023
Handwritten Character Recognition using CNN

Handwritten Character Recognition using CNN Problem Definition The main objective of this project is to solve the problem of handwritten character rec

Mohit Kaushik 4 Mar 02, 2022
Read Japanese manga inside browser with selectable text.

mokuro Read Japanese manga with selectable text inside a browser. See demo: https://kha-white.github.io/manga-demo mokuro_demo.mp4 Demo contains excer

Maciej Budyś 170 Dec 27, 2022
Python library to extract tabular data from images and scanned PDFs

Overview ExtractTable - API to extract tabular data from images and scanned PDFs The motivation is to make it easy for developers to extract tabular d

Org. Account 165 Dec 31, 2022
SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition

SCOUTER: Slot Attention-based Classifier for Explainable Image Recognition PDF Abstract Explainable artificial intelligence has been gaining attention

87 Dec 26, 2022
[BMVC'21] Official PyTorch Implementation of Grounded Situation Recognition with Transformers

Grounded Situation Recognition with Transformers Paper | Model Checkpoint This is the official PyTorch implementation of Grounded Situation Recognitio

Junhyeong Cho 18 Jul 19, 2022
The papers published in top-tier AI conferences in recent years.

AI-conference-papers The papers published in top-tier AI conferences in recent years. Paper table AAAI ICLR CVPR ICML ICCV ECCV NIPS 2019 ✔️ ✔️ ✔️ ✔️

Jinbae Park 6 Dec 09, 2022
The official code for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates".

SpeechDrivesTemplates The official repo for the ICCV-2021 paper "Speech Drives Templates: Co-Speech Gesture Synthesis with Learned Templates". [arxiv

Qian Shenhan 53 Dec 23, 2022
Using computer vision method to recognize and calcutate the features of the architecture.

building-feature-recognition In this repository, we accomplished building feature recognition using traditional/dl-assisted computer vision method. Th

4 Aug 11, 2022
code for our ICCV 2021 paper "DeepCAD: A Deep Generative Network for Computer-Aided Design Models"

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022
Official code for ROCA: Robust CAD Model Retrieval and Alignment from a Single Image (CVPR 2022)

ROCA: Robust CAD Model Alignment and Retrieval from a Single Image (CVPR 2022) Code release of our paper ROCA. Check out our video, paper, and website

123 Dec 25, 2022
Python rubik's cube solver

This program makes a 3D representation of a rubiks cube and solves it step by step.

Pablo QB 4 May 29, 2022
Handwritten Text Recognition (HTR) using TensorFlow 2.x

Handwritten Text Recognition (HTR) system implemented using TensorFlow 2.x and trained on the Bentham/IAM/Rimes/Saint Gall/Washington offline HTR data

Arthur Flôr 160 Dec 21, 2022
Polaris is a Face recognition attendance system .

Support Me 🚀 About Polaris 📄 Polaris is a system based on facial recognition with a futuristic GUI design, Can easily find people informations store

XN3UR0N 215 Dec 26, 2022
(CVPR 2021) ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

ST3D Code release for the paper ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection, CVPR 2021 Authors: Jihan Yang*, Shaoshu

CVMI Lab 224 Dec 28, 2022