Real-time object detection on Android using the YOLO network with TensorFlow

Last update: Jan 03, 2023

Overview

TensorFlow YOLO object detection on Android

android-yolo is the first implementation of YOLO for TensorFlow on an Android device. It is compatible with Android Studio and usable out of the box. It can detect the 20 classes of objects in the Pascal VOC dataset: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train and tv/monitor. The network only outputs one predicted bounding box at a time for now. The code can and will be extended in the future to output several predictions.

To use this demo first clone the repository. Download the TensorFlow YOLO model and put it in android-yolo/app/src/main/assets. Then open the project on Android Studio. Once the project is open you can run the project on your Android device using the Run 'app' command and selecting your device.

NEW: The standalone APK has been released and you can find it here. Just open your browser on your Android device and download the APK file. When the file has been downloaded it should begin installing on your device after you grant the required permissions.

GPUs are not currently supported by TensorFlow on Android. If you have a decent Android device you will have around two frames per second of processed images.

Here is a video showing a small demo of the app.

Nataniel Ruiz
School of Interactive Computing
Georgia Institute of Technology

Credits: App launch icon made by Freepik from Flaticon is licensed by Creative Commons BY 3.0.

Disclaimer: The app is hardcoded for 20 classes and for the tiny-yolo network final output layer. You can check the following code if you want to change this:

https://github.com/natanielruiz/android-yolo/blob/master/app/src/main/java/org/tensorflow/demo/TensorflowClassifier.java

The code describes the interpretation of the output.

The code for the network inference pass is written in C++ and the output is passed to Java. The output of the network is in the form of a String which is converted to a StringTokenizer and is then converted into an array of Floats in line 87 of TensorflowClassifier.java

You can work from there and read the papers to transform the new yolo model output into something that makes sense. (I did it only for one bounding box and also obtained the confidence of this bounding box). This part of the code is commented by me so you can understand what I did. Also read the paper here: https://arxiv.org/abs/1506.02640

Real-time object detection on Android using the YOLO network with TensorFlow

Related tags

Overview

TensorFlow YOLO object detection on Android

Owner

Nataniel Ruiz

Real-time LIDAR-based Urban Road and Sidewalk detection for Autonomous Vehicles 🚗

Real-time pose estimation accelerated with NVIDIA TensorRT

Supplementary code for TISMIR paper "Sliding-Window Pitch-Class Histograms as a Means of Modeling Musical Form"

Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

IJCAI2020 & IJCV 2020 :city_sunrise: Unsupervised Scene Adaptation with Memory Regularization in vivo

LEDNet: A Lightweight Encoder-Decoder Network for Real-time Semantic Segmentation

This is the official implementation for the paper "(Almost) Free Incentivized Exploration from Decentralized Learning Agents" in NeurIPS 2021.

Categorizing comments on YouTube into different categories.

ISBI 2022: Cross-level Contrastive Learning and Consistency Constraint for Semi-supervised Medical Image.

🐾 Semantic segmentation of paws from cute pet images (PyTorch)

A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

Code for "My(o) Armband Leaks Passwords: An EMG and IMU Based Keylogging Side-Channel Attack" paper

Estimation of human density in a closed space using deep learning.

Interactive Terraform visualization. State and configuration explorer.

Simple and Robust Loss Design for Multi-Label Learning with Missing Labels

Official implementation for Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder at NeurIPS 2020

QAT(quantize aware training) for classification with MQBench

Object-aware Contrastive Learning for Debiased Scene Representation

The mini-MusicNet dataset

PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition, CVPR 2018