Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

Last update: Oct 03, 2022

Overview

human-pose-estimation-3d-python-cpp

RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used)

1. Run

1-1. RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used)

$ xhost +local: && \
docker run -it --rm \
-v `pwd`:/home/user/workdir \
-v /tmp/.X11-unix/:/tmp/.X11-unix:rw \
--device /dev/video0:/dev/video0:mwr \
--device /dev/video1:/dev/video1:mwr \
--device /dev/video2:/dev/video2:mwr \
--device /dev/video3:/dev/video3:mwr \
--device /dev/video4:/dev/video4:mwr \
--device /dev/video5:/dev/video5:mwr \
--net=host \
-e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR \
-e DISPLAY=$DISPLAY \
--privileged \
ghcr.io/pinto0309/openvino2tensorflow:latest

$ python3 human_pose_estimation_3d_demo.py \
--model models/openvino/FP16/human-pose-estimation-3d-0001_bgr_480x640.xml \
--device CPU \
--input 4

1-2. RealSenseD435 (RGB) 480x640 + iGPU (OpenCL)

$ xhost +local: && \
docker run -it --rm \
-v `pwd`:/home/user/workdir \
-v /tmp/.X11-unix/:/tmp/.X11-unix:rw \
--device /dev/video0:/dev/video0:mwr \
--device /dev/video1:/dev/video1:mwr \
--device /dev/video2:/dev/video2:mwr \
--device /dev/video3:/dev/video3:mwr \
--device /dev/video4:/dev/video4:mwr \
--device /dev/video5:/dev/video5:mwr \
--net=host \
-e LIBVA_DRIVER_NAME=iHD \
-e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR \
-e DISPLAY=$DISPLAY \
--privileged \
ghcr.io/pinto0309/openvino2tensorflow:latest

$ python3 human_pose_estimation_3d_demo.py \
--model models/openvino/FP16/human-pose-estimation-3d-0001_bgr_480x640.xml \
--device GPU \
--input 4

1-3. General USB Camera 480x640 + CPU

$ xhost +local: && \
docker run -it --rm \
-v `pwd`:/home/user/workdir \
-v /tmp/.X11-unix/:/tmp/.X11-unix:rw \
--device /dev/video0:/dev/video0:mwr \
--net=host \
-e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR \
-e DISPLAY=$DISPLAY \
--privileged \
ghcr.io/pinto0309/openvino2tensorflow:latest

$ python3 human_pose_estimation_3d_demo.py \
--model models/openvino/FP16/human-pose-estimation-3d-0001_bgr_480x640.xml \
--device CPU \
--input 0

2. Build

$ PYTHON_PREFIX=$(python3 -c "import sys; print(sys.prefix)") \
&& PYTHON_VERSION=$(python3 -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}')") \
&& PYTHON_INCLUDE_DIRS=${PYTHON_PREFIX}/include/python${PYTHON_VERSION}

$ NUMPY_INCLUDE_DIR=$(python3 -c "import numpy; print(numpy.get_include())")

$ mkdir -p pose_extractor/build && cd pose_extractor/build

$ cmake \
-DPYTHON_INCLUDE_DIRS=${PYTHON_INCLUDE_DIRS} \
-DNUMPY_INCLUDE_DIR=${NUMPY_INCLUDE_DIR} ..

$ make && cp pose_extractor.so ../.. && cd ../..

Monocular 3D pose estimation. OpenVINO. CPU inference or iGPU (OpenCL) inference.

Related tags

Overview

human-pose-estimation-3d-python-cpp

1. Run

1-1. RealSenseD435 (RGB) 480x640 + CPU Corei9 45 FPS (Depth is not used)

1-2. RealSenseD435 (RGB) 480x640 + iGPU (OpenCL)

1-3. General USB Camera 480x640 + CPU

2. Build

3. Reference

Owner

Katsuya Hyodo

CUDA Python Low-level Bindings

Segmentation models with pretrained backbones. PyTorch.

Volumetric parameterization of the placenta to a flattened template

Code for Phase diagram of Stochastic Gradient Descent in high-dimensional two-layer neural networks

🔥 Cogitare - A Modern, Fast, and Modular Deep Learning and Machine Learning framework for Python

Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Official PyTorch implementation of "VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization" (CVPR 2021)

Time Delayed NN implemented in pytorch

StarGAN - Official PyTorch Implementation (CVPR 2018)

pytorch implementation for Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network arXiv:1609.04802

Multi-Stage Episodic Control for Strategic Exploration in Text Games

The source code of the paper "SHGNN: Structure-Aware Heterogeneous Graph Neural Network"

PyTorch Implementation of "Non-Autoregressive Neural Machine Translation"

Riemannian Convex Potential Maps

Python code to fuse multiple RGB-D images into a TSDF voxel volume.

Official repository for "Restormer: Efficient Transformer for High-Resolution Image Restoration". SOTA results for single-image motion deblurring, image deraining, image denoising (synthetic and real data), and dual-pixel defocus deblurring.

Localized representation learning from Vision and Text (LoVT)

pix2pix in tensorflow.js

Read number plates with https://platerecognizer.com/

PyTorch implementation of Asymmetric Siamese (https://arxiv.org/abs/2204.00613)