Keras Realtime Multi-Person Pose Estimation - Keras version of Realtime Multi-Person Pose Estimation project

Last update: Dec 08, 2022

Overview

This repository has become incompatible with the latest and recommended version of TensorFlow, 2.0. Instead of painfully refactoring this code, I created a fresh repository with some additional features:

Training code for smaller model based on MobilenetV2.
Visualisation of predictions (heatmaps, pafs) in Tensorboard.
Additional scripts to convert and test models for Tensorflow Lite.

Here is the link to the new repo: tensorflow_Realtime_Multi-Person_Pose_Estimation

Realtime Multi-Person Pose Estimation (DEPRECATED)

This is a Keras version of the Realtime Multi-Person Pose Estimation project.

Introduction

Code repo for reproducing the 2017 CVPR paper using Keras.

This is a new, improved version. The main objective was to remove the dependency on a separate C++ server which, besides being complex to compile, also contained some bugs and was very slow. The old version using rmpe_dataset_server is still available under the tag v0.1 if you would like to take a look.

Requirements

Keras
Caffe - Docker is required only if you want to convert the Caffe model to a Keras model. You don't have to compile or install Caffe on your local machine.

Converting Caffe model to Keras model

The authors of the original implementation released an already trained Caffe model, which you can use to extract the weights.

Download the Caffe model: cd model; sh get_caffe_model.sh
Dump the Caffe layers to numpy data: cd ..; docker run -v [absolute path to your keras_Realtime_Multi-Person_Pose_Estimation folder]:/workspace -it bvlc/caffe:cpu python dump_caffe_layers.py (Docker accepts only absolute paths, so you have to give the full path to the folder containing this project.)
Convert the Caffe model (from numpy data) to a Keras model: python caffe_to_keras.py
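
Because the volume mount in the second step needs an absolute path, it can help to build the command programmatically. The following is a minimal sketch, not part of this repository: the helper name build_dump_command is hypothetical, and it only assembles the same docker invocation shown above from whatever project directory you pass in.

```python
import os
import shlex


def build_dump_command(project_dir):
    """Return the docker command (as a list) that runs dump_caffe_layers.py.

    Hypothetical helper: it mirrors the command from the steps above and
    resolves project_dir to the absolute path that docker requires.
    """
    host_path = os.path.abspath(project_dir)
    return [
        "docker", "run",
        "-v", f"{host_path}:/workspace",
        "-it", "bvlc/caffe:cpu",
        "python", "dump_caffe_layers.py",
    ]


if __name__ == "__main__":
    cmd = build_dump_command(".")
    # Print a copy-pasteable command line; quoting protects paths with spaces.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Running the script from the project root prints the full command, which you can paste into a terminal.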

Testing steps

Convert the Caffe model to a Keras model, or download the already converted Keras model: https://www.dropbox.com/s/llpxd14is7gyj0z/model.h5
Run the notebook demo.ipynb.
Run the picture demo: python demo_image.py --image sample_images/ski.jpg (the result is stored in the file result.png; you can use any image file as input).

Training steps

Install gsutil, a really helpful tool for downloading large datasets: curl https://sdk.cloud.google.com | bash
Download the dataset (~25 GB): cd dataset; sh get_dataset.sh
Download the official COCO toolbox into dataset/coco/.
Install pycocotools: cd coco/PythonAPI; sudo python setup.py install
Go to the "training" folder: cd ../../../training
Optionally, set the number of processes used to generate samples in parallel: in dataset.py, find the line df = PrefetchDataZMQ(df, nr_proc=4) and adjust nr_proc.
Start the training from a terminal: python train_pose.py
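
If you are unsure what value to use for nr_proc, one possible rule of thumb is to leave a core free for the training process itself. This is only a sketch under that assumption: the function pick_nr_proc and its parameters reserve and upper_limit are hypothetical and do not exist in dataset.py, which simply hard-codes nr_proc=4.

```python
import multiprocessing


def pick_nr_proc(reserve=1, upper_limit=8):
    """Suggest a worker count: all cores minus `reserve`, clamped to [1, upper_limit]."""
    cores = multiprocessing.cpu_count()
    return max(1, min(upper_limit, cores - reserve))


if __name__ == "__main__":
    # The printed value could be used in place of the hard-coded nr_proc=4.
    print(pick_nr_proc())
```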

Changes

25/06/2018

Performance improved by replacing the C++ server rmpe_dataset_server with tensorpack dataflow. Tensorpack is a very efficient library for preprocessing and data loading for TensorFlow models. A dataflow object behaves like a normal Python iterator, but it can generate samples using many processes. This significantly reduces the time the GPU spends waiting for the next sample.
Masks are generated on the fly, so there is no need to run separate scripts to generate them. In fact, most of the masks were entirely positive (nothing to mask out).
Discarded persons who are too close to the main person in the picture are now masked out, so that the network never sees unlabelled people. Previously we filtered out the keypoints of such smaller persons, but they were still visible in the picture.
Incorrect handling of masks has been fixed. The rmpe_dataset_server sometimes assigned the wrong mask to an image, misleading the network.
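
The idea behind the dataflow in the first item, an iterator that prepares samples in the background while the consumer is busy, can be illustrated with the standard library alone. This is a simplified sketch and not tensorpack's implementation: it uses one background thread instead of multiple processes, and the names prefetch, augmented_samples and buffer_size are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def prefetch(samples, buffer_size=4):
    """Yield items from `samples`, producing them ahead of time in a background thread."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in samples:
            buf.put(item)  # blocks while the buffer is full
        buf.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()  # blocks until the next sample is ready
        if item is _DONE:
            return
        yield item


def augmented_samples(n):
    """Hypothetical stand-in for an expensive augmentation step."""
    for i in range(n):
        yield i * i


if __name__ == "__main__":
    for sample in prefetch(augmented_samples(5)):
        print(sample)
```

The consumer loop looks exactly like iterating over a plain generator, which is the property the dataflow object provides. Error handling in the producer is omitted for brevity.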

26/10/2017

Fixed a problem with the training procedure. Here are my results after training for 5 epochs = 25000 iterations (1 epoch is ~5000 batches). The loss values are quite similar to those of the original training - output.txt

Results of running demo_image --image sample_images/ski.jpg --model training/weights.best.h5 with a model trained for only 25000 iterations. Not too bad! Training on my single 1070 GPU took around 10 hours.
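
As a rough sanity check, the figures quoted above imply the following throughput. The inputs are the approximate values from this entry, so the result is only an estimate.

```python
EPOCHS = 5
BATCHES_PER_EPOCH = 5000  # approximate, from the entry above
TRAINING_HOURS = 10       # approximate, on a single 1070 GPU

iterations = EPOCHS * BATCHES_PER_EPOCH
seconds_per_iteration = TRAINING_HOURS * 3600 / iterations

print(iterations)             # 25000
print(seconds_per_iteration)  # 1.44
```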

22/10/2017

Augmented samples are fetched from the server. The network never sees the same image twice, which was a problem in the previous approach (tool rmpe_dataset_transformer). This allows you to run augmentation locally or on a separate node. You can start 2 instances, one serving the training set and a second one serving the validation set (on a different port if run locally).

Related repositories

CVPR'16, Convolutional Pose Machines.
CVPR'17, Realtime Multi-Person Pose Estimation.

Citation

Please cite the paper in your publications if it helps your research:

@InProceedings{cao2017realtime,
  title = {Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields},
  author = {Zhe Cao and Tomas Simon and Shih-En Wei and Yaser Sheikh},
  booktitle = {The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year = {2017}
  }

Owner

M Faber
