PyTorch implementation for 3D human pose estimation

Last update: Dec 22, 2022

Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach

This repository is the PyTorch implementation for the network presented in:

Xingyi Zhou, Qixing Huang, Xiao Sun, Xiangyang Xue, Yichen Wei, Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach ICCV 2017 (arXiv:1704.02447)

Note: This repository has been updated and differs from the method described in the paper. To fully reproduce the results in the paper, please check out the original Torch implementation or our PyTorch re-implementation branch (slightly worse than Torch). We also provide a clean 2D hourglass network branch.

The updates include:

Change the network backbone to ResNet50 with deconvolution layers (Xiao et al. ECCV2018). Training is now about 3x faster than with the original hourglass backbone, with no significant performance improvement.
Change the depth regression sub-network to a one-layer depth map (described in our StarMap project).
Change the Human3.6M dataset to the official release from the ECCV18 challenge.
Update from Python 2.7 and PyTorch 0.1.12 to Python 3.6 and PyTorch 0.4.1.

Contact: [email protected]

Installation

The code was tested with Anaconda Python 3.6 and PyTorch v0.4.1. After installing Anaconda and PyTorch:

Clone the repo:

POSE_ROOT=/path/to/clone/pytorch-pose-hg-3d
git clone https://github.com/xingyizhou/pytorch-pose-hg-3d ${POSE_ROOT}

Install dependencies (opencv and progress):

conda install --channel https://conda.anaconda.org/menpo opencv
conda install --channel https://conda.anaconda.org/auto progress

Disable cudnn for batch_norm (see issue):

# PYTORCH=/path/to/pytorch
# for pytorch v0.4.0
sed -i "1194s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
# for pytorch v0.4.1
sed -i "1254s/torch\.backends\.cudnn\.enabled/False/g" ${PYTORCH}/torch/nn/functional.py
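
The sed commands above rely on hard-coded line numbers (1194 for v0.4.0, 1254 for v0.4.1), so they silently do nothing if the target line differs in your install. As a minimal sketch, the same substitution can be done in Python with a check that the pattern is actually on that line before the file is written. The helper name `disable_cudnn_on_line` is hypothetical and not part of this repository; the demo runs on a throwaway file, not a real PyTorch install.

```python
import re
import tempfile
from pathlib import Path

# Same pattern the sed commands substitute.
PATTERN = r"torch\.backends\.cudnn\.enabled"


def disable_cudnn_on_line(path, line_no, pattern=PATTERN, replacement="False"):
    """Replace `pattern` with `replacement` on one 1-indexed line of `path`.

    Returns True if the line was changed. Returns False, leaving the file
    untouched, if the pattern does not occur on that line.
    """
    path = Path(path)
    lines = path.read_text().splitlines(keepends=True)
    if not 1 <= line_no <= len(lines):
        raise IndexError(f"{path} has no line {line_no}")
    patched, count = re.subn(pattern, replacement, lines[line_no - 1])
    if count == 0:
        return False
    lines[line_no - 1] = patched
    path.write_text("".join(lines))
    return True


# Demonstration on a temporary stand-in for torch/nn/functional.py.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "functional.py"
    demo.write_text("x = 1\nuse_cudnn = torch.backends.cudnn.enabled\n")
    print(disable_cudnn_on_line(demo, 2))
    print(demo.read_text())
```

For a real install you would pass the path to `${PYTORCH}/torch/nn/functional.py` and the line number matching your PyTorch version. A `False` return means the line number does not match that file.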

Optionally, install tensorboard for visualizing training:
```
pip install tensorflow
```

Demo

Download our pre-trained model and move it to models.
Run python demo.py --demo /path/to/image/or/image/folder [--gpus -1] [--load_model /path/to/model].

--gpus -1 selects CPU mode. We provide example images in images/. When testing your own image, make sure the person is at the center of the image and most body parts are within the frame.
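
If your own image does not have the person centered, one option is to crop it first. The sketch below computes a square crop around a person bounding box. It is an illustration only: `centered_square_crop` and its `margin` parameter are hypothetical, the bounding box has to come from your own annotation or detector, and demo.py does not perform this step.

```python
def centered_square_crop(bbox, image_size, margin=1.25):
    """Return (left, top, side) of a square crop centered on a person box.

    bbox: (x_min, y_min, x_max, y_max) of the person, in pixels.
    image_size: (width, height) of the image, in pixels.
    margin: padding factor applied to the longer side of the box
            (an assumed value, not taken from the repository).
    """
    x_min, y_min, x_max, y_max = bbox
    width, height = image_size
    # The crop cannot be larger than the image itself.
    side = min(max(x_max - x_min, y_max - y_min) * margin, width, height)
    center_x = (x_min + x_max) / 2.0
    center_y = (y_min + y_max) / 2.0
    # Clamp the top-left corner so the crop stays inside the image.
    left = min(max(center_x - side / 2.0, 0), width - side)
    top = min(max(center_y - side / 2.0, 0), height - side)
    return int(round(left)), int(round(top)), int(round(side))


print(centered_square_crop((300, 100, 500, 500), (1000, 600)))
```

The returned values can be used to slice the image array (rows `top:top + side`, columns `left:left + side`) before saving it and passing the result to `--demo`.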

Benchmark Testing

To test our model on the Human3.6M dataset, run:

python main.py --exp_id test --task human3d --dataset fusion_3d --load_model ../models/fusion_3d_var.pth --test --full_test

The expected result is 64.55mm.
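
The text above does not name the metric behind the millimeter figure. Assuming it is a mean per-joint position error (the usual measure on Human3.6M), the computation can be sketched as below. This is not the repository's evaluation code, and details such as root alignment and joint selection are left out.

```python
import math


def mpjpe(predicted, target):
    """Mean Euclidean distance between matching 3D joints.

    predicted, target: equally long sequences of (x, y, z) coordinates.
    The result is in the same unit as the inputs (e.g. millimeters).
    """
    if len(predicted) != len(target) or len(predicted) == 0:
        raise ValueError("need two non-empty joint lists of equal length")
    total = 0.0
    for p, t in zip(predicted, target):
        total += math.dist(p, t)
    return total / len(predicted)


# Two toy joints with errors of 5 and 10 units.
print(mpjpe([(0, 0, 0), (0, 0, 0)], [(3, 4, 0), (0, 0, 10)]))
```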

Training

Prepare the training data:

Download the images from the MPII dataset and their annotations in JSON format (train.json and val.json) (from Xiao et al. ECCV2018).
Download the Human3.6M ECCV challenge dataset.
Download the metadata (2D bounding boxes) of the Human3.6M dataset (from Sun et al. ECCV 2018).
Place the data (or create symlinks) so that the data folder looks like:

${POSE_ROOT}
`-- data
    |-- mpii
    |   |-- annot
    |   |   |-- train.json
    |   |   `-- valid.json
    |   `-- images
    |       |-- 000001163.jpg
    |       `-- 000003072.jpg
    `-- h36m
        |-- ECCV18_Challenge
        |   |-- Train
        |   `-- Val
        `-- msra_cache
            |-- HM36_eccv_challenge_Train_cache
            |   `-- HM36_eccv_challenge_Train_w288xh384_keypoint_jnt_bbox_db.pkl
            `-- HM36_eccv_challenge_Val_cache
                `-- HM36_eccv_challenge_Val_w288xh384_keypoint_jnt_bbox_db.pkl
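
Before training, it can help to confirm that the folder matches the layout above. The following is a small sketch that lists missing entries. The relative paths are copied from the tree, while `EXPECTED_PATHS` and `missing_paths` are hypothetical names and not part of the repository. The demo uses a temporary directory as a stand-in for `${POSE_ROOT}`.

```python
import tempfile
from pathlib import Path

# Relative paths taken from the directory tree above (example images omitted).
EXPECTED_PATHS = [
    "data/mpii/annot/train.json",
    "data/mpii/annot/valid.json",
    "data/mpii/images",
    "data/h36m/ECCV18_Challenge/Train",
    "data/h36m/ECCV18_Challenge/Val",
    "data/h36m/msra_cache/HM36_eccv_challenge_Train_cache/"
    "HM36_eccv_challenge_Train_w288xh384_keypoint_jnt_bbox_db.pkl",
    "data/h36m/msra_cache/HM36_eccv_challenge_Val_cache/"
    "HM36_eccv_challenge_Val_w288xh384_keypoint_jnt_bbox_db.pkl",
]


def missing_paths(pose_root, expected=EXPECTED_PATHS):
    """Return the expected entries that do not exist under pose_root."""
    root = Path(pose_root)
    return [rel for rel in expected if not (root / rel).exists()]


# Demonstration: a temporary directory with only the MPII image folder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "data" / "mpii" / "images").mkdir(parents=True)
    for rel in missing_paths(tmp):
        print("missing:", rel)
```

Pointing `missing_paths` at your actual `${POSE_ROOT}` should return an empty list once the downloads and symlinks are in place.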

Stage1: Train 2D pose only. model, log

python main.py --exp_id mpii

Stage2: Train on 2D and 3D data without geometry loss (drop LR at 45 epochs). model, log

python main.py --exp_id fusion_3d --task human3d --dataset fusion_3d --ratio_3d 1 --weight_3d 0.1 --load_model ../exp/mpii/model_last.pth --num_epoch 60 --lr_step 45
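
The `--lr_step 45` flag corresponds to the learning-rate drop mentioned in the Stage2 description. As a rough sketch of a step schedule of this kind: the drop factor of 0.1, the base rate of 1e-3, and the choice to apply the drop starting at the step epoch are all assumptions made for illustration, not values confirmed by the text above.

```python
def stepped_lr(base_lr, epoch, lr_steps, factor=0.1):
    """Learning rate at a given epoch under a step schedule.

    base_lr: initial learning rate (assumed value in the demo below).
    lr_steps: epochs at which the rate is multiplied by `factor`.
    factor: multiplicative drop per step (0.1 is an assumption).
    """
    drops = sum(1 for step in lr_steps if epoch >= step)
    return base_lr * factor ** drops


# A 60-epoch run with one drop at epoch 45, as in the Stage2 command.
for epoch in (1, 44, 45, 60):
    print(epoch, stepped_lr(1e-3, epoch, [45]))
```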

Stage3: Train with geometry loss. model, log

python main.py --exp_id fusion_3d_var --task human3d --dataset fusion_3d --ratio_3d 1 --weight_3d 0.1 --weight_var 0.01 --load_model ../models/fusion_3d.pth  --num_epoch 10 --lr 1e-4
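
For intuition about the geometry loss weighted by `--weight_var`: as we read the paper, it penalizes the variance of the ratios between predicted and canonical bone lengths within groups of bones that should scale together (for example, left and right limbs). The sketch below illustrates that idea in plain Python. It is not the repository's loss code, which per the note above differs from the paper, and the joint indices, bone groups, and canonical lengths are made up.

```python
import math


def geometry_loss(joints, bone_groups, canonical_lengths):
    """Sum over bone groups of the variance of length ratios.

    joints: mapping from joint id to an (x, y, z) position.
    bone_groups: list of groups, each a list of (parent, child) joint pairs.
    canonical_lengths: mapping from (parent, child) to a reference length.
    """
    loss = 0.0
    for group in bone_groups:
        ratios = []
        for bone in group:
            parent, child = bone
            length = math.dist(joints[parent], joints[child])
            ratios.append(length / canonical_lengths[bone])
        mean_ratio = sum(ratios) / len(ratios)
        # Variance of the ratios within this group.
        loss += sum((r - mean_ratio) ** 2 for r in ratios) / len(ratios)
    return loss


# Toy skeleton: a root joint with a left and a right bone.
groups = [[(0, 1), (0, 2)]]
canonical = {(0, 1): 1.0, (0, 2): 1.0}
symmetric = {0: (0, 0, 0), 1: (1, 0, 0), 2: (-1, 0, 0)}
lopsided = {0: (0, 0, 0), 1: (1, 0, 0), 2: (-3, 0, 0)}
print(geometry_loss(symmetric, groups, canonical))
print(geometry_loss(lopsided, groups, canonical))
```

Because the loss needs only predicted joints and fixed reference lengths, it can be applied to images without 3D annotations, which is what makes the weak supervision possible.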

Citation

@InProceedings{Zhou_2017_ICCV,
author = {Zhou, Xingyi and Huang, Qixing and Sun, Xiao and Xue, Xiangyang and Wei, Yichen},
title = {Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2017}
}

Owner

Xingyi Zhou
