Official code release for 3DV 2021 paper Human Performance Capture from Monocular Video in the Wild.

Last update: Dec 24, 2022

Overview

Human Performance Capture from Monocular Video in the Wild

Paper | Video | Project Page

We propose a method that captures dynamic 3D human shape from a monocular video featuring challenging body poses, without any additional input.

If you find our code or paper useful, please cite as

@inproceedings{guo2021human,
  title={Human Performance Capture from Monocular Video in the Wild},
  author={Guo, Chen and Chen, Xu and Song, Jie and Hilliges, Otmar},
  booktitle={2021 International Conference on 3D Vision (3DV)},
  pages={889--898},
  year={2021},
  organization={IEEE}
}

Quick Start

Clone this repo and create the conda environment:

git clone https://github.com/MoyGcc/hpcwild.git
cd hpcwild
conda env create -f environment.yml
conda activate hpcwild

Additional Dependencies:

Kaolin 0.1.0 (https://github.com/NVIDIAGameWorks/kaolin)
MPI mesh library (https://github.com/MPI-IS/mesh)
torch-mesh-isect (https://github.com/vchoutas/torch-mesh-isect)

Download SMPL models (1.0.0 for Python 2.7 (10 shape PCs)) and move them to the corresponding places:

mkdir -p lib/smpl/smpl_model/ smpl_rendering/smpl_model/
mv /path/to/smpl/models/basicModel_f_lbs_10_207_0_v1.0.0.pkl smpl_rendering/smpl_model/SMPL_FEMALE.pkl
mv /path/to/smpl/models/basicmodel_m_lbs_10_207_0_v1.0.0.pkl smpl_rendering/smpl_model/SMPL_MALE.pkl

Download checkpoints for external modules:

wget https://download.01.org/opencv/openvino_training_extensions/models/human_pose_estimation/checkpoint_iter_370000.pth
mv /path/to/checkpoint_iter_370000.pth external/lightweight-human-pose-estimation.pytorch/checkpoint_iter_370000.pth

wget https://dl.fbaipublicfiles.com/pifuhd/checkpoints/pifuhd.pt -O pifuhd.pt
mv /path/to/pifuhd.pt external/pifuhd/checkpoints/pifuhd.pt

# Download IPNet weights
wget https://datasets.d2.mpi-inf.mpg.de/IPNet2020/IPNet_p5000_01_exp_id01.zip
unzip IPNet_p5000_01_exp_id01.zip
mv /path/to/IPNet_p5000_01_exp_id01 registration/experiments/IPNet_p5000_01_exp_id01

gdown --id 1mcr7ALciuAsHCpLnrtG_eop5-EYhbCmz -O modnet_photographic_portrait_matting.ckpt
mv /path/to/modnet_photographic_portrait_matting.ckpt external/MODNet/pretrained/modnet_photographic_portrait_matting.ckpt
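Before moving on, it can help to confirm that every checkpoint ended up where the commands above put it. The following is a minimal sketch, not part of the repository: `check_checkpoints` is a hypothetical helper, and the paths are simply the `mv` targets listed above. The SMPL model files are left out of the check.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repo): reports which of the
# checkpoint paths used in the download steps exist under a given root.
# The return status is the number of missing entries (0 means all present).
check_checkpoints() {
  local root="${1:-.}"
  local missing=0
  local f
  for f in \
    "external/lightweight-human-pose-estimation.pytorch/checkpoint_iter_370000.pth" \
    "external/pifuhd/checkpoints/pifuhd.pt" \
    "registration/experiments/IPNet_p5000_01_exp_id01" \
    "external/MODNet/pretrained/modnet_photographic_portrait_matting.ckpt"
  do
    if [ -e "$root/$f" ]; then
      echo "ok       $f"
    else
      echo "missing  $f"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Run from the repository root.
check_checkpoints . || echo "Some checkpoints are missing; repeat the corresponding download step."
```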

Test on 3DPW dataset

Download the 3DPW dataset.

Modify the dataset_path in test.conf.
Run bash mesh_recon.sh to obtain the rigid body shape.
Run bash registration.sh to register an SMPL+D model to the rigid human body.
Run bash tracking.sh to capture the human performance temporally.
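The three stages above build on each other, so it is convenient to run them in order and stop at the first failure. The sketch below assumes the scripts are launched from the repository root after dataset_path has been set in test.conf. `run_stages` is a hypothetical wrapper, not something the repository provides.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: runs each given script with bash, in order,
# and stops as soon as one of them exits with a non-zero status.
run_stages() {
  local stage
  for stage in "$@"; do
    echo "==> bash $stage"
    bash "$stage" || { echo "stage failed: $stage" >&2; return 1; }
  done
}

# Only start the pipeline when the stage scripts are present,
# i.e. when this is run from the repository root.
if [ -f mesh_recon.sh ] && [ -f registration.sh ] && [ -f tracking.sh ]; then
  run_stages mesh_recon.sh registration.sh tracking.sh
fi
```

Stopping early avoids running registration or tracking against a missing or partial result from the previous stage.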

Test on your own video

Run OpenPose to obtain the 2D keypoints.
Run LGD to acquire the initial 3D poses.
Run MODNet to extract silhouettes.
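As a rough illustration of the 2D keypoint and silhouette steps, the sketch below only prints candidate commands rather than running them. Several things in it are assumptions and not specified by this repository: the helper name `print_preprocess_cmds`, the frames/keypoints/masks output layout, the use of ffmpeg to extract frames, and the working directories (the OpenPose command is meant to run from an OpenPose build, the MODNet command from the MODNet checkout). The exact input format expected by the tracking code should be checked against the 3DPW example. The LGD step is left out because its invocation is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints (does not execute) preprocessing commands
# for one video. Paths and layout are illustrative assumptions.
print_preprocess_cmds() {
  local video="$1"
  local out="$2"
  # Extract frames so that image-based tools can read them.
  printf '%s\n' "ffmpeg -i $video $out/frames/%06d.png"
  # OpenPose: write per-frame 2D keypoints as JSON, with no GUI and no rendering.
  printf '%s\n' "./build/examples/openpose/openpose.bin --video $video --write_json $out/keypoints --display 0 --render_pose 0"
  # MODNet: predict a matte per frame using the pretrained checkpoint from Quick Start.
  printf '%s\n' "python -m demo.image_matting.colab.inference --input-path $out/frames --output-path $out/masks --ckpt-path pretrained/modnet_photographic_portrait_matting.ckpt"
}

print_preprocess_cmds my_video.mp4 data/my_video
```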

Acknowledgement

We use the code in PIFuHD for the rigid body construction and adapt IPNet for human model registration. We use the off-the-shelf methods OpenPose and MODNet to extract 2D keypoints and silhouettes. We sincerely thank these authors for their awesome work.

Owner

Chen Guo
