Human Dynamics from Monocular Video with Dynamic Camera Movements

Overview

Human Dynamics from Monocular Video with Dynamic Camera Movements

Ri Yu, Hwangpil Park and Jehee Lee

Seoul National University

ACM Transactions on Graphics, Volume 40, Number 6, Article 208. (SIGGRAPH Asia 2021)

Teaser Image

Abstract

We propose a new method that reconstructs 3D human motion from in-the wild video by making full use of prior knowledge on the laws of physics. Previous studies focus on reconstructing joint angles and positions in the body local coordinate frame. Body translations and rotations in the global reference frame are partially reconstructed only when the video has a static camera view. We are interested in overcoming this static view limitation to deal with dynamic view videos. The camera may pan, tilt, and zoom to track the moving subject. Since we do not assume any limitations on camera movements, body translations and rotations from the video do not correspond to absolute positions in the reference frame. The key technical challenge is inferring body translations and rotations from a sequence of 3D full-body poses, assuming the absence of root motion. This inference is possible because human motion obeys the law of physics. Our reconstruction algorithm produces a control policy that simulates 3D human motion imitating the one in the video. Our algorithm is particularly useful for reconstructing highly dynamic movements, such as sports, dance, gymnastics, and parkour actions.

Requirements

  • Ubuntu (tested on 18.04 LTS)

  • Python 3 (tested on version 3.6+)

  • Dart (modified version, see below)

  • Fltk 1.3.4.1

Installation

Dart

sudo apt install libeigen3-dev libassimp-dev libccd-dev libfcl-dev libboost-regex-dev libboost-system-dev libopenscenegraph-dev libnlopt-dev coinor-libipopt-dev libbullet-dev libode-dev liboctomap-dev libflann-dev libtinyxml2-dev liburdfdom-dev doxygen libxi-dev libxmu-dev liblz4-dev
git clone https://github.com/hpgit/dart-ltspd.git
cd dart-ltspd
mkdir build
cd build
cmake ..
make -j4
sudo make install

Pydart

sudo apt install swig

after virtual environment(venv) activates,

source venv/bin/activate
git clone https://github.com/hpgit/pydart2.git
cd pydart2
pip install pyopengl==3.1.0 pyopengl-accelerate==3.1.0
python setup.py build
python setup.py install

Fltk and Pyfltk

sudo apt install libfltk1.3-dev

Download pyfltk

cd ~/Downloads
tar xzf pyFltk-1.3.4.1_py3.tar
cd pyFltk-1.3.4.1_py3
python setup.py build
python setup.py install

misc

pip install pillow cvxopt scipy
cd PyCommon/modules/GUI
sudo apt install libgle3-dev

Run examples

source venv/bin/activate
export PYTHONPATH=$PWD
cd control/parkour1
python3 render_parkour1.py

Bibtex

@article{Yu:2021:MovingCam,
    author = {Yu, Ri and Park, Hwangpil and Lee, Jehee},
    title = {Human Dynamics from Monocular Video with Dynamic Camera Movements},
    journal = {ACM Trans. Graph.},
    volume = {40},
    number = {6},
    year = {2021},
    articleno = {208}
}
PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short

77 Dec 16, 2022
Conceptual 12M is a dataset containing (image-URL, caption) pairs collected for vision-and-language pre-training.

Conceptual 12M We introduce the Conceptual 12M (CC12M), a dataset with ~12 million image-text pairs meant to be used for vision-and-language pre-train

Google Research Datasets 226 Dec 07, 2022
The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track.

ISC21-Descriptor-Track-1st The 1st Place Solution of the Facebook AI Image Similarity Challenge (ISC21) : Descriptor Track. You can check our solution

lyakaap 75 Jan 08, 2023
Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples

Source codes for Improved Few-Shot Visual Classification (CVPR 2020), Enhancing Few-Shot Image Classification with Unlabelled Examples (WACV 2022) and Beyond Simple Meta-Learning: Multi-Purpose Model

PLAI Group at UBC 42 Dec 06, 2022
Shōgun

The SHOGUN machine learning toolbox Unified and efficient Machine Learning since 1999. Latest release: Cite Shogun: Develop branch build status: Donat

Shōgun ML 2.9k Jan 04, 2023
Library for machine learning stacking generalization.

stacked_generalization Implemented machine learning *stacking technic[1]* as handy library in Python. Feature weighted linear stacking is also availab

114 Jul 19, 2022
MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution

Octave Convolution MXNet implementation for: Drop an Octave: Reducing Spatial Redundancy in Convolutional Neural Networks with Octave Convolution Imag

Meta Research 549 Dec 28, 2022
Implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

SemCo The official pytorch implementation of the paper All Labels Are Not Created Equal: Enhancing Semi-supervision via Label Grouping and Co-training

42 Nov 14, 2022
In this project, we develop a face recognize platform based on MTCNN object-detection netcwork and FaceNet self-supervised network.

模式识别大作业——人脸检测与识别平台 本项目是一个简易的人脸检测识别平台,提供了人脸信息录入和人脸识别的功能。前端采用 html+css+js,后端采用 pytorch,

Xuhua Huang 5 Aug 02, 2022
This computer program provides a reference implementation of Lagrangian Monte Carlo in metric induced by the Monge patch

This computer program provides a reference implementation of Lagrangian Monte Carlo in metric induced by the Monge patch. The code was prepared to the final version of the accepted manuscript in AIST

Marcelo Hartmann 2 May 06, 2022
A particular navigation route using satellite feed and can help in toll operations & traffic managemen

How about adding some info that can quanitfy the stress on a particular navigation route using satellite feed and can help in toll operations & traffic management The current analysis is on the satel

Ashish Pandey 1 Feb 14, 2022
YOLOv5 detection interface - PyQt5 implementation

所有代码已上传,直接clone后,运行yolo_win.py即可开启界面。 2021/9/29:加入置信度选择 界面是在ultralytics的yolov5基础上建立的,界面使用pyqt5实现,内容较简单,娱乐而已。 功能: 模型选择 本地文件选择(视频图片均可) 开关摄像头

487 Dec 27, 2022
50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program

50-days-of-Statistics-for-Data-Science - This repository consist of a 50-day program. All the statistics required for the complete understanding of data science will be uploaded in this repository.

komal_lamba 22 Dec 09, 2022
PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021)

PyTorch code for the paper "Curriculum Graph Co-Teaching for Multi-target Domain Adaptation" (CVPR2021) This repo presents PyTorch implementation of M

Evgeny 79 Dec 19, 2022
Implementation of OpenAI paper with Simple Noise Scale on Fastai V2

README Implementation of OpenAI paper "An Empirical Model of Large-Batch Training" for Fastai V2. The code is based on the batch size finder implement

13 Dec 10, 2021
PyTorch Language Model for 1-Billion Word (LM1B / GBW) Dataset

PyTorch Large-Scale Language Model A Large-Scale PyTorch Language Model trained on the 1-Billion Word (LM1B) / (GBW) dataset Latest Results 39.98 Perp

Ryan Spring 114 Nov 04, 2022
This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR

This project is a re-implementation of MASTER: Multi-Aspect Non-local Network for Scene Text Recognition by MMOCR,which is an open-source toolbox based on PyTorch. The overall architecture will be sh

Jianquan Ye 82 Nov 17, 2022
A light and fast one class detection framework for edge devices. We provide face detector, head detector, pedestrian detector, vehicle detector......

A Light and Fast Face Detector for Edge Devices Big News: LFD, which is a big update of LFFD, now is released (2021.03.09). It is strongly recommended

YonghaoHe 1.3k Dec 25, 2022
StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators

StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators [Project Website] [Replicate.ai Project] StyleGAN-NADA: CLIP-Guided Domain Adaptation

992 Dec 30, 2022
Recognize Handwritten Digits using Deep Learning on the browser itself.

MNIST on the Web An attempt to predict MNIST handwritten digits from my PyTorch model from the browser (client-side) and not from the server, with the

Harjyot Bagga 7 May 28, 2022