PIXIE: Collaborative Regression of Expressive Bodies

Related tags

Deep LearningPIXIE
Overview

PIXIE: Collaborative Regression of Expressive Bodies

[Project Page]

This is the official Pytorch implementation of PIXIE.

PIXIE reconstructs an expressive body with detailed face shape and hand articulation from a single image. PIXIE does this by regressing the body, face and hands directly from image pixels using a neural network that includes a novel moderator, which attends to add weights information about the different body parts. Unlike prior work, PIXIE estimates bodies with a gender-appropriate shape but does so in a gender neutral shape space to accommodate non-binary shapes. Please refer to the Paper for more details.

The main features of PIXIE are:

  • Expressive body estimation: Given a single image, PIXIE reconstructs the 3D body shape and pose, hand articulation and facial expression as SMPL-X parameters
  • Facial details: PIXIE extracts detailed face shape, including wrinkles, using DECA
  • Facial texture: PIXIE also returns a estimate of the albedo of the subject
  • Animation: The estimated body can be re-posed and animated
  • Robust: Tested on full-body images in unconstrained conditions. The moderation strategy prevents unnatural poses. Overall, our method is robust to: various poses, illumination conditions and occlusions
  • Accurate: state-of-the-art expressive body reconstruction
  • Fast: this is a direct regression method (pixels in, SMPL-X out)

Getting started

Please follow the installation instructions to install all necessary packages and download the data.

Demo

Expressive 3D body reconstruction

python demos/demo_fit_body.py --saveObj True 

This return the estimated 3D body geometry with texture, in the form of an obj file, and render it from multiple viewpoints. If you set the optional --deca_path argument then the result will also contain facial details from DECA, provided that the face moderator is confident enough. Please run python demos/demo_fit_body.py --help for a more detailed description of the various available options.

input body image, estimated 3D body, with facial details, with texture, different views

3D face reconstruction

python demos/demo_fit_face.py --saveObj True --showBody True

Note that, given only a face image, our method still regresses the full SMPL-X parameters, producing a body mesh (as shown in the rightmost image). Futher, note how different face shapes produce different body shapes. The face tells us a lot about the body.

input face image, estimated face, with facial details, with texture, whole body in T-pose

3D hand reconstruction

python demos/demo_fit_hand.py --saveObj True

We do not provide support for hand detection, please make sure that to pass hand-only images and flip horizontally all left hands.

input hand image, estimated hand, with texture(fixed texture).

Animation

python demos/demo_animate_body.py 

Bodies estimated by PIXIE are easily animated. For example, we can estimate the body from one image and animate with the poses regressed from a different image sequence.

The visualization contains the input image, the predicted expressive 3D body, the animation result, the reference video and its corresponding reconstruction. For the latter, the color of the hands and head represents the confidence of the corresponding moderators. A lighter color means that PIXIE trusts more the information of the body image rather than the parts, which can happen when a person is facing away from the camera for example.

Notes

You can find more details on our method, as well as a discussion of the limitations of PIXIE here.

Citation

If you find our work useful to your research, please consider citing:

@inproceedings{PIXIE:2021,
      title={Collaborative Regression of Expressive Bodies using Moderation}, 
      author={Yao Feng and Vasileios Choutas and Timo Bolkart and Dimitrios Tzionas and Michael J. Black},
      booktitle={International Conference on 3D Vision (3DV)},
      year={2021}
}

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE file. By downloading and using the code and model you agree to the terms in the LICENSE.

Acknowledgments

For functions or scripts that are based on external sources, we acknowledge the origin individually in each file.
Here are some great resources we benefit from:

We would also like to thank the authors of other public body regression methods, which allow us to easily perform quantitative and qualitative comparisons:
HMR, SPIN, frankmocap

Last but not least, we thank Victoria Fernández Abrevaya, Yinghao Huang and Radek Danecek for their helpful comments and proof reading, and Yuliang Xiu for his help in capturing demo sequences. This research was partially supported by the Max Planck ETH Center for Learning Systems. Some of the images used in the qualitative examples come from pexels.com.

Contact

For questions, please contact [email protected].
For commercial licensing (and all related questions for business applications), please contact [email protected].

Owner
Yao Feng
Yao Feng
YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks

YOLTv5 rapidly detects objects in arbitrarily large aerial or satellite images that far exceed the ~600×600 pixel size typically ingested by deep learning object detection frameworks.

Adam Van Etten 145 Jan 01, 2023
Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch

Lie Transformer - Pytorch (wip) Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch. Only the SE3 version will be present in thi

Phil Wang 78 Oct 26, 2022
The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding.

SuperGen The source code for Generating Training Data with Language Models: Towards Zero-Shot Language Understanding. Requirements Before running, you

Yu Meng 38 Dec 12, 2022
Neural Logic Inductive Learning

Neural Logic Inductive Learning This is the implementation of the Neural Logic Inductive Learning model (NLIL) proposed in the ICLR 2020 paper: Learn

36 Nov 28, 2022
[CVPR 2021] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

Scan2Cap: Context-aware Dense Captioning in RGB-D Scans Introduction We introduce the task of dense captioning in 3D scans from commodity RGB-D sensor

Dave Z. Chen 79 Nov 07, 2022
This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper

DeepShift This is project is the implementation of the DeepShift: Towards Multiplication-Less Neural Networks paper, that aims to replace multiplicati

Mostafa Elhoushi 88 Dec 23, 2022
COVID-Net Open Source Initiative

The COVID-Net models provided here are intended to be used as reference models that can be built upon and enhanced as new data becomes available

Linda Wang 1.1k Dec 26, 2022
Multi-robot collaborative exploration and mapping through Voronoi partition and DRL in unknown environment

Voronoi Multi_Robot Collaborate Exploration Introduction In the unknown environment, the cooperative exploration of multiple robots is completed by Vo

PeaceWord 6 Nov 22, 2022
MoveNetを用いたPythonでの姿勢推定のデモ

MoveNet-Python-Example MoveNetのPythonでの動作サンプルです。 ONNXに変換したモデルも同梱しています。変換自体を試したい方はMoveNet_tf2onnx.ipynbを使用ください。 2021/08/24時点でTensorFlow Hubで提供されている以下モデ

KazuhitoTakahashi 38 Dec 17, 2022
Official implementation for Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

Multi-modal Interaction Graph Convolutioal Network for Temporal Language Localization in Videos Official implementation for Multi-Modal Interaction Gr

Zongmeng Zhang 15 Oct 18, 2022
This repo contains research materials released by members of the Google Brain team in Tokyo.

Brain Tokyo Workshop 🧠 🗼 This repo contains research materials released by members of the Google Brain team in Tokyo. Past Projects Weight Agnostic

Google 1.2k Jan 02, 2023
Official implementation of "CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding" (CVPR, 2022)

CrossPoint: Self-Supervised Cross-Modal Contrastive Learning for 3D Point Cloud Understanding (CVPR'22) Paper Link | Project Page Abstract : Manual an

Mohamed Afham 152 Dec 23, 2022
shufflev2-yolov5:lighter, faster and easier to deploy

shufflev2-yolov5: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 1.7M (int8) and 3.3M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size

pogg 1.5k Jan 05, 2023
Repository for the NeurIPS 2021 paper: "Exploiting Domain-Specific Features to Enhance Domain Generalization".

meta-Domain Specific-Domain Invariant (mDSDI) Source code implementation for the paper: Manh-Ha Bui, Toan Tran, Anh Tuan Tran, Dinh Phung. "Exploiting

VinAI Research 12 Nov 25, 2022
Self-Supervised Multi-Frame Monocular Scene Flow (CVPR 2021)

Self-Supervised Multi-Frame Monocular Scene Flow 3D visualization of estimated depth and scene flow (overlayed with input image) from temporally conse

Visual Inference Lab @TU Darmstadt 85 Dec 22, 2022
git git《Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking》(CVPR 2021) GitHub:git2] 《Masksembles for Uncertainty Estimation》(CVPR 2021) GitHub:git3]

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking Ning Wang, Wengang Zhou, Jie Wang, and Houqiang Li Accepted by CVPR

NingWang 236 Dec 22, 2022
Rank1 Conversation Emotion Detection Task

Rank1-Conversation_Emotion_Detection_Task accuracy macro-f1 recall 0.826 0.7544 0.719 基于预训练模型和时序预测模型的对话情感探测任务 1 摘要 针对对话情感探测任务,本文将其分为文本分类和时间序列预测两个子任务,分

Yuchen Han 2 Nov 28, 2021
Metric learning algorithms in Python

metric-learn: Metric Learning in Python metric-learn contains efficient Python implementations of several popular supervised and weakly-supervised met

1.3k Dec 28, 2022
CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding This repo contains the data and source code for baseline models in the NeurIPS 2

Microsoft 29 Dec 29, 2022
MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift

MemStream Implementation of MemStream: Memory-Based Anomaly Detection in Multi-Aspect Streams with Concept Drift . Siddharth Bhatia, Arjit Jain, Shivi

Stream-AD 61 Dec 02, 2022