Official Pytorch implementation of 6DRepNet: 6D Rotation representation for unconstrained head pose estimation.

Overview

PWC PWC Hugging Face Spaces

6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch)

animated

Paper

Thorsten Hempel and Ahmed A. Abdelrahman and Ayoub Al-Hamadi, "6D Rotation Representation for Unconstrained Head Pose Estimation", submitted to ICIP 2022. [ResearchGate][Arxiv]

Abstract

In this paper, we present a method for unconstrained end-to-end head pose estimation. We address the problem of ambiguous rotation labels by introducing the rotation matrix formalism for our ground truth data and propose a continuous 6D rotation matrix representation for efficient and robust direct regression. This way, our method can learn the full rotation appearance which is contrary to previous approaches that restrict the pose prediction to a narrow-angle for satisfactory results. In addition, we propose a geodesic distance-based loss to penalize our network with respect to the manifold geometry. Experiments on the public AFLW2000 and BIWI datasets demonstrate that our proposed method significantly outperforms other state-of-the-art methods by up to 20%.


Trained on 300W-LP, Test on AFLW2000 and BIWI

Full Range Yaw Pitch Roll MAE Yaw Pitch Roll MAE
HopeNet ( =2) N 6.47 6.56 5.44 6.16 5.17 6.98 3.39 5.18
HopeNet ( =1) N 6.92 6.64 5.67 6.41 4.81 6.61 3.27 4.90
FSA-Net N 4.50 6.08 4.64 5.07 4.27 4.96 2.76 4.00
HPE N 4.80 6.18 4.87 5.28 3.12 5.18 4.57 4.29
QuatNet N 3.97 5.62 3.92 4.50 2.94 5.49 4.01 4.15
WHENet-V N 4.44 5.75 4.31 4.83 3.60 4.10 2.73 3.48
WHENet Y/N 5.11 6.24 4.92 5.42 3.99 4.39 3.06 3.81
TriNet Y 4.04 5.77 4.20 4.67 4.11 4.76 3.05 3.97
FDN N 3.78 5.61 3.88 4.42 4.52 4.70 2.56 3.93
6DRepNet Y 3.63 4.91 3.37 3.97 3.24 4.48 2.68 3.47

BIWI 70/30

Yaw Pitch Roll MAE
HopeNet ( =1) 3.29 3.39 3.00 3.23
FSA-Net 2.89 4.29 3.60 3.60
TriNet 2.93 3.04 2.44 2.80
FDN 3.00 3.98 2.88 3.29
6DRepNet 2.69 2.92 2.36 2.66

Fine-tuned Models

Fine-tuned models can be download from here: https://drive.google.com/drive/folders/1V1pCV0BEW3mD-B9MogGrz_P91UhTtuE_?usp=sharing

Quick Start:

git clone https://github.com/thohemp/6DRepNet
cd 6DRepNet

Set up a virtual environment:

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt  # Install required packages

In order to run the demo scripts you need to install the face detector

pip install git+https://github.com/elliottzheng/[email protected]

Camera Demo:

python demo.py  --snapshot 6DRepNet_300W_LP_AFLW2000.pth \
                --cam 0

Test/Train 3DRepNet

Preparing datasets

Download datasets:

  • 300W-LP, AFLW2000 from here.

  • BIWI (Biwi Kinect Head Pose Database) from here

Store them in the datasets directory.

For 300W-LP and AFLW2000 we need to create a filenamelist.

python create_filename_list.py --root_dir datasets/300W_LP

The BIWI datasets needs be preprocessed by a face detector to cut out the faces from the images. You can use the script provided here. For 7:3 splitting of the BIWI dataset you can use the equivalent script here. We set the cropped image size to 256.

Testing:

python test.py  --batch_size 64 \
                --dataset ALFW2000 \
                --data_dir datasets/AFLW2000 \
                --filename_list datasets/AFLW2000/files.txt \
                --snapshot output/snapshots/1.pth \
                --show_viz False 

Training

Download pre-trained RepVGG model 'RepVGG-B1g2-train.pth' from here and save it in the root directory.

python train.py --batch_size 64 \
                --num_epochs 30 \
                --lr 0.00001 \
                --dataset Pose_300W_LP \
                --data_dir datasets/300W_LP \
                --filename_list datasets/300W_LP/files.txt

Deploy models

For reparameterization the trained models into inference-models use the convert script.

python convert.py input-model.tar output-model.pth

Inference-models are loaded with the flag deploy=True.

model = SixDRepNet(backbone_name='RepVGG-B1g2',
                    backbone_file='',
                    deploy=True,
                    pretrained=False)

Citing

If you find our work useful, please cite the paper:

@misc{hempel20226d,
      title={6D Rotation Representation For Unconstrained Head Pose Estimation}, 
      author={Thorsten Hempel and Ahmed A. Abdelrahman and Ayoub Al-Hamadi},
      year={2022},
      eprint={2202.12555},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Thorsten Hempel
Computer Vision, Robotics
Thorsten Hempel
Wenzhou-Kean University AI-LAB

AI-LAB This is Wenzhou-Kean University AI-LAB. Our research interests are in Computer Vision and Natural Language Processing. Computer Vision Please g

WKU AI-LAB 10 May 05, 2022
Bayesian Optimization using GPflow

Note: This package is for use with GPFlow 1. For Bayesian optimization using GPFlow 2 please see Trieste, a joint effort with Secondmind. GPflowOpt GP

GPflow 257 Dec 26, 2022
Official source code of Fast Point Transformer, CVPR 2022

Fast Point Transformer Project Page | Paper This repository contains the official source code and data for our paper: Fast Point Transformer Chunghyun

182 Dec 23, 2022
[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators

AMOS This repository contains the scripts for fine-tuning AMOS pretrained models on GLUE and SQuAD 2.0 benchmarks. Paper: Pretraining Text Encoders wi

Microsoft 22 Sep 15, 2022
Anomaly detection in multi-agent trajectories: Code for training, evaluation and the OpenAI highway simulation.

Anomaly Detection in Multi-Agent Trajectories for Automated Driving This is the official project page including the paper, code, simulation, baseline

12 Dec 02, 2022
git《Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction》(ECCV 2020) GitHub:

Learning Pairwise Inter-Plane Relations for Piecewise Planar Reconstruction Code for the ECCV 2020 paper by Yiming Qian and Yasutaka Furukawa Getting

37 Dec 04, 2022
Boosted neural network for tabular data

XBNet - Xtremely Boosted Network Boosted neural network for tabular data XBNet is an open source project which is built with PyTorch which tries to co

Tushar Sarkar 175 Jan 04, 2023
This is the code repository for the paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (NeurIPS 2021).

Code Repository for the Paper "Identification of the Generalized Condorcet Winner in Multi-dueling Bandits" (To appear in: Proceedings of NeurIPS20

1 Oct 03, 2022
Tensorflow port of a full NetVLAD network

netvlad_tf The main intention of this repo is deployment of a full NetVLAD network, which was originally implemented in Matlab, in Python. We provide

Robotics and Perception Group 225 Nov 08, 2022
Official implementation for "Low-light Image Enhancement via Breaking Down the Darkness"

Low-light Image Enhancement via Breaking Down the Darkness by Qiming Hu, Xiaojie Guo. 1. Dependencies Python3 PyTorch=1.0 OpenCV-Python, TensorboardX

Qiming Hu 30 Jan 01, 2023
SMCA replication There are no extra compiled components in SMCA DETR and package dependencies are minimal

Usage There are no extra compiled components in SMCA DETR and package dependencies are minimal, so the code is very simple to use. We provide instruct

22 May 06, 2022
Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection

fpn.pytorch Pytorch implementation of Feature Pyramid Network (FPN) for Object Detection Introduction This project inherits the property of our pytorc

Jianwei Yang 912 Dec 21, 2022
Anime Face Detector using mmdet and mmpose

Anime Face Detector This is an anime face detector using mmdetection and mmpose. (To avoid copyright issues, I use generated images by the TADNE model

198 Jan 07, 2023
GuideDog is an AI/ML-based mobile app designed to assist the lives of the visually impaired, 100% voice-controlled

Guidedog Authors: Kyuhee Jo, Steven Gunarso, Jacky Wang, Raghav Sharma GuideDog is an AI/ML-based mobile app designed to assist the lives of the visua

Kyuhee Jo 5 Nov 24, 2021
Quantized tflite models for ailia TFLite Runtime

ailia-models-tflite Quantized tflite models for ailia TFLite Runtime About ailia TFLite Runtime ailia TF Lite Runtime is a TensorFlow Lite compatible

ax Inc. 13 Dec 23, 2022
Happywhale - Whale and Dolphin Identification Silver🥈 Solution (26/1588)

Kaggle-Happywhale Happywhale - Whale and Dolphin Identification Silver 🥈 Solution (26/1588) 竞赛方案思路 图像数据预处理-标志性特征图片裁剪:首先根据开源的标注数据训练YOLOv5x6目标检测模型,将训练集

Franxx 20 Nov 14, 2022
An official source code for "Augmentation-Free Self-Supervised Learning on Graphs"

Augmentation-Free Self-Supervised Learning on Graphs An official source code for Augmentation-Free Self-Supervised Learning on Graphs paper, accepted

Namkyeong Lee 59 Dec 01, 2022
Implementation of our paper "DMT: Dynamic Mutual Training for Semi-Supervised Learning"

DMT: Dynamic Mutual Training for Semi-Supervised Learning This repository contains the code for our paper DMT: Dynamic Mutual Training for Semi-Superv

Zhengyang Feng 120 Dec 30, 2022
Simple Python project using Opencv and datetime package to recognise faces and log attendance data in a csv file.

Attendance-System-based-on-Facial-recognition-Attendance-data-stored-in-csv-file- Simple Python project using Opencv and datetime package to recognise

3 Aug 09, 2022
A smart Chat bot that can help to know about corona virus and Make prediction of corona using X-ray.

TRINIT_Hum_kuchh_nahi_karenge_ML01 Document Link https://github.com/Jatin-Goyal-552/TRINIT_Hum_kuchh_nahi_karenge_ML01/blob/main/hum_kuchh_nahi_kareng

JatinGoyal 1 Feb 03, 2022