UDP++ (ECCVW 2020 Oral), (Winner of COCO 2020 Keypoint Challenge).

Overview

UDP-Pose

This is the pytorch implementation for UDP++, which won the Fisrt place in COCO Keypoint Challenge at ECCV 2020 Workshop. Illustrating the performance of the proposed UDP

Top-Down

Results on MPII val dataset

Method--- Head Sho. Elb. Wri. Hip Kne. Ank. Mean Mean 0.1
HRNet32 97.1 95.9 90.3 86.5 89.1 87.1 83.3 90.3 37.7
+Dark 97.2 95.9 91.2 86.7 89.7 86.7 84.0 90.6 42.0
+UDP 97.4 96.0 91.0 86.5 89.1 86.6 83.3 90.4 42.1

Results on COCO val2017 with detector having human AP of 65.1 on COCO val2017 dataset

Arch Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L) AR
pose_resnet_50 256x192 34.0M 8.90 71.3 89.9 78.9 68.3 77.4 76.9
+UDP 256x192 34.2M 8.96 72.9 90.0 80.2 69.7 79.3 78.2
pose_resnet_50 384x288 34.0M 20.0 73.2 90.7 79.9 69.4 80.1 78.2
+UDP 384x288 34.2M 20.1 74.0 90.3 80.0 70.2 81.0 79.0
pose_resnet_152 256x192 68.6M 15.7 72.9 90.6 80.8 69.9 79.0 78.3
+UDP 256x192 68.8M 15.8 74.3 90.9 81.6 71.2 80.6 79.6
pose_resnet_152 384x288 68.6M 35.6 75.3 91.0 82.3 71.9 82.0 80.4
+UDP 384x288 68.8M 35.7 76.2 90.8 83.0 72.8 82.9 81.2
pose_hrnet_w32 256x192 28.5M 7.10 75.6 91.9 83.0 72.2 81.6 80.5
+UDP 256x192 28.7M 7.16 76.8 91.9 83.7 73.1 83.3 81.6
+UDPv1 256x192 28.7M 7.16 77.2 91.6 84.2 73.7 83.7 82.5
+UDPv1+AID 256x192 28.7M 7.16 77.9 92.1 84.5 74.1 84.1 82.8
RSN18+UDP 256x192 - 2.5 74.7 - - - - -
pose_hrnet_w32 384x288 28.5M 16.0 76.7 91.9 83.6 73.2 83.2 81.6
+UDP 384x288 28.7M 16.1 77.8 91.7 84.5 74.2 84.3 82.4
pose_hrnet_w48 256x192 63.6M 14.6 75.9 91.9 83.5 72.6 82.1 80.9
+UDP 256x192 63.8M 14.7 77.2 91.8 83.7 73.8 83.7 82.0
pose_hrnet_w48 384x288 63.6M 32.9 77.1 91.8 83.8 73.5 83.5 81.8
+UDP 384x288 63.8M 33.0 77.8 92.0 84.3 74.2 84.5 82.5

Note:

  • Flip test is used.
  • Person detector has person AP of 65.1 on COCO val2017 dataset.
  • GFLOPs is for convolution and linear layers only.
  • UDPv1: v0:LOSS.KPD=4.0, v1:LOSS.KPD=3.5

Results on COCO test-dev with detector having human AP of 65.1 on COCO val2017 dataset

Arch Input size #Params GFLOPs AP Ap .5 AP .75 AP (M) AP (L) AR
pose_resnet_50 256x192 34.0M 8.90 70.2 90.9 78.3 67.1 75.9 75.8
+UDP 256x192 34.2M 8.96 71.7 91.1 79.6 68.6 77.5 77.2
pose_resnet_50 384x288 34.0M 20.0 71.3 91.0 78.5 67.3 77.9 76.6
+UDP 384x288 34.2M 20.1 72.5 91.1 79.7 68.8 79.1 77.9
pose_resnet_152 256x192 68.6M 15.7 71.9 91.4 80.1 68.9 77.4 77.5
+UDP 256x192 68.8M 15.8 72.9 91.6 80.9 70.0 78.5 78.4
pose_resnet_152 384x288 68.6M 35.6 73.8 91.7 81.2 70.3 80.0 79.1
+UDP 384x288 68.8M 35.7 74.7 91.8 82.1 71.5 80.8 80.0
pose_hrnet_w32 256x192 28.5M 7.10 73.5 92.2 82.0 70.4 79.0 79.0
+UDP 256x192 28.7M 7.16 75.2 92.4 82.9 72.0 80.8 80.4
pose_hrnet_w32 384x288 28.5M 16.0 74.9 92.5 82.8 71.3 80.9 80.1
+UDP 384x288 28.7M 16.1 76.1 92.5 83.5 72.8 82.0 81.3
pose_hrnet_w48 256x192 63.6M 14.6 74.3 92.4 82.6 71.2 79.6 79.7
+UDP 256x192 63.8M 14.7 75.7 92.4 83.3 72.5 81.4 80.9
pose_hrnet_w48 384x288 63.6M 32.9 75.5 92.5 83.3 71.9 81.5 80.5
+UDP 384x288 63.8M 33.0 76.5 92.7 84.0 73.0 82.4 81.6

Note:

  • Flip test is used.
  • Person detector has person AP of 65.1 on COCO val2017 dataset.
  • GFLOPs is for convolution and linear layers only.

Bottom-Up

HRNet

Arch P2I Input size Speed(task/s) AP Ap .5 AP .75 AP (M) AP (L) AR
HRNet(ori) T 512x512 - 64.4 - - 57.1 75.6 -
HRNet(mmpose) F 512x512 39.5 65.8 86.3 71.8 59.2 76.0 70.7
HRNet(mmpose) T 512x512 6.8 65.3 86.2 71.5 58.6 75.7 70.9
HRNet+UDP T 512x512 5.8 65.9 86.2 71.8 59.4 76.0 71.4
HRNet+UDP F 512x512 37.2 67.0 86.2 72.0 60.7 76.7 71.6
HRNet+UDP+AID F 512x512 37.2 68.4 88.1 74.9 62.7 77.1 73.0

HigherHRNet

Arch P2I Input size Speed(task/s) AP Ap .5 AP .75 AP (M) AP (L) AR
HigherHRNet(ori) T 512x512 - 67.1 - - 61.5 76.1 -
HigherHRNet T 512x512 9.4 67.2 86.1 72.9 61.8 76.1 72.2
HigherHRNet+UDP T 512x512 9.0 67.6 86.1 73.7 62.2 76.2 72.4
HigherHRNet F 512x512 24.1 67.1 86.1 73.6 61.7 75.9 72.0
HigherHRNet+UDP F 512x512 23.0 67.6 86.2 73.8 62.2 76.2 72.4
HigherHRNet+UDP+AID F 512x512 23.0 69.0 88.0 74.9 64.0 76.9 73.8

Note:

  • ori : Result from original HigherHrnet
  • mmpose : Pretrained models from mmpose
  • P2I : PROJECT2IMAGE
  • we use mmpose for codebase
  • the configurations of the baseline are HRNet-W32-512x512-batch16-lr0.001
  • Speed is tested with dist_test in mmpose codebase and 8 Gpus + 16 batchsize

Quick Start

(Recommend) For mmpose, please refer to MMPose

For hrnet, please refer to Hrnet

For RSN, please refer to RSN

Data preparation For coco, we provide the human detection result and pretrained model at BaiduDisk(dsa9)

Citation

If you use our code or models in your research, please cite with:

@inproceedings{cai2020learning,
  title={Learning Delicate Local Representations for Multi-Person Pose Estimation},
  author={Yuanhao Cai and Zhicheng Wang and Zhengxiong Luo and Binyi Yin and Angang Du and Haoqian Wang and Xinyu Zhou and Erjin Zhou and Xiangyu Zhang and Jian Sun},
  booktitle={ECCV},
  year={2020}
}
@article{huang2020joint,
  title={Joint coco and lvis workshop at eccv 2020: Coco keypoint challenge track technical report: Udp+},
  author={Huang, Junjie and Shan, Zengguang and Cai, Yuanhao and Guo, Feng and Ye, Yun and Chen, Xinze and Zhu, Zheng and Huang, Guan and Lu, Jiwen and Du, Dalong},
  year={2020}
}
Owner
Tsinghua University, Megvii Inc [email protected]
Turning pixels into virtual points for multimodal 3D object detection.

Multimodal Virtual Point 3D Detection Turning pixels into virtual points for multimodal 3D object detection. Multimodal Virtual Point 3D Detection, Ti

Tianwei Yin 204 Jan 08, 2023
Survival analysis (SA) is a well-known statistical technique for the study of temporal events.

DAGSurv Survival analysis (SA) is a well-known statistical technique for the study of temporal events. In SA, time-to-an-event data is modeled using a

Rahul Kukreja 1 Sep 05, 2022
The fastest way to visualize GradCAM with your Keras models.

VizGradCAM VizGradCam is the fastest way to visualize GradCAM in Keras models. GradCAM helps with providing visual explainability of trained models an

58 Nov 19, 2022
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search

Breaking the Curse of Space Explosion: Towards Effcient NAS with Curriculum Search Pytorch implementation for "Breaking the Curse of Space Explosion:

guoyong 17 Jan 03, 2023
Official Repo for ICCV2021 Paper: Learning to Regress Bodies from Images using Differentiable Semantic Rendering

[ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering Getting Started DSR has been implemented and tested on Ubunt

Sai Kumar Dwivedi 83 Nov 27, 2022
A pytorch implementation of Detectron. Both training from scratch and inferring directly from pretrained Detectron weights are available.

Use this instead: https://github.com/facebookresearch/maskrcnn-benchmark A Pytorch Implementation of Detectron Example output of e2e_mask_rcnn-R-101-F

Roy 2.8k Dec 29, 2022
Paddle implementation for "Highly Efficient Knowledge Graph Embedding Learning with Closed-Form Orthogonal Procrustes Analysis" (NAACL 2021)

ProcrustEs-KGE Paddle implementation for Highly Efficient Knowledge Graph Embedding Learning with Orthogonal Procrustes Analysis 🙈 A more detailed re

Lincedo Lab 4 Jun 09, 2021
This repository contains the implementation of Deep Detail Enhancment for Any Garment proposed in Eurographics 2021

Deep-Detail-Enhancement-for-Any-Garment Introduction This repository contains the implementation of Deep Detail Enhancment for Any Garment proposed in

40 Dec 13, 2022
Code for pre-training CharacterBERT models (as well as BERT models).

Pre-training CharacterBERT (and BERT) This is a repository for pre-training BERT and CharacterBERT. DISCLAIMER: The code was largely adapted from an o

Hicham EL BOUKKOURI 31 Dec 05, 2022
Rename Images with Auto Generated Neural Image Captions

Recaption Images with Generated Neural Image Caption Example Usage: Commandline: Recaption all images from folder /home/feng/Downloads/images to folde

feng wang 3 May 01, 2022
Deep Learning Algorithms for Hedging with Frictions

Deep Learning Algorithms for Hedging with Frictions This repository contains the Forward-Backward Stochastic Differential Equation (FBSDE) solver and

Xiaofei Shi 3 Dec 22, 2022
Realtime_Multi-Person_Pose_Estimation

Introduction Multi Person PoseEstimation By PyTorch Results Require Pytorch Installation git submodule init && git submodule update Demo Download conv

tensorboy 1.3k Jan 05, 2023
Core ML tools contain supporting tools for Core ML model conversion, editing, and validation.

Core ML Tools Use coremltools to convert machine learning models from third-party libraries to the Core ML format. The Python package contains the sup

Apple 3k Jan 08, 2023
AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning (NeurIPS 2020) Introduction AdaShare is a novel and differentiable approach fo

94 Dec 22, 2022
[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning | 斗地主AI

[ICML 2021] DouZero: Mastering DouDizhu with Self-Play Deep Reinforcement Learning DouZero is a reinforcement learning framework for DouDizhu (斗地主), t

Kwai Inc. 3.1k Jan 04, 2023
CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped

CSWin-Transformer This repo is the official implementation of "CSWin Transformer: A General Vision Transformer Backbone with Cross-Shaped Windows". Th

Microsoft 409 Jan 06, 2023
A very tiny, very simple, and very secure file encryption tool.

Picocrypt is a very tiny (hence "Pico"), very simple, yet very secure file encryption tool. It uses the modern ChaCha20-Poly1305 cipher suite as well

Evan Su 1k Dec 30, 2022
PyTorch Implementation of "Light Field Image Super-Resolution with Transformers"

LFT PyTorch implementation of "Light Field Image Super-Resolution with Transformers", arXiv 2021. [pdf]. Contributions: We make the first attempt to a

Squidward 62 Nov 28, 2022
Template repository to build PyTorch projects from source on any version of PyTorch/CUDA/cuDNN.

The Ultimate PyTorch Source-Build Template Translations: 한국어 TL;DR PyTorch built from source can be x4 faster than a naïve PyTorch install. This repos

Joonhyung Lee/이준형 651 Dec 12, 2022
Fog Simulation on Real LiDAR Point Clouds for 3D Object Detection in Adverse Weather

LiDAR fog simulation Created by Martin Hahner at the Computer Vision Lab of ETH Zurich. This is the official code release of the paper Fog Simulation

Martin Hahner 110 Dec 30, 2022