Object detection, 3D detection, and pose estimation using center point detection:

Overview

Objects as Points

Object detection, 3D detection, and pose estimation using center point detection:

Objects as Points,
Xingyi Zhou, Dequan Wang, Philipp Krähenbühl,
arXiv technical report (arXiv 1904.07850)

Contact: [email protected]. Any questions or discussions are welcomed!

Updates

  • (June, 2020) We released a state-of-the-art Lidar-based 3D detection and tracking framework CenterPoint.
  • (April, 2020) We released a state-of-the-art (multi-category-/ pose-/ 3d-) tracking extension CenterTrack.

Abstract

Detection identifies objects as axis-aligned boxes in an image. Most successful object detectors enumerate a nearly exhaustive list of potential object locations and classify each. This is wasteful, inefficient, and requires additional post-processing. In this paper, we take a different approach. We model an object as a single point -- the center point of its bounding box. Our detector uses keypoint estimation to find center points and regresses to all other object properties, such as size, 3D location, orientation, and even pose. Our center point based approach, CenterNet, is end-to-end differentiable, simpler, faster, and more accurate than corresponding bounding box based detectors. CenterNet achieves the best speed-accuracy trade-off on the MS COCO dataset, with 28.1% AP at 142 FPS, 37.4% AP at 52 FPS, and 45.1% AP with multi-scale testing at 1.4 FPS. We use the same approach to estimate 3D bounding box in the KITTI benchmark and human pose on the COCO keypoint dataset. Our method performs competitively with sophisticated multi-stage methods and runs in real-time.

Highlights

  • Simple: One-sentence method summary: use keypoint detection technic to detect the bounding box center point and regress to all other object properties like bounding box size, 3d information, and pose.

  • Versatile: The same framework works for object detection, 3d bounding box estimation, and multi-person pose estimation with minor modification.

  • Fast: The whole process in a single network feedforward. No NMS post processing is needed. Our DLA-34 model runs at 52 FPS with 37.4 COCO AP.

  • Strong: Our best single model achieves 45.1AP on COCO test-dev.

  • Easy to use: We provide user friendly testing API and webcam demos.

Main results

Object Detection on COCO validation

Backbone AP / FPS Flip AP / FPS Multi-scale AP / FPS
Hourglass-104 40.3 / 14 42.2 / 7.8 45.1 / 1.4
DLA-34 37.4 / 52 39.2 / 28 41.7 / 4
ResNet-101 34.6 / 45 36.2 / 25 39.3 / 4
ResNet-18 28.1 / 142 30.0 / 71 33.2 / 12

Keypoint detection on COCO validation

Backbone AP FPS
Hourglass-104 64.0 6.6
DLA-34 58.9 23

3D bounding box detection on KITTI validation

Backbone FPS AP-E AP-M AP-H AOS-E AOS-M AOS-H BEV-E BEV-M BEV-H
DLA-34 32 96.9 87.8 79.2 93.9 84.3 75.7 34.0 30.5 26.8

All models and details are available in our Model zoo.

Installation

Please refer to INSTALL.md for installation instructions.

Use CenterNet

We support demo for image/ image folder, video, and webcam.

First, download the models (By default, ctdet_coco_dla_2x for detection and multi_pose_dla_3x for human pose estimation) from the Model zoo and put them in CenterNet_ROOT/models/.

For object detection on images/ video, run:

python demo.py ctdet --demo /path/to/image/or/folder/or/video --load_model ../models/ctdet_coco_dla_2x.pth

We provide example images in CenterNet_ROOT/images/ (from Detectron). If set up correctly, the output should look like

For webcam demo, run

python demo.py ctdet --demo webcam --load_model ../models/ctdet_coco_dla_2x.pth

Similarly, for human pose estimation, run:

python demo.py multi_pose --demo /path/to/image/or/folder/or/video/or/webcam --load_model ../models/multi_pose_dla_3x.pth

The result for the example images should look like:

You can add --debug 2 to visualize the heatmap outputs. You can add --flip_test for flip test.

To use this CenterNet in your own project, you can

import sys
CENTERNET_PATH = /path/to/CenterNet/src/lib/
sys.path.insert(0, CENTERNET_PATH)

from detectors.detector_factory import detector_factory
from opts import opts

MODEL_PATH = /path/to/model
TASK = 'ctdet' # or 'multi_pose' for human pose estimation
opt = opts().init('{} --load_model {}'.format(TASK, MODEL_PATH).split(' '))
detector = detector_factory[opt.task](opt)

img = image/or/path/to/your/image/
ret = detector.run(img)['results']

ret will be a python dict: {category_id : [[x1, y1, x2, y2, score], ...], }

Benchmark Evaluation and Training

After installation, follow the instructions in DATA.md to setup the datasets. Then check GETTING_STARTED.md to reproduce the results in the paper. We provide scripts for all the experiments in the experiments folder.

Develop

If you are interested in training CenterNet in a new dataset, use CenterNet in a new task, or use a new network architecture for CenterNet, please refer to DEVELOP.md. Also feel free to send us emails for discussions or suggestions.

Third-party resources

License

CenterNet itself is released under the MIT License (refer to the LICENSE file for details). Portions of the code are borrowed from human-pose-estimation.pytorch (image transform, resnet), CornerNet (hourglassnet, loss functions), dla (DLA network), DCNv2(deformable convolutions), tf-faster-rcnn(Pascal VOC evaluation) and kitti_eval (KITTI dataset evaluation). Please refer to the original License of these projects (See NOTICE).

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{zhou2019objects,
  title={Objects as Points},
  author={Zhou, Xingyi and Wang, Dequan and Kr{\"a}henb{\"u}hl, Philipp},
  booktitle={arXiv preprint arXiv:1904.07850},
  year={2019}
}
Owner
Xingyi Zhou
CS Ph.D. student at UT Austin.
Xingyi Zhou
Api for getting bin info and getting encrypted card details for adyen.

Bin Info And Adyen Cse Enc Python api for getting bin info and getting encrypted

Roldex Stark 8 Dec 30, 2022
Providing the solutions for high-frequency trading (HFT) strategies using data science approaches (Machine Learning) on Full Orderbook Tick Data.

Modeling High-Frequency Limit Order Book Dynamics Using Machine Learning Framework to capture the dynamics of high-frequency limit order books. Overvi

Chang-Shu Chung 1.3k Jan 07, 2023
EMNLP 2020 - Summarizing Text on Any Aspects

Summarizing Text on Any Aspects This repo contains preliminary code of the following paper: Summarizing Text on Any Aspects: A Knowledge-Informed Weak

Bowen Tan 35 Nov 14, 2022
Zero-shot Learning by Generating Task-specific Adapters

Code for "Zero-shot Learning by Generating Task-specific Adapters" This is the repository containing code for "Zero-shot Learning by Generating Task-s

INK Lab @ USC 11 Dec 17, 2021
ECCV2020 paper: Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code and Data.

This repo contains some of the codes for the following paper Fashion Captioning: Towards Generating Accurate Descriptions with Semantic Rewards. Code

Xuewen Yang 56 Dec 08, 2022
This Repostory contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.

This Repostory contains the pretrained DTLN-aec model for real-time acoustic echo cancellation.

Nils L. Westhausen 182 Jan 07, 2023
PyTorch implementation of SmoothGrad: removing noise by adding noise.

SmoothGrad implementation in PyTorch PyTorch implementation of SmoothGrad: removing noise by adding noise. Vanilla Gradients SmoothGrad Guided backpro

SSKH 143 Jan 05, 2023
Anomaly detection analysis and labeling tool, specifically for multiple time series (one time series per category)

taganomaly Anomaly detection labeling tool, specifically for multiple time series (one time series per category). Taganomaly is a tool for creating la

Microsoft 272 Dec 17, 2022
Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022)

Blockwise Sequential Model Learning Code for 'Blockwise Sequential Model Learning for Partially Observable Reinforcement Learning' (AAAI 2022) For ins

2 Jun 17, 2022
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations

TopClus The source code used for Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations, published in WWW 2022. Requ

Yu Meng 63 Dec 18, 2022
FPGA: Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification

FPGA & FreeNet Fast Patch-Free Global Learning Framework for Fully End-to-End Hyperspectral Image Classification by Zhuo Zheng, Yanfei Zhong, Ailong M

Zhuo Zheng 92 Jan 03, 2023
Banglore House Prediction Using Flask Server (Python)

Banglore House Prediction Using Flask Server (Python) 🌐 Links 🌐 📂 Repo In this repository, I've implemented a Machine Learning-based Bangalore Hous

Dhyan Shah 1 Jan 24, 2022
Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks.

The Lottery Ticket Hypothesis for Pre-trained BERT Networks Code for this paper The Lottery Ticket Hypothesis for Pre-trained BERT Networks. [NeurIPS

VITA 122 Dec 14, 2022
A boosting-based Multiple Instance Learning (MIL) package that includes MIL-Boost and MCIL-Boost

A boosting-based Multiple Instance Learning (MIL) package that includes MIL-Boost and MCIL-Boost

Jun-Yan Zhu 27 Aug 08, 2022
Pre-trained NFNets with 99% of the accuracy of the official paper

NFNet Pytorch Implementation This repo contains pretrained NFNet models F0-F6 with high ImageNet accuracy from the paper High-Performance Large-Scale

Benjamin Schmidt 133 Dec 09, 2022
Codebase for ECCV18 "The Sound of Pixels"

Sound-of-Pixels Codebase for ECCV18 "The Sound of Pixels". *This repository is under construction, but the core parts are already there. Environment T

Hang Zhao 318 Dec 20, 2022
Pytorch implementation of the paper: "A Unified Framework for Separating Superimposed Images", in CVPR 2020.

Deep Adversarial Decomposition PDF | Supp | 1min-DemoVideo Pytorch implementation of the paper: "Deep Adversarial Decomposition: A Unified Framework f

Zhengxia Zou 72 Dec 18, 2022
Code and hyperparameters for the paper "Generative Adversarial Networks"

Generative Adversarial Networks This repository contains the code and hyperparameters for the paper: "Generative Adversarial Networks." Ian J. Goodfel

Ian Goodfellow 3.5k Jan 08, 2023
PyTorch code to run synthetic experiments.

Code repository for Invariant Risk Minimization Source code for the paper: @article{InvariantRiskMinimization, title={Invariant Risk Minimization}

Facebook Research 345 Dec 12, 2022
Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning. CVPR 2018

Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning Tensorflow code and models for the paper: Large Scale Fine-Grained Categ

Yin Cui 187 Oct 01, 2022