3DETR: An End-to-End Transformer Model for 3D Object Detection

PyTorch implementation and models for 3DETR.

3DETR (3D DEtection TRansformer) is a simpler alternative to complex hand-crafted 3D detection pipelines. It does not rely on 3D backbones such as PointNet++ and uses few 3D-specific operators. 3DETR obtains comparable or better performance than 3D detection methods such as VoteNet. The encoder can also be used for other 3D tasks such as shape classification. More details are in the paper "An End-to-End Transformer Model for 3D Object Detection".

[website] [arXiv] [bibtex]

Code description. Our code is based on prior work such as DETR and VoteNet and we aim for simplicity in our implementation. We hope it can ease research in 3D detection.

[Figures: 3DETR approach; decoder detections]

Pretrained Models

We provide the pretrained model weights and the corresponding metrics on the val set (per class APs, Recalls). We provide a Python script utils/download_weights.py to easily download the weights/metrics files.

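| Arch | Dataset | Epochs | AP25 | AP50 | Model weights | Eval metrics |
| --- | --- | --- | --- | --- | --- | --- |
| 3DETR-m | SUN RGB-D | 1080 | 59.1 | 30.3 | weights | metrics |
| 3DETR | SUN RGB-D | 1080 | 58.0 | 30.3 | weights | metrics |
| 3DETR-m | ScanNet | 1080 | 65.0 | 47.0 | weights | metrics |
| 3DETR | ScanNet | 1080 | 62.1 | 37.9 | weights | metrics |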
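The AP25 and AP50 columns are read here in the usual way for 3D detection: average precision where a predicted box counts as a match if its 3D IoU with a ground-truth box is at least 0.25 or 0.5. The sketch below illustrates that threshold on axis-aligned boxes only. The function names and the box layout `(xmin, ymin, zmin, xmax, ymax, zmax)` are assumptions made for this example; the evaluation code in the repository may differ, for instance in how it handles oriented boxes.

```python
def box_volume(box):
    """Volume of an axis-aligned box (xmin, ymin, zmin, xmax, ymax, zmax)."""
    return (box[3] - box[0]) * (box[4] - box[1]) * (box[5] - box[2])


def box_iou_3d(box_a, box_b):
    """Intersection over union of two axis-aligned 3D boxes."""
    intersection = 1.0
    for axis in range(3):
        low = max(box_a[axis], box_b[axis])
        high = min(box_a[axis + 3], box_b[axis + 3])
        if high <= low:
            # No overlap along this axis means no overlap at all.
            return 0.0
        intersection *= high - low
    union = box_volume(box_a) + box_volume(box_b) - intersection
    return intersection / union


if __name__ == "__main__":
    prediction = (0.5, 0.0, 0.0, 1.5, 1.0, 1.0)
    ground_truth = (0.0, 0.0, 0.0, 1.0, 1.0, 1.0)
    iou = box_iou_3d(prediction, ground_truth)
    # This pair overlaps enough to match at the 0.25 threshold but not at 0.5.
    print(iou >= 0.25, iou >= 0.5)
```

A box shifted by half its width, as in the example, passes the looser threshold but fails the stricter one, which is why AP50 is consistently lower than AP25 in the table.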

Model Zoo

For convenience, we provide model weights for 3DETR trained for different numbers of epochs.

| Arch | Dataset | Epochs | AP25 | AP50 | Model weights | Eval metrics |
| --- | --- | --- | --- | --- | --- | --- |
| 3DETR-m | SUN RGB-D | 90 | 51.0 | 22.0 | weights | metrics |
| 3DETR-m | SUN RGB-D | 180 | 55.6 | 27.5 | weights | metrics |
| 3DETR-m | SUN RGB-D | 360 | 58.2 | 30.6 | weights | metrics |
| 3DETR-m | SUN RGB-D | 720 | 58.1 | 30.4 | weights | metrics |
| 3DETR | SUN RGB-D | 90 | 43.7 | 16.2 | weights | metrics |
| 3DETR | SUN RGB-D | 180 | 52.1 | 25.8 | weights | metrics |
| 3DETR | SUN RGB-D | 360 | 56.3 | 29.6 | weights | metrics |
| 3DETR | SUN RGB-D | 720 | 56.0 | 27.8 | weights | metrics |
| 3DETR-m | ScanNet | 90 | 47.1 | 19.5 | weights | metrics |
| 3DETR-m | ScanNet | 180 | 58.7 | 33.6 | weights | metrics |
| 3DETR-m | ScanNet | 360 | 62.4 | 37.7 | weights | metrics |
| 3DETR-m | ScanNet | 720 | 63.7 | 44.5 | weights | metrics |
| 3DETR | ScanNet | 90 | 42.8 | 15.3 | weights | metrics |
| 3DETR | ScanNet | 180 | 54.5 | 28.8 | weights | metrics |
| 3DETR | ScanNet | 360 | 59.0 | 35.4 | weights | metrics |
| 3DETR | ScanNet | 720 | 61.1 | 40.2 | weights | metrics |

Running 3DETR

Installation

Our code is tested with PyTorch 1.9.0, CUDA 10.2 and Python 3.6. It may work with other versions.

You will need to install pointnet2 layers by running

cd third_party/pointnet2 && python setup.py install

You will also need the following Python dependencies (install them with either conda or pip):

matplotlib
opencv-python
plyfile
'trimesh>=2.35.39,<2.35.40'
'networkx>=2.2,<2.3'
scipy
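Before going further, it can help to confirm that the dependencies above are visible to the interpreter. This is a minimal sketch, not part of the repository: the helper `missing_modules` and the list `REQUIRED_MODULES` are names invented for this example, and the check only tests that each module can be found, not that its version satisfies the pins above.

```python
import importlib.util

# Import names for the packages listed above; opencv-python is imported as cv2.
REQUIRED_MODULES = ["matplotlib", "cv2", "plyfile", "trimesh", "networkx", "scipy"]


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing:", missing if missing else "none")
```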

Some users have experienced issues using CUDA 11 or higher. Please try using CUDA 10.2 if you run into CUDA issues.
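When diagnosing a CUDA problem, a useful first step is to see which CUDA version the installed PyTorch build targets. The sketch below uses only standard PyTorch attributes (`torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()`); the function name `describe_torch_env` is an assumption made for this example and does not exist in the repository.

```python
import importlib.util


def describe_torch_env():
    """Return PyTorch and CUDA build information, or None if torch is not installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch": torch.__version__,
        # CUDA version the PyTorch binary was built against (None for CPU-only builds).
        "cuda_build": torch.version.cuda,
        # Whether a usable GPU and driver are visible at runtime.
        "cuda_available": torch.cuda.is_available(),
    }


if __name__ == "__main__":
    print(describe_torch_env())
```

If `cuda_build` reports 11 or higher and the pointnet2 layers fail to build or run, that matches the situation described above.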

Optionally, you can install a Cythonized implementation of gIOU for faster training.

conda install cython
cd utils && python cython_compile.py build_ext --inplace

Benchmarking

Dataset preparation

We follow the VoteNet codebase for preprocessing our data. The instructions for preprocessing SUN RGB-D are [here] and ScanNet are [here].

You can edit the dataset paths in datasets/sunrgbd.py and datasets/scannet.py, or specify them at runtime.

Testing

Once you have the datasets prepared, you can test pretrained models as follows:

python main.py --dataset_name <dataset_name> --nqueries <number of queries> --test_ckpt <path_to_checkpoint> --test_only [--enc_type masked]

We use 128 queries for the SUN RGB-D dataset and 256 queries for the ScanNet dataset. You will need to add the flag --enc_type masked when testing the 3DETR-m checkpoints. Please note that the testing process is stochastic (due to randomness in point cloud sampling and sampling the queries) and so results can vary within 1% AP25 across runs. This stochastic nature of the inference process is also common for methods such as VoteNet.

If you have not edited the dataset paths for the files in the datasets folder, you can pass the path to the datasets using the --dataset_root_dir flag.
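The flag combinations described above can be assembled programmatically, which avoids mixing up query counts or forgetting `--enc_type masked` for 3DETR-m checkpoints. This is a sketch under stated assumptions: `build_test_command` and `NQUERIES` are hypothetical names, and the dataset keys `sunrgbd` and `scannet` are guessed from the dataset file names, so check main.py for the values that `--dataset_name` actually accepts.

```python
# Query counts stated above: 128 for SUN RGB-D, 256 for ScanNet.
# The keys are assumed values for --dataset_name (taken from the file names
# datasets/sunrgbd.py and datasets/scannet.py), not confirmed by this README.
NQUERIES = {"sunrgbd": 128, "scannet": 256}


def build_test_command(dataset_name, ckpt_path, masked=False, dataset_root_dir=None):
    """Build the argument list for testing a pretrained checkpoint with main.py."""
    cmd = [
        "python", "main.py",
        "--dataset_name", dataset_name,
        "--nqueries", str(NQUERIES[dataset_name]),
        "--test_ckpt", ckpt_path,
        "--test_only",
    ]
    if masked:
        # Required when testing 3DETR-m checkpoints.
        cmd += ["--enc_type", "masked"]
    if dataset_root_dir is not None:
        # Only needed if the paths in the datasets folder were not edited.
        cmd += ["--dataset_root_dir", dataset_root_dir]
    return cmd


if __name__ == "__main__":
    # "checkpoint.pth" is a placeholder path.
    print(" ".join(build_test_command("scannet", "checkpoint.pth", masked=True)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.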

Training

The model can be trained simply by running main.py.

python main.py --dataset_name <dataset_name> --checkpoint_dir <path to store outputs>

To reproduce the results in the paper, we provide the arguments in the scripts folder. A variation of about 1% AP25 across different training runs can be expected.

You can quickly verify your installation by training a 3DETR model for 90 epochs on ScanNet following the file scripts/scannet_quick.sh and comparing the result to the pretrained checkpoint from the Model Zoo.
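The comparison can be made explicit with a small tolerance check. The sketch below assumes that the "1% AP25" figure means one absolute AP point, which is an interpretation rather than something stated here. The reference values are the 90-epoch ScanNet rows of the Model Zoo table; the function name `within_tolerance` and the measured value in the example are hypothetical.

```python
# 90-epoch ScanNet AP25 values from the Model Zoo table above.
REFERENCE_AP25_SCANNET_90 = {"3DETR-m": 47.1, "3DETR": 42.8}


def within_tolerance(measured_ap25, reference_ap25, tol=1.0):
    """Return True if the measured AP25 is within `tol` points of the reference."""
    return abs(measured_ap25 - reference_ap25) <= tol


if __name__ == "__main__":
    measured = 42.1  # hypothetical result of a local run, for illustration only
    print(within_tolerance(measured, REFERENCE_AP25_SCANNET_90["3DETR"]))
```

Use the reference row that matches the architecture trained by the script. A result outside the tolerance is a reason to recheck the installation and data preprocessing, not proof of an error.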

License

The majority of 3DETR is licensed under the Apache 2.0 license as found in the LICENSE file. However, portions of the project are available under separate license terms: licensing information for pointnet2 is available at https://github.com/erikwijmans/Pointnet2_PyTorch/blob/master/UNLICENSE

Contributing

We welcome your pull requests! Please see CONTRIBUTING and CODE_OF_CONDUCT for more info.

Citation

If you find this repository useful, please consider starring us and citing

@inproceedings{misra2021-3detr,
    title={{An End-to-End Transformer Model for 3D Object Detection}},
    author={Misra, Ishan and Girdhar, Rohit and Joulin, Armand},
    booktitle={{ICCV}},
    year={2021},
}