Official PyTorch Code of GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection (CVPR 2021)

Overview

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection, CVPR 2021

Abhinav Kumar, Garrick Brazil, Xiaoming Liu

[project], [supp], [slides], [1min_talk], demo, arxiv

This code is based on Kinematic-3D, such that the setup/organization is very similar. A few of the implementations, such as classical NMS, are based on Caffe.

References

Please cite the following paper if you find this repository useful:

@inproceedings{kumar2021groomed,
  title={{GrooMeD-NMS}: Grouped Mathematically Differentiable NMS for Monocular {$3$D} Object Detection},
  author={Kumar, Abhinav and Brazil, Garrick and Liu, Xiaoming},
  booktitle={IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021}
}

Setup

  • Requirements

    1. Python 3.6
    2. Pytorch 0.4.1
    3. Torchvision 0.2.1
    4. Cuda 8.0
    5. Ubuntu 18.04/Debian 8.9

    This is tested with NVIDIA 1080 Ti GPU. Other platforms have not been tested. Unless otherwise stated, the below scripts and instructions assume the working directory is the project root.

    Clone the repo first:

    git clone https://github.com/abhi1kumar/groomed_nms.git
  • Cuda & Python

    Install some basic packages:

    sudo apt-get install libopenblas-dev libboost-dev libboost-all-dev git
    sudo apt install gfortran
    
    # We need to compile with older version of gcc and g++
    sudo apt install gcc-5 g++-5
    sudo ln -f /usr/bin/gcc-5 /usr/local/cuda-8.0/bin/gcc
    sudo ln -s /usr/bin/g++-5 /usr/local/cuda-8.0/bin/g++

    Next, install conda and then install the required packages:

    wget https://repo.anaconda.com/archive/Anaconda3-2020.02-Linux-x86_64.sh
    bash Anaconda3-2020.02-Linux-x86_64.sh
    source ~/.bashrc
    conda list
    conda create --name py36 --file dependencies/conda.txt
    conda activate py36
  • KITTI Data

    Download the following images of the full KITTI 3D Object detection dataset:

    Then place a soft-link (or the actual data) in data/kitti:

     ln -s /path/to/kitti data/kitti

    The directory structure should look like this:

    ./groomed_nms
    |--- cuda_env
    |--- data
    |      |---kitti
    |            |---training
    |            |        |---calib
    |            |        |---image_2
    |            |        |---label_2
    |            |
    |            |---testing
    |                     |---calib
    |                     |---image_2
    |
    |--- dependencies
    |--- lib
    |--- models
    |--- scripts

    Then, use the following scripts to extract the data splits, which use soft-links to the above directory for efficient storage:

    python data/kitti_split1/setup_split.py
    python data/kitti_split2/setup_split.py

    Next, build the KITTI devkit eval:

     sh data/kitti_split1/devkit/cpp/build.sh
  • Classical NMS

    Lastly, build the classical NMS modules:

    cd lib/nms
    make
    cd ../..

Training

Training is carried out in two stages - a warmup and a full. Review the configurations in scripts/config for details.

chmod +x scripts_training.sh
./scripts_training.sh

If your training is accidentally stopped, you can resume at a checkpoint based on the snapshot with the restore flag. For example, to resume training starting at iteration 10k, use the following command:

source dependencies/cuda_8.0_env
CUDA_VISIBLE_DEVICES=0 python -u scripts/train_rpn_3d.py --config=groumd_nms --restore=10000

Testing

We provide logs/models/predictions for the main experiments on KITTI Val 1/Val 2/Test data splits available to download here.

Make an output folder in the project directory:

mkdir output

Place different models in the output folder as follows:

./groomed_nms
|--- output
|      |---groumd_nms
|      |
|      |---groumd_nms_split2
|      |
|      |---groumd_nms_full_train_2
|
| ...

To test, run the file as below:

chmod +x scripts_evaluation.sh
./scripts_evaluation.sh

Contact

For questions, feel free to post here or drop an email to this address- [email protected]

You might also like...
Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21

MonoFlex Released code for Objects are Different: Flexible Monocular 3D Object Detection, CVPR21. Work in progress. Installation This repo is tested w

[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
[CVPR 2022 Oral] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

EPro-PnP EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation In CVPR 2022 (Oral). [paper] Hanshen

Code for
Code for "NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video", CVPR 2021 oral

NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video Project Page | Paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Mon

Code for
Code for "LASR: Learning Articulated Shape Reconstruction from a Monocular Video". CVPR 2021.

LASR Installation Build with conda conda env create -f lasr.yml conda activate lasr # install softras cd third_party/softras; python setup.py install;

The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection .

GCoNet The official repo of the CVPR 2021 paper Group Collaborative Learning for Co-Salient Object Detection . Trained model Download final_gconet.pth

Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

Categorical Depth Distribution Network for Monocular 3D Object Detection
Categorical Depth Distribution Network for Monocular 3D Object Detection

CaDDN CaDDN is a monocular-based 3D object detection method. This repository is based off of [OpenPCDet]. Categorical Depth Distribution Network for M

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection This repository contains implementation of the

Progressive Coordinate Transforms for Monocular 3D Object Detection
Progressive Coordinate Transforms for Monocular 3D Object Detection

Progressive Coordinate Transforms for Monocular 3D Object Detection This repository is the official implementation of PCT. Introduction In this paper,

Comments
  • Is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    Is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    Hi~thanks for your great work. However, I have some confusion in understanding the motivation of this algorithm. If we want to achieve the consistency of training and test, we can simply penalize the highest-confidence proposal in the training pipeline, which seems to achieve similar result.So, is there any difference between groom-nms and penalize highest-confidence proposal using gt directly?

    opened by kaixinbear 3
  • Problem in test

    Problem in test

    Hi, this is an exciting work.And i have a question when I try to test with the pre-train model. I can't find "Kinematic3D-Release/val1_kinematic/model_final".

    opened by chenH20000109 1
Releases(v0.1)
Owner
Abhinav Kumar
PhD Student, Computer Vision and Deep Learning, MSU
Abhinav Kumar
Official DGL implementation of "Rethinking High-order Graph Convolutional Networks"

SE Aggregation This is the implementation for Rethinking High-order Graph Convolutional Networks. Here we show the codes for citation networks as an e

Tianqi Zhang (张天启) 32 Jul 19, 2022
FCN (Fully Convolutional Network) is deep fully convolutional neural network architecture for semantic pixel-wise segmentation

FCN_via_Keras FCN FCN (Fully Convolutional Network) is deep fully convolutional neural network architecture for semantic pixel-wise segmentation. This

Kento Watanabe 48 Aug 30, 2022
Code samples for my book "Neural Networks and Deep Learning"

Code samples for "Neural Networks and Deep Learning" This repository contains code samples for my book on "Neural Networks and Deep Learning". The cod

Michael Nielsen 13.9k Dec 26, 2022
A DeepStack custom model for detecting common objects in dark/night images and videos.

DeepStack_ExDark This repository provides a custom DeepStack model that has been trained and can be used for creating a new object detection API for d

MOSES OLAFENWA 98 Dec 24, 2022
Dynamic vae - Dynamic VAE algorithm is used for anomaly detection of battery data

Dynamic VAE frame Automatic feature extraction can be achieved by probability di

10 Oct 07, 2022
Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Official implementation of the RAVE model: a Realtime Audio Variational autoEncoder

Antoine Caillon 589 Jan 02, 2023
The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

The trained model and denoising example for paper : Cardiopulmonary Auscultation Enhancement with a Two-Stage Noise Cancellation Approach

ycj_project 1 Jan 18, 2022
BMW TechOffice MUNICH 148 Dec 21, 2022
Codes for the compilation and visualization examples to the HIF vegetation dataset

High-impedance vegetation fault dataset This repository contains the codes that compile the "Vegetation Conduction Ignition Test Report" data, which a

1 Dec 12, 2021
Attention-guided gan for synthesizing IR images

SI-AGAN Attention-guided gan for synthesizing IR images This repository contains the Tensorflow code for "Pedestrian Gender Recognition by Style Trans

1 Oct 25, 2021
Collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

The repository collects many various multi-modal transformer architectures, including image transformer, video transformer, image-language transformer, video-language transformer and related datasets

Jun Chen 139 Dec 21, 2022
This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

OpenAI 3k Dec 26, 2022
Generate images from texts. In Russian. In PaddlePaddle

ruDALL-E PaddlePaddle ruDALL-E in PaddlePaddle. Install: pip install rudalle_paddle==0.0.1rc1 Run with free v100 on AI Studio. Original Pytorch versi

AgentMaker 20 Oct 18, 2022
PyTorch implementation of paper “Unbiased Scene Graph Generation from Biased Training”

A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiased

Kaihua Tang 824 Jan 03, 2023
Chainer Implementation of Fully Convolutional Networks. (Training code to reproduce the original result is available.)

fcn - Fully Convolutional Networks Chainer implementation of Fully Convolutional Networks. Installation pip install fcn Inference Inference is done as

Kentaro Wada 218 Oct 27, 2022
You Only Look Once for Panopitic Driving Perception

You Only 👀 Once for Panoptic 🚗 Perception You Only Look at Once for Panoptic driving Perception by Dong Wu, Manwen Liao, Weitian Zhang, Xinggang Wan

Hust Visual Learning Team 1.4k Jan 04, 2023
Event sourced bank - A wide-and-shallow example using the Python event sourcing library

Event Sourced Bank A "wide but shallow" example of using the Python event sourci

3 Mar 09, 2022
Semi-supevised Semantic Segmentation with High- and Low-level Consistency

Semi-supevised Semantic Segmentation with High- and Low-level Consistency This Pytorch repository contains the code for our work Semi-supervised Seman

123 Dec 30, 2022
Playable Video Generation

Playable Video Generation Playable Video Generation Willi Menapace, Stéphane Lathuilière, Sergey Tulyakov, Aliaksandr Siarohin, Elisa Ricci Paper: ArX

Willi Menapace 136 Dec 31, 2022
Github Traffic Insights as Prometheus metrics.

github-traffic Github Traffic collects your repository's traffic data and exposes it as Prometheus metrics. Grafana dashboard that displays the metric

Grafana Labs 34 Oct 27, 2022