Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks

Related tags

Deep LearningSSTNet
Overview

SSTNet

PWC PWC

overview Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks(ICCV2021) by Zhihao Liang, Zhihao Li, Songcen Xu, Mingkui Tan, Kui Jia*. (*) Corresponding author. [arxiv]

Introduction

Instance segmentation in 3D scenes is fundamental in many applications of scene understanding. It is yet challenging due to the compound factors of data irregularity and uncertainty in the numbers of instances. State-of-the-art methods largely rely on a general pipeline that first learns point-wise features discriminative at semantic and instance levels, followed by a separate step of point grouping for proposing object instances. While promising, they have the shortcomings that (1) the second step is not supervised by the main objective of instance segmentation, and (2) their point-wise feature learning and grouping are less effective to deal with data irregularities, possibly resulting in fragmented segmentations. To address these issues, we propose in this work an end-to-end solution of Semantic Superpoint Tree Network (SSTNet) for proposing object instances from scene points. Key in SSTNet is an intermediate, semantic superpoint tree (SST), which is constructed based on the learned semantic features of superpoints, and which will be traversed and split at intermediate tree nodes for proposals of object instances. We also design in SSTNet a refinement module, termed CliqueNet, to prune superpoints that may be wrongly grouped into instance proposals.

Installation

Requirements

  • Python 3.8.5
  • Pytorch 1.7.1
  • torchvision 0.8.2
  • CUDA 11.1

then install the requirements:

pip install -r requirements.txt

SparseConv

For the SparseConv, please refer PointGroup's spconv to install.

Extension

This project is based on our Gorilla-Lab deep learning toolkit - gorilla-core and 3D toolkit gorilla-3d.

For gorilla-core, you can install it by running:

pip install gorilla-core==0.2.7.6

or building from source(recommend)

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-core
cd gorilla-core
python setup.py install(develop)

For gorilla-3d, you should install it by building from source:

git clone https://github.com/Gorilla-Lab-SCUT/gorilla-3d
cd gorilla-3d
python setup.py develop

Tip: for high-version torch, the BuildExtension may fail by using ninja to build the compile system. If you meet this problem, you can change the BuildExtension in cmdclass={"build_ext": BuildExtension} as cmdclass={"build_ext": BuildExtension}.with_options(use_ninja=False)

Otherwise, this project also need other extension, we use the pointgroup_ops to realize voxelization and use the segmentator to generate superpoints for scannet scene. we use the htree to construct the Semantic Superpoint Tree and the hierarchical node-inheriting relations is realized based on the modified cluster.hierarchy.linkage function from scipy.

  • For pointgroup_ops, we modified the package from PointGroup to let its function calls get rid of the dependence on absolute paths. You can install it by running:
    conda install -c bioconda google-sparsehash 
    cd $PROJECT_ROOT$
    cd sstnet/lib/pointgroup_ops
    python setup.py develop
    Then, you can call the function like:
    import pointgroup_ops
    pointgroup_ops.voxelization
    >>> <function Voxelization.apply>
  • For htree, it can be seen as a supplement to the treelib python package, and I abstract the SST through both of them. You can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/htree
    python setup.py install

    Tip: The interaction between this piece of code and treelib is a bit messy. I lack time to organize it, which may cause some difficulties for someone in understanding. I am sorry for this. At the same time, I also welcome people to improve it.

  • For cluster, it is originally a sub-module in scipy, the SST construction requires the cluster.hierarchy.linkage to be implemented. However, the origin implementation do not consider the sizes of clustering nodes (each superpoint contains different number of points). To this end, we modify this function and let it support the property mentioned above. So, for used, you can install it by running:
    cd $PROJECT_ROOT$
    cd sstnet/lib/cluster
    python setup.py install
  • For segmentator, please refer here to install. (We wrap the segmentator in ScanNet)

Data Preparation

Please refer to the README.md in data/scannetv2 to realize data preparation.

Training

CUDA_VISIBLE_DEVICES=0 python train.py --config config/default.yaml

You can start a tensorboard session by

tensorboard --logdir=./log --port=6666

Tip: For the directory of logging, please refer the implementation of function gorilla.collect_logger.

Inference and Evaluation

CUDA_VISIBLE_DEVICES=0 python test.py --config config/default.yaml --pretrain pretrain.pth --eval
  • --split is the evaluation split of dataset.
  • --save is the action to save instance segmentation results.
  • --eval is the action to evaluate the segmentation results.
  • --semantic is the action to evaluate semantic segmentation only (work on the --eval mode).
  • --log-file is to define the logging file to save evaluation result (default please to refer the gorilla.collect_logger).
  • --visual is the action to save visualization of instance segmentation. (It will be mentioned in the next partion.)

Results on ScanNet Benchmark

Rank 1st on the ScanNet benchmark benchmark

Pretrained

We provide a pretrained model trained on ScanNet(v2) dataset. [Google Drive] [Baidu Cloud] (提取码:f3az) Its performance on ScanNet(v2) validation set is 49.4/64.9/74.4 in terms of mAP/mAP50/mAP25.

Acknowledgement

This repo is built upon several repos, e.g., PointGroup, spconv and ScanNet.

Contact

If you have any questions or suggestions about this repo or paper, please feel free to contact me in issue or email ([email protected]).

TODO

  • Distributed training(not verification)
  • Batch inference
  • Multi-processing for getting superpoints

Citation

If you find this work useful in your research, please cite:

@misc{liang2021instance,
      title={Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks}, 
      author={Zhihao Liang and Zhihao Li and Songcen Xu and Mingkui Tan and Kui Jia},
      year={2021},
      eprint={2108.07478},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
Research lab focusing on CV, ML, and AI
ICLR 2021, Fair Mixup: Fairness via Interpolation

Fair Mixup: Fairness via Interpolation Training classifiers under fairness constraints such as group fairness, regularizes the disparities of predicti

Ching-Yao Chuang 49 Nov 22, 2022
Unofficial PyTorch Implementation for HifiFace (https://arxiv.org/abs/2106.09965)

HifiFace — Unofficial Pytorch Implementation Image source: HifiFace: 3D Shape and Semantic Prior Guided High Fidelity Face Swapping (figure 1, pg. 1)

MINDs Lab 218 Jan 04, 2023
Robotics with GPU computing

Robotics with GPU computing Cupoch is a library that implements rapid 3D data processing for robotics using CUDA. The goal of this library is to imple

Shirokuma 625 Jan 07, 2023
Pytorch implementation for Semantic Segmentation/Scene Parsing on MIT ADE20K dataset

Semantic Segmentation on MIT ADE20K dataset in PyTorch This is a PyTorch implementation of semantic segmentation models on MIT ADE20K scene parsing da

MIT CSAIL Computer Vision 4.5k Jan 08, 2023
Code and experiments for "Deep Neural Networks for Rank Consistent Ordinal Regression based on Conditional Probabilities"

corn-ordinal-neuralnet This repository contains the orginal model code and experiment logs for the paper "Deep Neural Networks for Rank Consistent Ord

Raschka Research Group 14 Dec 27, 2022
PEPit is a package enabling computer-assisted worst-case analyses of first-order optimization methods.

PEPit: Performance Estimation in Python This open source Python library provides a generic way to use PEP framework in Python. Performance estimation

Baptiste 53 Nov 16, 2022
Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees"

Companion code for "Bayesian logistic regression for online recalibration and revision of risk prediction models with performance guarantees" Installa

0 Oct 13, 2021
Full Resolution Residual Networks for Semantic Image Segmentation

Full-Resolution Residual Networks (FRRN) This repository contains code to train and qualitatively evaluate Full-Resolution Residual Networks (FRRNs) a

Toby Pohlen 274 Oct 27, 2022
RDA: Robust Domain Adaptation via Fourier Adversarial Attacking

RDA: Robust Domain Adaptation via Fourier Adversarial Attacking Updates 08/2021: check out our domain adaptation for video segmentation paper Domain A

17 Nov 30, 2022
PyTorch implementation of the paper: Long-tail Learning via Logit Adjustment

logit-adj-pytorch PyTorch implementation of the paper: Long-tail Learning via Logit Adjustment This code implements the paper: Long-tail Learning via

Chamuditha Jayanga 53 Dec 23, 2022
Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation

Neural Reprojection Error: Merging Feature Learning and Camera Pose Estimation This is the official repository for our paper Neural Reprojection Error

Hugo Germain 78 Dec 01, 2022
A super lightweight Lagrangian model for calculating millions of trajectories using ERA5 data

Easy-ERA5-Trck Easy-ERA5-Trck Galleries Install Usage Repository Structure Module Files Version iteration Easy-ERA5-Trck is a super lightweight Lagran

Zhenning Li 26 Nov 19, 2022
Continual Learning of Long Topic Sequences in Neural Information Retrieval

ContinualPassageRanking Repository for the paper "Continual Learning of Long Topic Sequences in Neural Information Retrieval". In this repository you

0 Apr 12, 2022
A real world application of a Recurrent Neural Network on a binary classification of time series data

What is this This is a real world application of a Recurrent Neural Network on a binary classification of time series data. This project includes data

Josep Maria Salvia Hornos 2 Jan 30, 2022
Official implementation for CVPR 2021 paper: Adaptive Class Suppression Loss for Long-Tail Object Detection

Adaptive Class Suppression Loss for Long-Tail Object Detection This repo is the official implementation for CVPR 2021 paper: Adaptive Class Suppressio

CASIA-IVA-Lab 67 Dec 04, 2022
Expand human face editing via Global Direction of StyleCLIP, especially to maintain similarity during editing.

Oh-My-Face This project is based on StyleCLIP, RIFE, and encoder4editing, which aims to expand human face editing via Global Direction of StyleCLIP, e

AiLin Huang 51 Nov 17, 2022
Miscellaneous and lightweight network tools

Network Tools Collection of miscellaneous and lightweight network tools to simplify daily operations, administration, and troubleshooting of networks.

Nicholas Russo 22 Mar 22, 2022
A new test set for ImageNet

ImageNetV2 The ImageNetV2 dataset contains new test data for the ImageNet benchmark. This repository provides associated code for assembling and worki

186 Dec 18, 2022
A Structured Self-attentive Sentence Embedding

Structured Self-attentive sentence embeddings Implementation for the paper A Structured Self-Attentive Sentence Embedding, which was published in ICLR

Kaushal Shetty 488 Nov 28, 2022
Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation

Generalizing Gaze Estimation with Outlier-guided Collaborative Adaptation Our paper is accepted by ICCV2021. Picture: Overview of the proposed Plug-an

Yunfei Liu 32 Dec 10, 2022