Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Last update: Dec 22, 2022

Overview

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds

Björn Michele¹⁾, Alexandre Boulch¹⁾, Gilles Puy¹⁾, Maxime Bucher¹⁾ and Renaud Marlet¹⁾²⁾

¹⁾ Valeo.ai ²⁾ LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France

Accepted at 3DV 2021
arXiv: Paper and Supp.
Poster or Presentation

Abstract: While there has been a number of studies on Zero-Shot Learning (ZSL) for 2D images, its application to 3D data is still recent and scarce, with just a few methods limited to classification. We present the first generative approach for both ZSL and Generalized ZSL (GZSL) on 3D data, that can handle both classification and, for the first time, semantic segmentation. We show that it reaches or outperforms the state of the art on ModelNet40 classification for both inductive ZSL and inductive GZSL. For semantic segmentation, we created three benchmarks for evaluating this new ZSL task, using S3DIS, ScanNet and SemanticKITTI. Our experiments show that our method outperforms strong baselines, which we additionally propose for this task.

If you want to cite this work:

@inproceedings{michele2021generative,
  title={Generative Zero-Shot Learning for Semantic Segmentation of {3D} Point Clouds},
  author={Michele, Bj{\"o}rn and Boulch, Alexandre and Puy, Gilles and Bucher, Maxime and Marlet, Renaud},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2021}
}

Code

We provide in this repository the code and the pretrained models for the semantic segmentation tasks on SemanticKITTI and ScanNet.

To-Do:

We will add more experiments in the future (you can "watch" the repo to stay updated).

Code Semantic Segmentation

Installation

Dependencies: Please see requirements.txt for all required libraries. Tested with PyTorch 1.6.0 and 1.7.1 (both with CUDA 10.1). Because torch-geometric is needed, PyTorch >= 1.4.0 is required.
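The version requirement above can be checked before installing the backbones. The following is a minimal sketch, not part of the repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and only the 1.4.0 minimum comes from the text above.

```python
import re


def parse_version(version: str) -> tuple:
    """Turn a version string such as '1.7.1+cu101' into (1, 7, 1)."""
    # Keep only the leading dotted-number part; local suffixes like '+cu101' are ignored.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    # A missing patch number (e.g. '1.6') is treated as 0.
    return tuple(int(part) if part is not None else 0 for part in match.groups())


def meets_minimum(version: str, minimum: tuple = (1, 4, 0)) -> bool:
    """Return True if `version` is at least `minimum` (default: 1.4.0)."""
    return parse_version(version) >= minimum
```

With PyTorch installed, calling `meets_minimum(torch.__version__)` after `import torch` reports whether the installed version satisfies the requirement.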

Clone this repository.
Download and/or install the backbones. ConvPoint is also required for our adaptation of FKAConv. More information: ConvPoint, FKAConv, KP-Conv.
- For ConvPoint:
```
cd 3DGenZ/genz3d/convpoint/convpoint/knn
python3 setup.py install --home="."
```
- For FKAConv:
```
cd 3DGenZ/genz3d/fkaconv
pip install -ve . 
```
- For KPConv have a look at: INSTALL.md
Download the datasets.
- For an out-of-the-box start, we recommend the following folder structure.
```
~/3DGenZ
~/data/scannet/
~/data/semantic_kitti/
```
Download the semantic word embeddings and the pretrained backbones.
- Place the semantic word embeddings in
```
3DGenZ/genz3d/word_representations/
```
- For ScanNet (SN), place the pre-trained backbone model and the config file in
```
3DGenZ/genz3d/fkaconv/examples/scannet/FKAConv_scannet_ZSL4
```
The complete ZSL-trained model ckpt is placed in (create the folder if necessary)
```
3DGenZ/genz3d/seg/run/scannet/
```
- For SemanticKITTI (SK), place the pre-trained backbone model (the "Log-..." folder) in
```
3DGenZ/genz3d/kpconv/results
```
And the complete ZSL-trained model ckpt is placed in
```
3DGenZ/genz3d/seg/run/sk
```
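Once everything is downloaded, the layout can be verified with a short script. This is a minimal sketch, not part of the repository: the paths are copied from the steps above and assumed to sit under one common root (the home directory in the recommended layout), and the names `EXPECTED_DIRS` and `missing_dirs` are hypothetical.

```python
from pathlib import Path

# Directories named in the installation steps, relative to a common root.
EXPECTED_DIRS = [
    "data/scannet",
    "data/semantic_kitti",
    "3DGenZ/genz3d/word_representations",
    "3DGenZ/genz3d/fkaconv/examples/scannet/FKAConv_scannet_ZSL4",
    "3DGenZ/genz3d/seg/run/scannet",
    "3DGenZ/genz3d/kpconv/results",
    "3DGenZ/genz3d/seg/run/sk",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root).expanduser()
    return [rel for rel in expected if not (root / rel).is_dir()]


if __name__ == "__main__":
    # Report anything that still has to be created or downloaded.
    for rel in missing_dirs("~"):
        print("missing:", rel)
```

The script only checks that the directories exist; it does not validate the downloaded files themselves.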

Run training and evaluation

Training (classifier layer): In 3DGenZ/genz3d/seg/ you will find, for each dataset, a folder with scripts to run the generator and classifier training (see: SN, SK).
- Alternatively, you can use our pretrained models.
Evaluation: This is done with the evaluation functions of the backbones (see: SN_eval, KP-Conv_eval).

Backbones

We used different backbones for the datasets and rely heavily on their codebases. To adapt them to the ZSL setting, we changed the backbone training so that no point cloud crops containing unseen classes are shown (if there is a single unseen class in a crop, that crop is not shown).

ConvPoint [1] for the S3DIS dataset (and also partly used for the ScanNet dataset).
FKAConv [2] for the ScanNet dataset.
KPConv [3] for the SemanticKITTI dataset.
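The crop filtering used to adapt the backbones can be illustrated as follows. This is a minimal sketch under stated assumptions, not the repository's implementation: the function names `crop_is_usable` and `filter_crops` and the `(points, labels)` crop representation are hypothetical, and each backbone handles this in its own data-loading code.

```python
def crop_is_usable(point_labels, unseen_classes):
    """Return True only if no point in the crop carries an unseen-class label."""
    unseen = set(unseen_classes)
    return not any(label in unseen for label in point_labels)


def filter_crops(crops, unseen_classes):
    """Keep only crops without unseen-class points.

    `crops` is an iterable of (points, labels) pairs (hypothetical format).
    """
    return [(pts, lbls) for pts, lbls in crops if crop_is_usable(lbls, unseen_classes)]


if __name__ == "__main__":
    crops = [
        ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [0, 1]),  # only seen classes
        ([(0.0, 1.0, 0.0), (1.0, 1.0, 0.0)], [0, 7]),  # one point of unseen class 7
    ]
    kept = filter_crops(crops, unseen_classes={7})
    print(len(kept))  # prints 1: the second crop is dropped
```

A single point of an unseen class is enough to drop the whole crop, which keeps the backbone from ever seeing unseen-class geometry during training.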

Datasets

For semantic segmentation, we ran experiments on three datasets.

SemanticKITTI [4][5].
S3DIS [6].
ScanNet [7].

Acknowledgements

For the generator training, we use parts of the ZS3 codebase.
For the backbones we use the code of ConvPoint, FKAConv and KPConv.

References

[1] Boulch, A. (2020). ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics, 88, 24-34.
[2] Boulch, A., Puy, G., & Marlet, R. (2020). FKAConv: Feature-kernel alignment for point cloud convolution. In Proceedings of the Asian Conference on Computer Vision.
[3] Thomas, H., Qi, C. R., Deschaud, J. E., Marcotegui, B., Goulette, F., & Guibas, L. J. (2019). KPConv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6411-6420).
[4] Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., & Gall, J. (2019). SemanticKITTI: A dataset for semantic scene understanding of LiDAR sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9297-9307).
[5] Geiger, A., Lenz, P., & Urtasun, R. (2012). Are we ready for autonomous driving? The KITTI vision benchmark suite. In 2012 IEEE Conference on Computer Vision and Pattern Recognition (pp. 3354-3361). IEEE.
[6] Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3D semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1534-1543).
[7] Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Nießner, M. (2017). ScanNet: Richly-annotated 3D reconstructions of indoor scenes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 5828-5839).

Updates

9.12.2021 Initial Code release

Licence

3DGenZ is released under the Apache 2.0 license.

The folder 3DGenZ/genz3d/kpconv includes large parts of code taken from KP-Conv and is therefore distributed under the MIT Licence. See the LICENSE for this folder.

The folder 3DGenZ/genz3d/seg/utils also includes files taken from https://github.com/jfzhang95/pytorch-deeplab-xception and is therefore also distributed under the MIT License. See the LICENSE for these files.

Owner

valeo.ai
