Public repository of the 3DV 2021 paper "Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds"

Related tags

Deep Learning3DGenZ
Overview

Generative Zero-Shot Learning for Semantic Segmentation of 3D Point Clouds

Björn Michele1), Alexandre Boulch1), Gilles Puy1), Maxime Bucher1) and Renaud Marlet1)2)

1) Valeo.ai 2)LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, Franc

Accepted at 3DV 2021
Arxiv: Paper and Supp.
Poster or Presentation

Abstract: While there has been a number of studies on Zero-Shot Learning (ZSL) for 2D images, its application to 3D data is still recent and scarce, with just a few methods limited to classification. We present the first generative approach for both ZSL and Generalized ZSL (GZSL) on 3D data, that can handle both classification and, for the first time, semantic segmentation. We show that it reaches or outperforms the state of the art on ModelNet40 classification for both inductive ZSL and inductive GZSL. For semantic segmentation, we created three benchmarks for evaluating this new ZSL task, using S3DIS, ScanNet and SemanticKITTI. Our experiments show that our method outperforms strong baselines, which we additionally propose for this task.

If you want to cite this work:

@inproceedings{michele2021generative,
  title={Generative Zero-Shot Learning for Semantic Segmentation of {3D} Point Cloud},
  author={Michele, Bj{\"o}rn and Boulch, Alexandre and Puy, Gilles and Bucher, Maxime and Marlet, Renaud},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2021}

Code

We provide in this repository the code and the pretrained models for the semantic segmentation tasks on SemanticKITTI and ScanNet.

To-Do:

  • We will add more experiments in the future (You could "watch" the repo to stay updated).

Code Semantic Segmentation

Installation

Dependencies: Please see requirements.txt for all needed code libraries. Tested with: Pytorch 1.6.0 and 1.7.1 (both Cuda 10.1). As torch-geometric is needed Pytoch >= 1.4.0 is required.

  1. Clone this repository.

  2. Download and/or install the backbones (ConvPoint is also necessary for our adaption of FKAConv. More information: ConvPoint, FKAConv, KP-Conv).

    • For ConvPoint:
    cd 3DGenZ/genz3d/convpoint/convpoint/knn
    python3 setup.py install --home="."
    
    • For FKAConv:
    cd 3DGenZ/genz3d/fkaconv
    pip install -ve . 
    
  3. Download the datasets.

    • For an out of the box start we recommend the following folder structure.
    ~/3DGenZ
    ~/data/scannet/
    ~/data/semantic_kitti/
    
  4. Download the semantic word embeddings and the pretrained backbones.

    • Place the semantic word embeddings in
    3DGenZ/genz3d/word_representations/
    
    • For SN, the pre-trained backbone model and the config file, are placed in
    3DGenZ/genz3d/fkaconv/examples/scannet/FKAConv_scannet_ZSL4
    

    The complete ZSL-trained model cpkt is placed in (create the folder if necessary)

    3DGenZ/genz3d/seg/run/scannet/
    
    • For SK, the pre-trained backbone-model, the "Log-..." folder is placed in
    3DGenZ/genz3d/kpconv/results
    

    And the complete ZSL-trained model ckpt is placed in

    3DGenZ/genz3d/seg/run/sk
    

Run training and evalutation

  1. Training (Classifier layer): In 3DGenZ/genz3d/seg/ you find for each of the datasets a folder with scripts to run the generator and classificator training.(see: SN,SK)
    • Alternatively, you can use the pretrained models from us.
  2. Evalutation: Is done with the evaluation functions of the backbones. (see: SN_eval, KP-Conv_eval)

Backbones

For the datasets we used different backbones, for which we highly rely on their code basis. In order to adapt them to the ZSL setting we made the change that during the backbone training no crops of point clouds with unseen classes are shown (if there is a single unseen class

  • ConvPoint [1] for the S3DIS dataset (and also partly used for the ScanNet dataset).
  • FKAConv [2] for the ScanNet dataset.
  • KPConv [3] for the SemanticKITTI dataset.

Datasets

For semantic segmentation we did experiments on 3 datasets.

  • SemanticKITTI [4][5].
  • S3DIS [6].
  • ScanNet[7].

Acknowledgements

For the Generator Training we use parts of the code basis of ZS3.
For the backbones we use the code of ConvPoint, FKAConv and KPConv.

References

[1] Boulch, A. (2020). ConvPoint: Continuous convolutions for point cloud processing. Computers & Graphics, 88, 24-34.
[2] Boulch, A., Puy, G., & Marlet, R. (2020). FKAConv: Feature-kernel alignment for point cloud convolution. In Proceedings of the Asian Conference on Computer Vision.
[3] Thomas, H., Qi, C. R., Deschaud, J. E., Marcotegui, B., Goulette, F., & Guibas, L. J. (2019). Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 6411-6420).
[4] Behley, J., Garbade, M., Milioto, A., Quenzel, J., Behnke, S., Stachniss, C., & Gall, J. (2019). Semantickitti: A dataset for semantic scene understanding of lidar sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9297-9307).
[5] Geiger, A., Lenz, P., & Urtasun, R. (2012, June). Are we ready for autonomous driving? the kitti vision benchmark suite. In 2012 IEEE conference on computer vision and pattern recognition (pp. 3354-3361). IEEE.
[6] Armeni, I., Sener, O., Zamir, A. R., Jiang, H., Brilakis, I., Fischer, M., & Savarese, S. (2016). 3d semantic parsing of large-scale indoor spaces. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1534-1543).
[7] Dai, A., Chang, A. X., Savva, M., Halber, M., Funkhouser, T., & Nießner, M. (2017). Scannet: Richly-annotated 3d reconstructions of indoor scenes. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 5828-5839).

Updates

9.12.2021 Initial Code release

Licence

3DGenZ is released under the Apache 2.0 license.

The folder 3DGenZ/genz3d/kpconv includes large parts of code taken from KP-Conv and is therefore distributed under the MIT Licence. See the LICENSE for this folder.

The folder 3DGenZ/genz3d/seg/utils also includes files taken from https://github.com/jfzhang95/pytorch-deeplab-xception and is therefore also distributed under the MIT License. See the LICENSE for these files.

Owner
valeo.ai
We are an international team based in Paris, conducting AI research for Valeo automotive applications, in collaboration with world-class academics.
valeo.ai
The repo for the paper "I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection".

I3CL: Intra- and Inter-Instance Collaborative Learning for Arbitrary-shaped Scene Text Detection Updates | Introduction | Results | Usage | Citation |

33 Jan 05, 2023
A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥

Lightning-Hydra-Template A clean and scalable template to kickstart your deep learning project 🚀 ⚡ 🔥 Click on Use this template to initialize new re

Hyunsoo Cho 1 Dec 20, 2021
Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy" (ICLR 2022 Spotlight)

About Code release for Anomaly Transformer: Time Series Anomaly Detection with Association Discrepancy (ICLR 2022 Spotlight)

THUML @ Tsinghua University 221 Dec 31, 2022
Large scale PTM - PPI relation extraction

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard

1 Feb 25, 2022
PyTorch implementation of Value Iteration Networks (VIN): Clean, Simple and Modular. Visualization in Visdom.

VIN: Value Iteration Networks This is an implementation of Value Iteration Networks (VIN) in PyTorch to reproduce the results.(TensorFlow version) Key

Xingdong Zuo 215 Dec 07, 2022
Deep-Learning-Book-Chapter-Summaries - Attempting to make the Deep Learning Book easier to understand.

Deep-Learning-Book-Chapter-Summaries This repository provides a summary for each chapter of the Deep Learning book by Ian Goodfellow, Yoshua Bengio an

Aman Dalmia 1k Dec 27, 2022
TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction

TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction TSDF++ is a novel multi-object TSDF formulation that can encode mult

ETHZ ASL 130 Dec 29, 2022
An efficient framework for reinforcement learning.

rl: An efficient framework for reinforcement learning Requirements Introduction PPO Test Requirements name version Python =3.7 numpy =1.19 torch =1

16 Nov 30, 2022
PyTorch(Geometric) implementation of G^2GNN in "Imbalanced Graph Classification via Graph-of-Graph Neural Networks"

This repository is an official PyTorch(Geometric) implementation of G^2GNN in "Imbalanced Graph Classification via Graph-of-Graph Neural Networks". Th

Yu Wang (Jack) 13 Nov 18, 2022
:boar: :bear: Deep Learning based Python Library for Stock Market Prediction and Modelling

bulbea "Deep Learning based Python Library for Stock Market Prediction and Modelling." Table of Contents Installation Usage Documentation Dependencies

Achilles Rasquinha 1.8k Jan 05, 2023
Code for Discriminative Sounding Objects Localization (NeurIPS 2020)

Discriminative Sounding Objects Localization Code for our NeurIPS 2020 paper Discriminative Sounding Objects Localization via Self-supervised Audiovis

51 Dec 11, 2022
Mesh Graphormer is a new transformer-based method for human pose and mesh reconsruction from an input image

MeshGraphormer ✨ ✨ This is our research code of Mesh Graphormer. Mesh Graphormer is a new transformer-based method for human pose and mesh reconsructi

Microsoft 251 Jan 08, 2023
Official code for "On the Frequency Bias of Generative Models", NeurIPS 2021

Frequency Bias of Generative Models Generator Testbed Discriminator Testbed This repository contains official code for the paper On the Frequency Bias

35 Nov 01, 2022
Fast image augmentation library and easy to use wrapper around other libraries. Documentation: https://albumentations.ai/docs/ Paper about library: https://www.mdpi.com/2078-2489/11/2/125

Albumentations Albumentations is a Python library for image augmentation. Image augmentation is used in deep learning and computer vision tasks to inc

11.4k Jan 09, 2023
A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run.

Minimal Hand A minimal solution to hand motion capture from a single color camera at over 100fps. Easy to use, plug to run. This project provides the

Yuxiao Zhou 824 Jan 07, 2023
Code for the paper "Training GANs with Stronger Augmentations via Contrastive Discriminator" (ICLR 2021)

Training GANs with Stronger Augmentations via Contrastive Discriminator (ICLR 2021) This repository contains the code for reproducing the paper: Train

Jongheon Jeong 174 Dec 29, 2022
An Agnostic Computer Vision Framework - Pluggable to any Training Library: Fastai, Pytorch-Lightning with more to come

IceVision is the first agnostic computer vision framework to offer a curated collection with hundreds of high-quality pre-trained models from torchvision, MMLabs, and soon Pytorch Image Models. It or

airctic 789 Dec 29, 2022
1st place solution to the Satellite Image Change Detection Challenge hosted by SenseTime

1st place solution to the Satellite Image Change Detection Challenge hosted by SenseTime

Lihe Yang 209 Jan 01, 2023
Trading Gym is an open source project for the development of reinforcement learning algorithms in the context of trading.

Trading Gym Trading Gym is an open-source project for the development of reinforcement learning algorithms in the context of trading. It is currently

Dimitry Foures 535 Nov 15, 2022
DeepCAD: A Deep Generative Network for Computer-Aided Design Models

DeepCAD This repository provides source code for our paper: DeepCAD: A Deep Generative Network for Computer-Aided Design Models Rundi Wu, Chang Xiao,

Rundi Wu 85 Dec 31, 2022