PyTorch implementation of SQN based on CloserLook3D's encoder

Last update: Oct 21, 2021

SQN_pytorch

This repo is an implementation of the Semantic Query Network (SQN) using CloserLook3D's encoder in PyTorch. For the TensorFlow implementation, check our SQN_tensorflow repo.

Caution: this repo does not yet reach the results reported in the SQN paper. For details, check the performance section.

The repo is still under development, with the aim of reaching the level of performance reported in the SQN paper. (Note: our SQN_tensorflow repo has slightly higher performance than this PyTorch repo.)

Please open an issue if you have any comments or suggestions for improving the model performance.

TODOs

Implement the training strategy described in the appendix of the SQN paper.
Run an ablation study.
Benchmark weak supervision.

Install python packages

The latest code has been tested on two Ubuntu setups:

Ubuntu 18.04, Nvidia 1080, CUDA 10.1, PyTorch 1.4 and Python 3.6
Ubuntu 18.04, Nvidia 3090, CUDA 11.3, PyTorch 1.4 and Python 3.6

For details on setting up the development environment, check the CloserLook3D PyTorch version. To simplify setup, I also provide my own bash script (install.sh) below, which creates a conda environment for this repo from scratch. (You may need to tailor this script to your own system.)

#!/bin/bash
ENV_NAME='closerlook'
conda create -n $ENV_NAME python=3.6.10 -y
source activate $ENV_NAME
conda install -c anaconda pillow=6.2 -y
conda install pytorch==1.4.0 torchvision==0.5.0 cudatoolkit=10.1 -c pytorch -y
conda install -c conda-forge opencv -y
pip3 install termcolor tensorboard h5py easydict

Datasets

Take S3DIS as an example.

Scene Segmentation on S3DIS

You can download the S3DIS dataset from here (4.8 GB). You only need the file named Stanford3dDataset_v1.2.zip. Unzip it and move (or link) it to data/S3DIS/Stanford3dDataset_v1.2, the same location the CloserLook3D repo uses.

The file structure should look like:

<root>
├── cfgs
│   └── s3dis
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           └── Area_6
├── init.sh
├── datasets
├── function
├── models
├── ops
└── utils
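Before preprocessing, it can help to confirm that the dataset landed in the expected place. The snippet below is a minimal sketch, not part of this repo: the function name missing_s3dis_areas is hypothetical, and the folder names are taken from the tree above.

```python
from pathlib import Path

# Folder names come from the tree above; the helper itself is hypothetical.
EXPECTED_AREAS = [f"Area_{i}" for i in range(1, 7)]


def missing_s3dis_areas(repo_root="."):
    """Return the Area_* folders absent under data/S3DIS/Stanford3dDataset_v1.2."""
    base = Path(repo_root) / "data" / "S3DIS" / "Stanford3dDataset_v1.2"
    return [name for name in EXPECTED_AREAS if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = missing_s3dis_areas(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("S3DIS layout looks complete.")
```

Run it from the repo root; an empty result means all six areas are in place.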

Run prepare-s3dis-sqn.sh to preprocess the S3DIS dataset. The script generates a processed folder containing five types of data: raw point clouds, sub-sampled point clouds for each area, KD-trees for each sub-sampled area, projection indices that map each raw point onto the sub-sampled cloud, and weak labels for both raw and sub-sampled point clouds (at different weak ratios of the dataset, e.g., 0.1, 0.01, 0.001). For details, see datasets/S3DIS_sqn.py and my summary notes in that file.
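To make the preprocessing outputs more concrete, here is a minimal pure-Python sketch of three of them: grid sub-sampling, projection indices, and a weak-label mask. It is an illustration under stated assumptions, not the code in datasets/S3DIS_sqn.py. All function names are hypothetical, a brute-force nearest-neighbour search stands in for the KD-tree, and the 0.04 grid size is only my reading of the 0.040 in the processed file names.

```python
import math
import random
from collections import defaultdict


def grid_subsample(points, grid_size=0.04):
    """Average all points that fall into the same cubic voxel of edge grid_size."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(c / grid_size) for c in p)
        cells[key].append(p)
    return [tuple(sum(axis) / len(group) for axis in zip(*group))
            for group in cells.values()]


def _sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def projection_indices(raw_points, sub_points):
    """For each raw point, the index of its nearest sub-sampled point.

    Brute force for clarity; a KD-tree would replace this on real scenes.
    """
    return [min(range(len(sub_points)), key=lambda i: _sq_dist(p, sub_points[i]))
            for p in raw_points]


def weak_label_mask(num_points, weak_ratio, seed=0):
    """Boolean mask that keeps a random weak_ratio fraction of labels (at least one)."""
    rng = random.Random(seed)
    keep = max(1, round(num_points * weak_ratio))
    chosen = set(rng.sample(range(num_points), keep))
    return [i in chosen for i in range(num_points)]


# Tiny usage example with three raw points.
raw = [(0.00, 0.00, 0.00), (0.01, 0.01, 0.01), (0.50, 0.50, 0.50)]
sub = grid_subsample(raw, grid_size=0.04)   # the first two points share a voxel
proj = projection_indices(raw, sub)         # -> [0, 0, 1]
mask = weak_label_mask(len(raw), weak_ratio=0.01)
```

The projection indices are what allow predictions made on the sub-sampled cloud to be copied back to every raw point at evaluation time.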

The processed folder is organized as follows:

<root>
├── data
│   └── S3DIS
│       └── Stanford3dDataset_v1.2
│           ├── Area_1
│           ├── Area_2
│           ├── Area_3
│           ├── Area_4
│           ├── Area_5
│           ├── Area_6
│           └── processed
│             ├── weak_label_0.01
│             ├── weak_label_1.0
│             ├── Area_1_0.040_sub.pkl
│             ├── Area_1.pkl
│             ├── ...(many other pkl files)

Compile custom CUDA operators

sh init.sh

Run

Use the run-sqn.sh script for training or evaluation.

The core training script is as follows:

python -m torch.distributed.launch \
--master_port 12345 \
--nproc_per_node ${num_gpu} \
function/train_s3dis_dist_sqn.py \
--dataset_name ${dataset_name} \
--cfg cfgs/${dataset_name}/pospool_xyz_avg_sqn.yaml \
--num_points ${num_points} \
--batch_size ${batch_size} \
--val_freq 20 \
--weak_ratio ${weak_ratio}

The core evaluation script is as follows:

python -m torch.distributed.launch \
--master_port 12346 \
--nproc_per_node 1 \
function/evaluate_s3dis_dist_sqn.py \
--cfg cfgs/s3dis/pospool_xyz_avg_sqn.yaml \
--load_path <checkpoint> \
[--log_dir <log directory>]

Performance on S3DIS

The experiments are still in progress due to my slow GPU.

Model	Weak ratio	Performance (mIoU, %)	Description
Official RandLA-Net	100%	63.0	Fully supervised method trained with full labels.
Official SQN	1/1000	61.4	The official SQN uses additional techniques to improve performance, which our replicated SQN does not yet investigate. The official SQN does not report S3DIS results at weak ratios of 1/10 and 1/100.
Our replicated SQN	1/10	51.4	Uses PosPool (s) with width=36 as the encoder because of limited GPU resources; active learning is not yet used.
Our replicated SQN	1/100	25.22	Uses PosPool (s) with width=36 as the encoder because of limited GPU resources; active learning is not yet used.
Our replicated SQN	1/1000	21.10	Uses PosPool (s) with width=36 as the encoder because of limited GPU resources; active learning is not yet used.
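For readers unfamiliar with the metric in the table, the sketch below shows one common way to compute mIoU from per-point labels. It is a generic illustration, not this repo's evaluation code. The function name mean_iou is hypothetical, and skipping classes that appear in neither the ground truth nor the predictions is an assumption that may differ from the repo's convention.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union in percent over per-point class labels."""
    tp = [0] * num_classes  # points correctly assigned to each class
    fp = [0] * num_classes  # points wrongly assigned to each class
    fn = [0] * num_classes  # points of each class that were missed
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1

    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:  # assumption: ignore classes absent from both inputs
            ious.append(tp[c] / denom)
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is about 58.33.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```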

Acknowledgements

Our PyTorch code borrows heavily from CloserLook3D, and the custom trilinear interpolation CUDA ops are modified from erikwijmans's Pointnet2_PyTorch.

Citation

If you find our work useful in your research, please consider citing:

@article{pytorchpointnet++,
    Author = {YIN, Chao},
    Title = {SQN Pytorch implementation based on CloserLook3D's encoder},
    Journal = {https://github.com/PointCloudYC/SQN_pytorch},
    Year = {2021}
   }

@article{hu2021sqn,
    title={SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000x Fewer Labels},
    author={Hu, Qingyong and Yang, Bo and Fang, Guangchi and Guo, Yulan and Leonardis, Ales and Trigoni, Niki and Markham, Andrew},
    journal={arXiv preprint arXiv:2104.04891},
    year={2021}
  }

Owner

PointCloudYC
