Source code for ZePHyR: Zero-shot Pose Hypothesis Rating @ ICRA 2021

Overview

ZePHyR: Zero-shot Pose Hypothesis Rating

ZePHyR is a zero-shot 6D object pose estimation pipeline. The core is a learned scoring function that compares the sensor observation to a sparse object rendering of each candidate pose hypothesis. We used PointNet++ as the network structure and trained and tested on YCB-V and LM-O dataset.

[ArXiv] [Project Page] [Video] [BibTex]

ZePHyR pipeline animation

Get Started

First, checkout this repo by

git clone --recurse-submodules [email protected]:r-pad/zephyr.git

Set up environment

  1. We recommend building the environment and install all required packages using Anaconda.
conda env create -n zephyr --file zephyr_env.yml
conda activate zephyr
  1. Install the required packages for compiling the C++ module
sudo apt-get install build-essential cmake libopencv-dev python-numpy
  1. Compile the c++ library for python bindings in the conda virtual environment
mkdir build
cd build
cmake .. -DPYTHON_EXECUTABLE=$(python -c "import sys; print(sys.executable)") -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())")  -DPYTHON_LIBRARY=$(python -c "import distutils.sysconfig as sysconfig; print(sysconfig.get_config_var('LIBDIR'))")
make; make install
  1. Install the current python package
cd .. # move to the root folder of this repo
pip install -e .

Download pre-processed dataset

Download pre-processed training and testing data (ycbv_preprocessed.zip, lmo_preprocessed.zip and ppf_hypos.zip) from this Google Drive link and unzip it in the python/zephyr/data folder. The unzipped data takes around 66GB of storage in total.

The following commands need to be run in python/zephyr/ folder.

cd python/zephyr/

Example script to run the network

To use the network, an example is provided in notebooks/TestExample.ipynb. In the example script, a datapoint is loaded from LM-O dataset provided by the BOP Challenge. The pose hypotheses is provided by PPF algorithm (extracted from ppf_hypos.zip). Despite the complex dataloading code, only the following data of the observation and the model point clouds is needed to run the network:

  • img: RGB image, np.ndarray of size (H, W, 3) in np.uint8
  • depth: depth map, np.ndarray of size (H, W) in np.float, in meters
  • cam_K: camera intrinsic matrix, np.ndarray of size (3, 3) in np.float
  • model_colors: colors of model point cloud, np.ndarray of size (N, 3) in float, scaled in [0, 1]
  • model_points: xyz coordinates of model point cloud, np.ndarray of size (N, 3) in float, in meters
  • model_normals: normal vectors of mdoel point cloud, np.ndarray of size (N, 3) in float, each L2 normalized
  • pose_hypos: pose hypotheses in camera frame, np.ndarray of size (K, 4, 4) in float

Run PPF algorithm using HALCON software

The PPF algorithm we used is the surface matching function implmemented in MVTec HALCON software. HALCON provides a Python interface for programmers together with its newest versions. I wrote a simple wrapper which calls create_surface_model() and find_surface_model() to get the pose hypotheses. See notebooks/TestExample.ipynb for how to use it.

The wrapper requires the HALCON 21.05 to be installed, which is a commercial software but it provides free licenses for students.

If you don't have access to HALCON, sets of pre-estimated pose hypotheses are provided in the pre-processed dataset.

Test the network

Download the pretrained pytorch model checkpoint from this Google Drive link and unzip it in the python/zephyr/ckpts/ folder. We provide 3 checkpoints, two trained on YCB-V objects with odd ID (final_ycbv.ckpt) and even ID (final_ycbv_valodd.ckpt) respectively, and one trained on LM objects that are not in LM-O dataset (final_lmo.ckpt).

Test on YCB-V dataset

Test on the YCB-V dataset using the model trained on objects with odd ID

python test.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_test/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_ycbv.ckpt

Test on the YCB-V dataset using the model trained on objects with even ID

python test.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_test/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_ycbv_valodd.ckpt

Test on LM-O dataset

python test.py \
    --model_name pn2 \
    --dataset_root ./data/lmo/matches_data_test/ \
    --dataset_name lmo \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final \
    --resume_path ./ckpts/final_lmo.ckpt

The testing results will be stored in test_logs and the results in BOP Challenge format will be in test_logs/bop_results. Please refer to bop_toolkit for converting the results to BOP Average Recall scores used in BOP challenge.

Train the network

Train on YCB-V dataset

These commands will train the network on the real-world images in the YCB-Video training set.

On object Set 1 (objects with odd ID)

python train.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_train/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final

On object Set 2 (objects with even ID)

python train.py \
    --model_name pn2 \
    --dataset_root ./data/ycb/matches_data_train/ \
    --dataset_name ycbv \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --val_obj odd \
    --exp_name final_valodd

Train on LM-O synthetic dataset

This command will train the network on the synthetic images provided by BlenderProc4BOP. We take the lm_train_pbr.zip as the training set but the network is only supervised on objects that is in Linemod but not in Linemod-Occluded (i.e. IDs for training objects are 2 3 4 7 13 14 15).

python train.py \
    --model_name pn2 \
    --dataset_root ./data/lmo/matches_data_train/ \
    --dataset_name lmo \
    --dataset HSVD_diff_uv_norm \
    --no_valid_proj --no_valid_depth \
    --loss_cutoff log \
    --exp_name final

Cite

If you find this codebase useful in your research, please consider citing:

@inproceedings{icra2021zephyr,
    title={ZePHyR: Zero-shot Pose Hypothesis Rating},
    author={Brian Okorn, Qiao Gu, Martial Hebert, David Held},
    booktitle={2021 International Conference on Robotics and Automation (ICRA)},
    year={2021}
}

Reference

Owner
R-Pad - Robots Perceiving and Doing
This is the repository for the R-Pad lab at CMU.
R-Pad - Robots Perceiving and Doing
The pyrelational package offers a flexible workflow to enable active learning with as little change to the models and datasets as possible

pyrelational is a python active learning library developed by Relation Therapeutics for rapidly implementing active learning pipelines from data management, model development (and Bayesian approximat

Relation Therapeutics 95 Dec 27, 2022
The project covers common metrics for super-resolution performance evaluation.

Super-Resolution Performance Evaluation Code The project covers common metrics for super-resolution performance evaluation. Metrics support The script

xmy 10 Aug 03, 2022
A new GCN model for Point Cloud Analyse

Pytorch Implementation of PointNet and PointNet++ This repo is implementation for VA-GCN in pytorch. Classification (ModelNet10/40) Data Preparation D

12 Feb 02, 2022
The GitHub repository for the paper: “Time Series is a Special Sequence: Forecasting with Sample Convolution and Interaction“.

SCINet This is the original PyTorch implementation of the following work: Time Series is a Special Sequence: Forecasting with Sample Convolution and I

386 Jan 01, 2023
Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for the task of Visual Document Understanding (VDU)

DocFormer - PyTorch Implementation of DocFormer: End-to-End Transformer for Document Understanding, a multi-modal transformer based architecture for t

171 Jan 06, 2023
領域を指定し、キーを入力することで画像を保存するツールです。クラス分類用のデータセット作成を想定しています。

image-capture-class-annotation 領域を指定し、キーを入力することで画像を保存するツールです。 クラス分類用のデータセット作成を想定しています。 Requirement OpenCV 3.4.2 or later Usage 実行方法は以下です。 起動後はマウスクリック4

KazuhitoTakahashi 5 May 28, 2021
A simple python module to generate anchor (aka default/prior) boxes for object detection tasks.

PyBx WIP A simple python module to generate anchor (aka default/prior) boxes for object detection tasks. Calculated anchor boxes are returned as ndarr

thatgeeman 4 Dec 15, 2022
Scenic: A Jax Library for Computer Vision and Beyond

Scenic Scenic is a codebase with a focus on research around attention-based models for computer vision. Scenic has been successfully used to develop c

Google Research 1.6k Dec 27, 2022
This is the official code for the paper "Ad2Attack: Adaptive Adversarial Attack for Real-Time UAV Tracking".

Ad^2Attack:Adaptive Adversarial Attack on Real-Time UAV Tracking Demo video 📹 Our video on bilibili demonstrates the test results of Ad^2Attack on se

Intelligent Vision for Robotics in Complex Environment 10 Nov 07, 2022
PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer.

Unsupervised_IEPGAN This is the PyTorch implementation of our ICCV 2021 paper Intrinsic-Extrinsic Preserved GANs for Unsupervised 3D Pose Transfer. Ha

25 Oct 26, 2022
Markov Attention Models

Introduction This repo contains code for reproducing the results in the paper Graphical Models with Attention for Context-Specific Independence and an

Vicarious 0 Dec 09, 2021
Official code implementation for "Personalized Federated Learning using Hypernetworks"

Personalized Federated Learning using Hypernetworks This is an official implementation of Personalized Federated Learning using Hypernetworks paper. [

Aviv Shamsian 121 Dec 25, 2022
Tree LSTM implementation in PyTorch

Tree-Structured Long Short-Term Memory Networks This is a PyTorch implementation of Tree-LSTM as described in the paper Improved Semantic Representati

Riddhiman Dasgupta 529 Dec 10, 2022
PyTorch implementation for STIN

STIN This repository contains PyTorch implementation for STIN. Abstract: In single-photon LiDAR, photon-efficient imaging captures the 3D structure of

Yiweins 2 Nov 22, 2022
Code release for NeuS

NeuS We present a novel neural surface reconstruction method, called NeuS, for reconstructing objects and scenes with high fidelity from 2D image inpu

Peng Wang 813 Jan 04, 2023
Multi-agent reinforcement learning algorithm and environment

Multi-agent reinforcement learning algorithm and environment [en/cn] Pytorch implements multi-agent reinforcement learning algorithms including IQL, Q

万鲲鹏 7 Sep 20, 2022
Official Repository for the ICCV 2021 paper "PixelSynth: Generating a 3D-Consistent Experience from a Single Image"

PixelSynth: Generating a 3D-Consistent Experience from a Single Image (ICCV 2021) Chris Rockwell, David F. Fouhey, and Justin Johnson [Project Website

Chris Rockwell 95 Nov 22, 2022
Recurrent Neural Network Tutorial, Part 2 - Implementing a RNN in Python and Theano

Please read the blog post that goes with this code! Jupyter Notebook Setup System Requirements: Python, pip (Optional) virtualenv To start the Jupyter

Denny Britz 863 Dec 15, 2022
Modelisation on galaxy evolution using PEGASE-HR

model_galaxy Modelisation on galaxy evolution using PEGASE-HR This is a labwork done in internship at IAP directed by Damien Le Borgne (https://github

Adrien Anthore 1 Jan 14, 2022
PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short-Term Transformer for Online Action Detection".

Long Short-Term Transformer for Online Action Detection Introduction This is a PyTorch implementation for our NeurIPS 2021 Spotlight paper "Long Short

77 Dec 16, 2022