Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes

Martin Sundermeyer, Arsalan Mousavian, Rudolph Triebel, Dieter Fox
ICRA 2021

paper, project page, video

Installation

This code has been tested with Python 3.7, TensorFlow 2.2, CUDA 10.1, and cuDNN 7.6.0.

Create the conda env:

conda env create -f contact_graspnet_env.yml
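
Then activate it (the environment name comes from the name: field in contact_graspnet_env.yml; contact_graspnet_env below is an assumption, check the yml):

conda activate contact_graspnet_env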

Troubleshooting

  • Recompile pointnet2 tf_ops, see here

Hardware

Training: 1x Nvidia GPU >= 24GB VRAM, >=64GB RAM
Inference: 1x Nvidia GPU >= 8GB VRAM (might work with less)

Download Models and Data

Model

Download trained models from here and copy them into the checkpoints/ folder.

Test data

Download the test data from here and copy it into the test_data/ folder.

Inference

Contact-GraspNet can directly predict a 6-DoF grasp distribution from a raw scene point cloud. However, to obtain object-wise grasps, remove background grasps, and achieve denser proposals, it is highly recommended to use (unknown) object segmentation [e.g. 1, 2] as a preprocessing step, and then to use the resulting segmentation map to crop local regions and filter grasp contacts.

Given a .npy/.npz file with a depth map (in meters), a camera matrix K, and (optionally) a 2D segmentation map, execute:

python contact_graspnet/inference.py \
       --np_path=test_data/*.npy \
       --local_regions --filter_grasps

--> close the window to go to the next scene
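
For reference, a minimal NumPy sketch of assembling such an input file (the file names, the example intrinsics, and the 0-is-background label convention are assumptions; depth must be in meters and K a 3x3 camera matrix):

import numpy as np

depth = np.load('my_depth.npy')       # HxW float32 depth map in meters
K = np.array([[616.4,   0.0, 310.1],  # 3x3 pinhole intrinsics (example values)
              [  0.0, 616.7, 236.8],
              [  0.0,   0.0,   1.0]])
segmap = np.load('my_segmap.npy')     # HxW integer map, e.g. 0 = background, 1..N = objects
rgb = np.load('my_rgb.npy')           # HxWx3 uint8, optional

# Key names must match what inference.py reads: 'depth', 'K', 'segmap', 'rgb'
np.savez('test_data/my_scene.npz', depth=depth, K=K, segmap=segmap, rgb=rgb)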

Given a .npy/.npz file with just a 3D point cloud (in meters), execute, for example:

python contact_graspnet/inference.py --np_path=/path/to/your/pc.npy \
                                     --forward_passes=5 \
                                     --z_range=[0.2,1.1]
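
If you start from a depth image but want to feed a raw point cloud instead, standard pinhole back-projection produces the required Nx3 array. A minimal sketch, independent of the repo's own utilities (my_scene.npz is the example file from above):

import numpy as np

def depth_to_pointcloud(depth, K):
    # Back-project an HxW metric depth map to an Nx3 point cloud in the camera frame.
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.reshape(-1)
    x = (u.reshape(-1) - K[0, 2]) * z / K[0, 0]
    y = (v.reshape(-1) - K[1, 2]) * z / K[1, 1]
    pc = np.stack([x, y, z], axis=1)
    return pc[z > 0].astype(np.float32)  # drop pixels with missing depth

data = np.load('test_data/my_scene.npz')
np.save('my_pc.npy', depth_to_pointcloud(data['depth'], data['K']))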

--np_path: Input .npz/.npy file(s) with 'depth', 'K' and optionally 'segmap', 'rgb' keys. For processing an Nx3 point cloud instead, use 'xyz' and optionally 'xyz_color' as keys.
--ckpt_dir: Relative path to the checkpoint directory. By default checkpoints/scene_test_2048_bs3_hor_sigma_001 is used. For very clean / noisy depth data, consider scene_2048_bs3_rad2_32 / scene_test_2048_bs3_hor_sigma_0025, trained with no / strong noise.
--local_regions: Crop 3D local regions around object segments for inference. (only works with a segmap)
--filter_grasps: Filter grasp contacts such that they only lie on the surface of object segments; see the sketch after this list. (only works with a segmap)
--skip_border_objects: Ignore segments touching the depth map boundary.
--forward_passes: Number of (batched) forward passes. Increase to sample more potential grasp contacts.
--z_range: [min, max] z values in meters used to crop the input point cloud, e.g. to avoid grasps in the foreground/background (as above).
--arg_configs TEST.second_thres:0.19 TEST.first_thres:0.23: Overwrite config confidence thresholds for successful grasp contacts to get more/fewer grasp proposals.
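
Conceptually, --filter_grasps keeps only proposals whose contacts lie on the segmented object. A simplified illustration of that idea, not the repo's actual implementation (all names and the 5 mm radius are illustrative):

import numpy as np

def filter_contacts_by_segment(contact_pts, pc, pc_seg_ids, target_id, radius=0.005):
    # Keep grasp contacts that lie within `radius` meters of a point that
    # belongs to the target segment; `pc_seg_ids` holds one id per scene point.
    obj_pts = pc[pc_seg_ids == target_id]
    if obj_pts.shape[0] == 0:
        return np.empty(0, dtype=int)
    keep = [i for i, c in enumerate(contact_pts)
            if np.min(np.linalg.norm(obj_pts - c, axis=1)) < radius]
    return np.asarray(keep, dtype=int)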

Training

Download Data

Download the Acronym dataset and the ShapeNet meshes, and make the meshes watertight, following these steps.

Download the training data consisting of 10000 table top training scenes with contact grasp information from here and extract it to the same folder:

acronym
├── grasps
├── meshes
├── scene_contacts
└── splits

Train Contact-GraspNet

When training on a headless server, set the environment variable:

export PYOPENGL_PLATFORM='egl'
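
PyOpenGL reads this variable at import time, so it must be set before any OpenGL-backed module is imported; equivalently, at the very top of a Python entry script:

import os
os.environ['PYOPENGL_PLATFORM'] = 'egl'  # must run before pyopengl/pyrender are imported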

Start training with the config contact_graspnet/config.yaml:

python contact_graspnet/train.py --ckpt_dir checkpoints/your_model_name \
                                 --data_path /path/to/acronym/data

Generate Contact Grasps and Scenes yourself (optional)

The scene_contacts downloaded above are generated from the Acronym dataset. To generate/visualize table-top scenes yourself, also pip install the acronym_tools package in your conda environment as described in the acronym repository.

In the first step, object-wise 6-DoF grasps are mapped to their contact points, which are saved in mesh_contacts:

python tools/create_contact_infos.py /path/to/acronym

From the generated mesh_contacts you can then create table-top scenes, saved in scene_contacts, with:

python tools/create_table_top_scenes.py /path/to/acronym

This takes ~3 days on a single thread. Run the command several times in parallel to process on multiple cores.
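
For example, a minimal sketch that launches several workers from Python (the worker count is arbitrary; per the note above, repeated invocations pick up the remaining unprocessed scenes):

import subprocess

# Spawn 8 parallel scene-generation workers over the same dataset folder.
procs = [subprocess.Popen(['python', 'tools/create_table_top_scenes.py',
                           '/path/to/acronym'])
         for _ in range(8)]
for p in procs:
    p.wait()  # block until all workers finish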

You can also visualize existing table-top scenes and grasps:

python tools/create_table_top_scenes.py /path/to/acronym \
       --load_existing scene_contacts/000000.npz -vis

Citation

@inproceedings{sundermeyer2021contact,
  title={Contact-GraspNet: Efficient 6-DoF Grasp Generation in Cluttered Scenes},
  author={Sundermeyer, Martin and Mousavian, Arsalan and Triebel, Rudolph and Fox, Dieter},
  booktitle={2021 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2021}
}