Train robotic agents to learn pick and place with deep learning for vision-based manipulation in PyBullet.

Overview

Ravens - Transporter Networks

Ravens is a collection of simulated tasks in PyBullet for learning vision-based robotic manipulation, with emphasis on pick and place. It features a Gym-like API with 10 tabletop rearrangement tasks, each with (i) a scripted oracle that provides expert demonstrations (for imitation learning), and (ii) reward functions that provide partial credit (for reinforcement learning).


(a) block-insertion: pick up the L-shaped red block and place it into the L-shaped fixture.
(b) place-red-in-green: pick up the red blocks and place them into the green bowls amidst other objects.
(c) towers-of-hanoi: sequentially move disks from one tower to another—only smaller disks can be on top of larger ones.
(d) align-box-corner: pick up the randomly sized box and align one of its corners to the L-shaped marker on the tabletop.
(e) stack-block-pyramid: sequentially stack 6 blocks into a pyramid of 3-2-1 with rainbow colored ordering.
(f) palletizing-boxes: pick up homogeneous fixed-sized boxes and stack them in transposed layers on the pallet.
(g) assembling-kits: pick up different objects and arrange them on a board marked with corresponding silhouettes.
(h) packing-boxes: pick up randomly sized boxes and place them tightly into a container.
(i) manipulating-rope: rearrange a deformable rope such that it connects the two endpoints of a 3-sided square.
(j) sweeping-piles: push piles of small objects into a target goal zone marked on the tabletop.

Some tasks require generalizing to unseen objects (d,g,h), or multi-step sequencing with closed-loop feedback (c,e,f,h,i,j).
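
The closed-loop, multi-step pattern mentioned above can be illustrated with a toy rollout loop. This is only a sketch of a Gym-style reset/step cycle with a scripted oracle and partial-credit rewards; `ToyEnv`, `ToyOracle`, and `run_episode` are hypothetical stand-ins written for this example and do not reproduce the actual Ravens classes or signatures.

```python
# Toy stand-ins only: ToyEnv and ToyOracle are hypothetical and are NOT the
# Ravens classes. They mimic a Gym-style reset/step interface.


class ToyEnv:
    """Move n_blocks blocks into a goal zone; each placed block earns 1/n."""

    def __init__(self, n_blocks=3):
        self.n_blocks = n_blocks
        self.remaining = []

    def reset(self):
        self.remaining = list(range(self.n_blocks))
        return {"remaining": list(self.remaining)}

    def step(self, action):
        reward = 0.0
        if action in self.remaining:
            self.remaining.remove(action)
            reward = 1.0 / self.n_blocks  # partial credit per placed block
        obs = {"remaining": list(self.remaining)}
        done = not self.remaining
        return obs, reward, done, {}


class ToyOracle:
    """Scripted expert: always picks the first block still outside the goal."""

    def act(self, obs):
        return obs["remaining"][0]


def run_episode(env, agent, max_steps=10):
    """Roll out one episode and return (total reward, list of (obs, action))."""
    obs = env.reset()
    total = 0.0
    demo = []
    for _ in range(max_steps):
        action = agent.act(obs)  # closed loop: re-observe before every action
        demo.append((obs, action))
        obs, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total, demo
```

The recorded `(obs, action)` pairs play the role of expert demonstrations for imitation learning, while the per-step reward is the kind of partial credit a reinforcement learning agent would receive.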

Team: this repository is developed and maintained by Andy Zeng, Pete Florence, Daniel Seita, Jonathan Tompson, and Ayzaan Wahid. This is the reference repository for the paper:

Transporter Networks: Rearranging the Visual World for Robotic Manipulation

Project Website  •  PDF  •  Conference on Robot Learning (CoRL) 2020

Andy Zeng, Pete Florence, Jonathan Tompson, Stefan Welker, Jonathan Chien, Maria Attarian, Travis Armstrong,
Ivan Krasin, Dan Duong, Vikas Sindhwani, Johnny Lee

Abstract. Robotic manipulation can be formulated as inducing a sequence of spatial displacements: where the space being moved can encompass an object, part of an object, or end effector. In this work, we propose the Transporter Network, a simple model architecture that rearranges deep features to infer spatial displacements from visual input—which can parameterize robot actions. It makes no assumptions of objectness (e.g. canonical poses, models, or keypoints), it exploits spatial symmetries, and is orders of magnitude more sample efficient than our benchmarked alternatives in learning vision-based manipulation tasks: from stacking a pyramid of blocks, to assembling kits with unseen objects; from manipulating deformable ropes, to pushing piles of small objects with closed-loop feedback. Our method can represent complex multi-modal policy distributions and generalizes to multi-step sequential tasks, as well as 6DoF pick-and-place. Experiments on 10 simulated tasks show that it learns faster and generalizes better than a variety of end-to-end baselines, including policies that use ground-truth object poses. We validate our methods with hardware in the real world.

Installation

Step 1. Recommended: install Miniconda with Python 3.7.

curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -u
echo $'\nexport PATH=~/miniconda3/bin:"${PATH}"\n' >> ~/.profile  # Add Conda to PATH.
source ~/.profile
conda init

Step 2. Create and activate Conda environment, then install GCC and Python packages.

cd ~/ravens
conda create --name ravens python=3.7 -y
conda activate ravens
sudo apt-get update
sudo apt-get -y install gcc libgl1-mesa-dev
pip install -r requirements.txt
python setup.py install --user

Step 3. Recommended: install GPU acceleration with NVIDIA CUDA 10.1 and cuDNN 7.6.5 for TensorFlow.

./oss_scripts/install_cuda.sh  #  For Ubuntu 16.04 and 18.04.
conda install cudatoolkit==10.1.243 -y
conda install cudnn==7.6.5 -y

Alternative: Pure pip

As an example for Ubuntu 18.04:

./oss_scripts/install_cuda.sh  #  For Ubuntu 16.04 and 18.04.
sudo apt install gcc libgl1-mesa-dev python3.8-venv
python3.8 -m venv ./venv
source ./venv/bin/activate
pip install -U pip
pip install scikit-build
pip install -r ./requirements.txt
export PYTHONPATH=${PWD}

Getting Started

Step 1. Generate training and testing data (saved locally). Note: remove --disp for headless mode.

python ravens/demos.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --mode=train --n=10
python ravens/demos.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --mode=test --n=100

To run with shared memory, open a separate terminal window and run python3 -m pybullet_utils.runServer. Then add the --shared_memory flag to the commands above.

Step 2. Train a model, e.g., a Transporter Networks model. Model checkpoints are saved to the checkpoints directory. Optional: you may stop training early, once it has passed 1000 iterations, and skip to the next step.

python ravens/train.py --task=block-insertion --agent=transporter --n_demos=10

Step 3. Evaluate a Transporter Networks agent using the model trained for 1000 iterations. Results are saved locally into .pkl files.

python ravens/test.py --assets_root=./ravens/environments/assets/ --disp=True --task=block-insertion --agent=transporter --n_demos=10 --n_steps=1000

Step 4. Plot and print results.

python ravens/plot.py --disp=True --task=block-insertion --agent=transporter --n_demos=10

Optional. Track training and validation losses with TensorBoard.

python -m tensorboard.main --logdir=logs  # Open the URL it prints in a browser.

Datasets and Pre-Trained Models

Download our generated train and test datasets and pre-trained models.

wget https://storage.googleapis.com/ravens-assets/checkpoints.zip
wget https://storage.googleapis.com/ravens-assets/block-insertion.zip
wget https://storage.googleapis.com/ravens-assets/place-red-in-green.zip
wget https://storage.googleapis.com/ravens-assets/towers-of-hanoi.zip
wget https://storage.googleapis.com/ravens-assets/align-box-corner.zip
wget https://storage.googleapis.com/ravens-assets/stack-block-pyramid.zip
wget https://storage.googleapis.com/ravens-assets/palletizing-boxes.zip
wget https://storage.googleapis.com/ravens-assets/assembling-kits.zip
wget https://storage.googleapis.com/ravens-assets/packing-boxes.zip
wget https://storage.googleapis.com/ravens-assets/manipulating-rope.zip
wget https://storage.googleapis.com/ravens-assets/sweeping-piles.zip
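
The archive names above match the task names from the overview, so the download list can be generated instead of typed out. A minimal Python sketch, assuming only the URL pattern visible in the commands above; the helper `archive_urls` is hypothetical and not part of Ravens.

```python
# Hypothetical helper (not part of Ravens): rebuild the download URLs listed
# above from the task names, using the URL pattern shown in the wget commands.
BASE_URL = "https://storage.googleapis.com/ravens-assets"

TASKS = [
    "block-insertion",
    "place-red-in-green",
    "towers-of-hanoi",
    "align-box-corner",
    "stack-block-pyramid",
    "palletizing-boxes",
    "assembling-kits",
    "packing-boxes",
    "manipulating-rope",
    "sweeping-piles",
]


def archive_urls(tasks=TASKS, include_checkpoints=True):
    """Return the .zip URLs for the given tasks, optionally with checkpoints."""
    names = (["checkpoints"] if include_checkpoints else []) + list(tasks)
    return [f"{BASE_URL}/{name}.zip" for name in names]


if __name__ == "__main__":
    # Print one URL per line, e.g. to pipe into `wget -i -`.
    for url in archive_urls():
        print(url)
```

Passing a subset, such as `archive_urls(["block-insertion"], include_checkpoints=False)`, limits the list to the datasets actually needed.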

The MDP formulation for each task uses transitions with the following structure:

Observations: raw RGB-D images and camera parameters (pose and intrinsics).

Actions: a primitive function (to be called by the robot) and parameters.

Rewards: the rewards over a successful episode should sum to 1.

Info: 6D poses, sizes, and colors of objects.
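
The transition structure above can be pictured as a small record type. The sketch below is illustrative only: the `Transition` class, its field layout, and the dictionary keys are assumptions made for this example, not the format Ravens actually stores on disk. It also shows the reward convention, where a successful episode accumulates a total reward of 1.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Transition:
    """Hypothetical container mirroring the four parts listed above."""

    observation: Dict[str, Any]  # raw RGB-D images plus camera pose/intrinsics
    action: Dict[str, Any]       # primitive function and its parameters
    reward: float                # partial credit earned by this step
    info: Dict[str, Any] = field(default_factory=dict)  # object poses, sizes, colors


def episode_return(transitions: List[Transition]) -> float:
    """Sum the per-step rewards of one episode."""
    return sum(t.reward for t in transitions)


def is_successful(transitions: List[Transition], tol: float = 1e-6) -> bool:
    """A successful episode should accumulate a total reward of 1."""
    return abs(episode_return(transitions) - 1.0) <= tol


# Example: a two-step episode where each step earns half of the credit.
# All keys and values here are placeholders chosen for illustration.
episode = [
    Transition({"color": None, "depth": None}, {"primitive": "pick_place"}, 0.5),
    Transition({"color": None, "depth": None}, {"primitive": "pick_place"}, 0.5),
]
```

Here `is_successful(episode)` is true, while an episode that stops after the first step has a return of 0.5 and counts as only partially complete.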
