Image Segmentation and Object Detection in Pytorch

Overview

Pytorch-Segmentation-Detection is a library for image segmentation and object detection. It provides reported results on common image segmentation and object detection datasets, pretrained models, and scripts to reproduce those results.

Segmentation

PASCAL VOC 2012

The implemented models were trained on the PASCAL VOC 2012 Training data together with the additional Berkeley segmentation data for PASCAL VOC 12. They were tested on either the Restricted PASCAL VOC 2012 Validation dataset (RV-VOC12) or the Full PASCAL VOC 2012 Validation dataset (VOC-2012, shown as VOC12 in the table).

You can find all the scripts that were used for training and evaluation here.

This code has been used to train networks with the following performance:

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link | Related paper |
|---|---|---|---|---|---|---|---|
| Resnet-18-8s | RV-VOC12 | 59.0 | in prog. | in prog. | 28 ms. | Dropbox | DeepLab |
| Resnet-34-8s | RV-VOC12 | 68.0 | in prog. | in prog. | 50 ms. | Dropbox | DeepLab |
| Resnet-50-16s | VOC12 | 66.5 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-50-8s | VOC12 | 67.0 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-50-8s-deep-sup | VOC12 | 67.1 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| Resnet-101-16s | VOC12 | 68.6 | in prog. | in prog. | in prog. | in prog. | DeepLab |
| PSP-Resnet-18-8s | VOC12 | 68.3 | n/a | n/a | n/a | in prog. | PSPnet |
| PSP-Resnet-50-8s | VOC12 | 73.6 | n/a | n/a | n/a | in prog. | PSPnet |
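The table reports mean IOU, mean pixel accuracy, and pixel accuracy. For reference, the sketch below shows how these three metrics are commonly computed from a confusion matrix. It is an illustration only and not necessarily the evaluation code used in this library; the function name `segmentation_metrics` and the toy counts are hypothetical.

```python
def segmentation_metrics(confusion):
    """Compute common segmentation metrics from a square confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    num_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(num_classes))

    ious, class_accs = [], []
    for i in range(num_classes):
        true_pos = confusion[i][i]
        gt_pixels = sum(confusion[i])                   # row sum
        pred_pixels = sum(row[i] for row in confusion)  # column sum
        union = gt_pixels + pred_pixels - true_pos
        # Classes absent from both ground truth and prediction are skipped
        # (one common convention; others exist).
        if union > 0:
            ious.append(true_pos / union)
        if gt_pixels > 0:
            class_accs.append(true_pos / gt_pixels)

    return {
        "mean_iou": sum(ious) / len(ious),
        "mean_pixel_accuracy": sum(class_accs) / len(class_accs),
        "pixel_accuracy": correct / total,
    }


# Toy 2-class example with made-up counts, for illustration only.
metrics = segmentation_metrics([[50, 10], [5, 35]])
```

Mean IOU averages the per-class intersection over union, so a rare class counts as much as a frequent one. Pixel accuracy weights every pixel equally and can therefore look high even when small classes are segmented poorly.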

Some qualitative results:

[Image: qualitative segmentation results on PASCAL VOC 2012]

Endovis 2017

The implemented models were trained on the Endovis 2017 segmentation dataset. Sequence number 3 was held out for validation and was not included in the training set.
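The hold-out split described above can be sketched as follows. This is a minimal illustration, assuming the sequences are identified by the integers 1 through 8; that numbering range and the helper name `split_sequences` are assumptions, not taken from the library.

```python
def split_sequences(sequence_ids, validation_ids):
    """Split sequence ids into disjoint training and validation lists."""
    validation_ids = set(validation_ids)
    train = [s for s in sequence_ids if s not in validation_ids]
    val = [s for s in sequence_ids if s in validation_ids]
    return train, val


# Assumed numbering 1..8 for illustration; sequence 3 is held out.
train_ids, val_ids = split_sequences(range(1, 9), validation_ids=[3])
```

Splitting by whole sequence rather than by individual frame keeps near-identical neighbouring video frames from appearing in both the training and the validation set.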

The code for training and validating the model is also provided in the library.

Additional qualitative results can be found in this YouTube playlist.

Binary Segmentation

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-9-8s | Seq # 3 * | 96.1 | in prog. | in prog. | 13.3 ms. | Dropbox |
| Resnet-18-8s | Seq # 3 | 96.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Seq # 3 | in prog. | in prog. | in prog. | 50 ms. | in prog. |

\* The Resnet-9-8s network was tested at 0.5 reduced resolution (512 x 640).

Qualitative results (on validation sequence):

[Image: qualitative binary segmentation results on the validation sequence]

Multi-class Segmentation

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-18-8s | Seq # 3 | 81.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Seq # 3 | in prog. | in prog. | in prog. | 50 ms. | in prog. |

Qualitative results (on validation sequence):

[Image: qualitative multi-class segmentation results on the validation sequence]

Cityscapes

The dataset contains video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5,000 frames. The annotations cover 19 classes, such as cars, roads, and traffic signs.

| Model | Test data | Mean IOU | Mean pix. accuracy | Pixel accuracy | Inference time (512x512 px. image) | Model Download Link |
|---|---|---|---|---|---|---|
| Resnet-18-32s | Validation set | 61.0 | in prog. | in prog. | in prog. | in prog. |
| Resnet-18-8s | Validation set | 60.0 | in prog. | in prog. | 28 ms. | Dropbox |
| Resnet-34-8s | Validation set | 69.1 | in prog. | in prog. | 50 ms. | Dropbox |
| Resnet-50-16s-PSP | Validation set | 71.2 | in prog. | in prog. | in prog. | in prog. |
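The inference times in these tables are given per 512x512 image. How they were measured is not stated here; the sketch below shows one common way to time a model call. The helper `time_inference` and the stand-in `fake_model` are hypothetical and are not part of the library. When timing on a GPU with PyTorch, `torch.cuda.synchronize()` is usually called before reading the clock, because CUDA kernels run asynchronously.

```python
import time


def time_inference(fn, warmup=3, runs=10):
    """Return the mean wall-clock time of fn() in milliseconds."""
    for _ in range(warmup):  # warm-up calls are not timed
        fn()
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / runs


def fake_model():
    """Stand-in for a forward pass on one 512x512 image."""
    return sum(range(10_000))


mean_ms = time_inference(fake_model)
print(f"{mean_ms:.3f} ms per call")
```

Averaging over several runs after a warm-up reduces the effect of one-time costs such as memory allocation on the reported figure.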

Qualitative results (on validation sequence):

The whole sequence can be viewed here.

[Image: qualitative segmentation results on Cityscapes]

Installation

This code requires:

  1. Pytorch.

  2. Some libraries, which can be obtained by installing the Anaconda distribution.

    Alternatively, you can install scikit-image, matplotlib, and numpy using pip.

  3. Clone the library:

git clone --recursive https://github.com/warmspringwinds/pytorch-segmentation-detection

Then run this code snippet before you start using the library:

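import sys

# Update both paths to point at your clone of the repository.
# All the Jupyter notebooks in the repository already include this snippet.
sys.path.append("/your/path/pytorch-segmentation-detection/")
# The vision fork is inserted at the front of the search path so it is found first.
sys.path.insert(0, "/your/path/pytorch-segmentation-detection/vision/")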

Here we use our fork of pytorch/vision, which might be merged upstream in the future. We have added it as a submodule to our repository.

  4. Manually download the segmentation or detection models that you want to use (links are listed in the model tables).
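The path setup from the snippet above can be wrapped in a small helper that fails early when the clone location is wrong. This is a sketch: `add_library_to_path` is a hypothetical name and not part of the library, and the directory layout (a `vision` subdirectory holding the submodule) is taken from the paths in that snippet.

```python
import os
import sys


def add_library_to_path(repo_root):
    """Make the library and its bundled vision fork importable."""
    vision_dir = os.path.join(repo_root, "vision")
    for path in (repo_root, vision_dir):
        if not os.path.isdir(path):
            raise FileNotFoundError(f"Expected directory not found: {path}")
    # Append the library itself; put the vision fork at the front so it is
    # found before any other copy on the search path.
    if repo_root not in sys.path:
        sys.path.append(repo_root)
    if vision_dir not in sys.path:
        sys.path.insert(0, vision_dir)
    return vision_dir
```

It would be called once at the top of a script or notebook, for example `add_library_to_path("/your/path/pytorch-segmentation-detection")`. An empty `vision` directory usually means the repository was cloned without `--recursive`.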

About

If you use this code in your research, please cite the paper:

@article{pakhomov2017deep,
  title={Deep Residual Learning for Instrument Segmentation in Robotic Surgery},
  author={Pakhomov, Daniil and Premachandran, Vittal and Allan, Max and Azizian, Mahdi and Navab, Nassir},
  journal={arXiv preprint arXiv:1703.08580},
  year={2017}
}

During implementation, some preliminary experiments and notes were reported:

Owner
Daniil Pakhomov
PhD student at JHU. Research interests: Image Classification, Image Segmentation, Face Detection and Face Recognition mostly based on Deep Learning.