Semantic Segmentation in Pytorch

Related tags

Deep Learningsemseg
Overview

PyTorch Semantic Segmentation

Introduction

This repository is a PyTorch implementation for semantic segmentation / scene parsing. The code is easy to use for training and testing on various datasets. The codebase mainly uses ResNet50/101/152 as backbone and can be easily adapted to other basic classification structures. Implemented networks including PSPNet and PSANet, which ranked 1st places in ImageNet Scene Parsing Challenge 2016 @ECCV16, LSUN Semantic Segmentation Challenge 2017 @CVPR17 and WAD Drivable Area Segmentation Challenge 2018 @CVPR18. Sample experimented datasets are ADE20K, PASCAL VOC 2012 and Cityscapes.

Update

  • 2020.05.15: Branch master, use official nn.SyncBatchNorm, only multiprocessing training is supported, tested with pytorch 1.4.0.
  • 2019.05.29: Branch 1.0.0, both multithreading training (nn.DataParallel) and multiprocessing training (nn.parallel.DistributedDataParallel) (recommended) are supported. And the later one is much faster. Use syncbn from EncNet and apex, tested with pytorch 1.0.0.

Usage

  1. Highlight:

  2. Requirement:

    • Hardware: 4-8 GPUs (better with >=11G GPU memory)
    • Software: PyTorch>=1.1.0, Python3, tensorboardX,
  3. Clone the repository:

    git clone https://github.com/hszhao/semseg.git
  4. Train:

    • Download related datasets and symlink the paths to them as follows (you can alternatively modify the relevant paths specified in folder config):

      cd semseg
      mkdir -p dataset
      ln -s /path_to_ade20k_dataset dataset/ade20k
      
    • Download ImageNet pre-trained models and put them under folder initmodel for weight initialization. Remember to use the right dataset format detailed in FAQ.md.

    • Specify the gpu used in config then do training:

      sh tool/train.sh ade20k pspnet50
    • If you are using SLURM for nodes manager, uncomment lines in train.sh and then do training:

      sbatch tool/train.sh ade20k pspnet50
  5. Test:

    • Download trained segmentation models and put them under folder specified in config or modify the specified paths.

    • For full testing (get listed performance):

      sh tool/test.sh ade20k pspnet50
    • Quick demo on one image:

      PYTHONPATH=./ python tool/demo.py --config=config/ade20k/ade20k_pspnet50.yaml --image=figure/demo/ADE_val_00001515.jpg TEST.scales '[1.0]'
  6. Visualization: tensorboardX incorporated for better visualization.

    tensorboard --logdir=exp/ade20k
  7. Other:

    • Resources: GoogleDrive LINK contains shared models, visual predictions and data lists.
    • Models: ImageNet pre-trained models and trained segmentation models can be accessed. Note that our ImageNet pretrained models are slightly different from original ResNet implementation in the beginning part.
    • Predictions: Visual predictions of several models can be accessed.
    • Datasets: attributes (names and colors) are in folder dataset and some sample lists can be accessed.
    • Some FAQs: FAQ.md.
    • Former video predictions: high accuracy -- PSPNet, PSANet; high efficiency -- ICNet.

Performance

Description: mIoU/mAcc/aAcc stands for mean IoU, mean accuracy of each class and all pixel accuracy respectively. ss denotes single scale testing and ms indicates multi-scale testing. Training time is measured on a sever with 8 GeForce RTX 2080 Ti. General parameters cross different datasets are listed below:

  • Train Parameters: sync_bn(True), scale_min(0.5), scale_max(2.0), rotate_min(-10), rotate_max(10), zoom_factor(8), ignore_label(255), aux_weight(0.4), batch_size(16), base_lr(1e-2), power(0.9), momentum(0.9), weight_decay(1e-4).
  • Test Parameters: ignore_label(255), scales(single: [1.0], multiple: [0.5 0.75 1.0 1.25 1.5 1.75]).
  1. ADE20K: Train Parameters: classes(150), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(100). Test Parameters: classes(150), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

    • Setting: train on train (20210 images) set and test on val (2000 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.4189/0.5227/0.8039. 0.4284/0.5266/0.8106. 14h
    PSANet50 0.4229/0.5307/0.8032. 0.4305/0.5312/0.8101. 14h
    PSPNet101 0.4310/0.5375/0.8107. 0.4415/0.5426/0.8172. 20h
    PSANet101 0.4337/0.5385/0.8102. 0.4414/0.5392/0.8170. 20h
  2. PSACAL VOC 2012: Train Parameters: classes(21), train_h(473/465-PSP/A), train_w(473/465-PSP/A), epochs(50). Test Parameters: classes(21), test_h(473/465-PSP/A), test_w(473/465-PSP/A), base_size(512).

    • Setting: train on train_aug (10582 images) set and test on val (1449 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.7705/0.8513/0.9489. 0.7802/0.8580/0.9513. 3.3h
    PSANet50 0.7725/0.8569/0.9491. 0.7787/0.8606/0.9508. 3.3h
    PSPNet101 0.7907/0.8636/0.9534. 0.7963/0.8677/0.9550. 5h
    PSANet101 0.7870/0.8642/0.9528. 0.7966/0.8696/0.9549. 5h
  3. Cityscapes: Train Parameters: classes(19), train_h(713/709-PSP/A), train_w(713/709-PSP/A), epochs(200). Test Parameters: classes(19), test_h(713/709-PSP/A), test_w(713/709-PSP/A), base_size(2048).

    • Setting: train on fine_train (2975 images) set and test on fine_val (500 images) set.
    Network mIoU/mAcc/aAcc(ss) mIoU/mAcc/pAcc(ms) Training Time
    PSPNet50 0.7730/0.8431/0.9597. 0.7838/0.8486/0.9617. 7h
    PSANet50 0.7745/0.8461/0.9600. 0.7818/0.8487/0.9622. 7.5h
    PSPNet101 0.7863/0.8577/0.9614. 0.7929/0.8591/0.9638. 10h
    PSANet101 0.7842/0.8599/0.9621. 0.7940/0.8631/0.9644. 10.5h

Citation

If you find the code or trained models useful, please consider citing:

@misc{semseg2019,
  author={Zhao, Hengshuang},
  title={semseg},
  howpublished={\url{https://github.com/hszhao/semseg}},
  year={2019}
}
@inproceedings{zhao2017pspnet,
  title={Pyramid Scene Parsing Network},
  author={Zhao, Hengshuang and Shi, Jianping and Qi, Xiaojuan and Wang, Xiaogang and Jia, Jiaya},
  booktitle={CVPR},
  year={2017}
}
@inproceedings{zhao2018psanet,
  title={{PSANet}: Point-wise Spatial Attention Network for Scene Parsing},
  author={Zhao, Hengshuang and Zhang, Yi and Liu, Shu and Shi, Jianping and Loy, Chen Change and Lin, Dahua and Jia, Jiaya},
  booktitle={ECCV},
  year={2018}
}

Question

Some FAQ.md collected. You are welcome to send pull requests or give some advices. Contact information: hengshuangzhao at gmail.com.

Owner
Hengshuang Zhao
Hengshuang Zhao
Start-to-finish tutorial for interactive music co-creation in PyTorch and Tensorflow.js

Start-to-finish tutorial for interactive music co-creation in PyTorch and Tensorflow.js

Chris Donahue 98 Dec 14, 2022
A PyTorch implementation of "Signed Graph Convolutional Network" (ICDM 2018).

SGCN ⠀ A PyTorch implementation of Signed Graph Convolutional Network (ICDM 2018). Abstract Due to the fact much of today's data can be represented as

Benedek Rozemberczki 251 Nov 30, 2022
Pytorch Lightning 1.2k Jan 06, 2023
Code repository for our paper "Learning to Generate Scene Graph from Natural Language Supervision" in ICCV 2021

Scene Graph Generation from Natural Language Supervision This repository includes the Pytorch code for our paper "Learning to Generate Scene Graph fro

Yiwu Zhong 64 Dec 24, 2022
This is 2nd term discrete maths project done by UCU students that uses backtracking to solve various problems.

Backtracking Project Sponsors This is a project made by UCU students: Olha Liuba - crossword solver implementation Hanna Yershova - sudoku solver impl

Dasha 4 Oct 17, 2021
TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision

TorchCV: A PyTorch-Based Framework for Deep Learning in Computer Vision @misc{you2019torchcv, author = {Ansheng You and Xiangtai Li and Zhen Zhu a

Donny You 2.2k Jan 06, 2023
Towards End-to-end Video-based Eye Tracking

Towards End-to-end Video-based Eye Tracking The code accompanying our ECCV 2020 publication and dataset, EVE. Authors: Seonwook Park, Emre Aksan, Xuco

Seonwook Park 76 Dec 12, 2022
On the model-based stochastic value gradient for continuous reinforcement learning

On the model-based stochastic value gradient for continuous reinforcement learning This repository is by Brandon Amos, Samuel Stanton, Denis Yarats, a

Facebook Research 46 Dec 15, 2022
To provide 100 JAX exercises over different sections structured as a course or tutorials to teach and learn for beginners, intermediates as well as experts

JaxTon 💯 JAX exercises Mission 🚀 To provide 100 JAX exercises over different sections structured as a course or tutorials to teach and learn for beg

Rohan Rao 512 Jan 01, 2023
Implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hashing by Maximizing Bit Entropy

Deep Unsupervised Image Hashing by Maximizing Bit Entropy This is the PyTorch implementation of accepted AAAI 2021 paper: Deep Unsupervised Image Hash

62 Dec 30, 2022
QuanTaichi evaluation suite

QuanTaichi: A Compiler for Quantized Simulations (SIGGRAPH 2021) Yuanming Hu, Jiafeng Liu, Xuanda Yang, Mingkuan Xu, Ye Kuang, Weiwei Xu, Qiang Dai, W

Taichi Developers 120 Jan 04, 2023
Koopman operator identification library in Python

pykoop pykoop is a Koopman operator identification library written in Python. It allows the user to specify Koopman lifting functions and regressors i

DECAR Systems Group 34 Jan 04, 2023
Activity image-based video retrieval

Cross-modal-retrieval Our approach is focus on Activity Image-to-Video Retrieval (AIVR) task. The compared methods are state-of-the-art single modalit

BCMI 75 Oct 21, 2021
This repo provides the official code for TransBTS: Multimodal Brain Tumor Segmentation Using Transformer (https://arxiv.org/pdf/2103.04430.pdf).

TransBTS: Multimodal Brain Tumor Segmentation Using Transformer This repo is the official implementation for TransBTS: Multimodal Brain Tumor Segmenta

Raymond 247 Dec 28, 2022
A FAIR dataset of TCV experimental results for validating edge/divertor turbulence models.

TCV-X21 validation for divertor turbulence simulations Quick links Intro Welcome to TCV-X21. We're glad you've found us! This repository is designed t

0 Dec 18, 2021
A tensorflow implementation of GCN-LPA

GCN-LPA This repository is the implementation of GCN-LPA (arXiv): Unifying Graph Convolutional Neural Networks and Label Propagation Hongwei Wang, Jur

Hongwei Wang 83 Nov 28, 2022
[CVPR 2021] MiVOS - Mask Propagation module. Reproduced STM (and better) with training code :star2:. Semi-supervised video object segmentation evaluation.

MiVOS (CVPR 2021) - Mask Propagation Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang [arXiv] [Paper PDF] [Project Page] [Papers with Code] This repo impleme

Rex Cheng 106 Jan 03, 2023
Predicting a person's gender based on their weight and height

Logistic Regression Advanced Case Study Gender Classification: Predicting a person's gender based on their weight and height 1. Introduction We turn o

1 Feb 01, 2022
Sentiment analysis translations of the Bhagavad Gita

Sentiment and Semantic Analysis of Bhagavad Gita Translations It is well known that translations of songs and poems not only breaks rhythm and rhyming

Machine learning and Bayesian inference @ UNSW Sydney 3 Aug 01, 2022
A Python framework for conversational search

Chatty Goose Multi-stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting Installation Ma

Castorini 36 Oct 23, 2022