Pixel-wise segmentation on the VOC2012 dataset using PyTorch.

Last update: Dec 30, 2022

Overview

PiWiSe

For a more complete implementation of segmentation networks, check out semseg.

Note:

FCN differs from the original implementation (see this issue).
SegNet does not match the performance reported in the original paper (see here).
PSPNet lacks "atrous convolution" (the conv layers of ResNet101 should be amended to preserve image size).

With this in mind, feel free to open a PR. Thank you!
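
To illustrate the PSPNet note: atrous (dilated) convolution enlarges the receptive field without shrinking the feature map. The following is a minimal sketch of the size arithmetic only, using the output-size formula from the PyTorch Conv2d documentation. The helper name and the 60-pixel example are assumptions for illustration and are not taken from this repository.

```python
def conv_output_size(size, kernel, stride=1, padding=0, dilation=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# A strided 3x3 convolution halves a 60-pixel feature map.
strided = conv_output_size(60, kernel=3, stride=2, padding=1)

# A dilated 3x3 convolution (dilation=2, padding=2, stride=1) keeps the size
# while each kernel covers dilation * (kernel - 1) + 1 = 5 input pixels.
dilated = conv_output_size(60, kernel=3, stride=1, padding=2, dilation=2)

print(strided)  # 30
print(dilated)  # 60
```

Replacing a stride of 2 with a dilation of 2 (and matching padding) is the usual way a backbone is adapted so that later layers keep their resolution.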

Setup

See dataset examples here.

Download

Download the image archive, extract it, and run:

mkdir data
mv VOCdevkit/VOC2012/JPEGImages data/images
mv VOCdevkit/VOC2012/SegmentationClass data/classes
rm -rf VOCdevkit
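
After these commands, data/images holds the JPEG inputs and data/classes holds the PNG class masks. In VOC2012 only a subset of the images has a segmentation mask, so it can help to check which files pair up. A minimal sketch, assuming images and masks share the same file stem; the function name is hypothetical and this is not the repository's dataset loader.

```python
from pathlib import Path


def paired_samples(images_dir, classes_dir):
    """Return (image, mask) path pairs that share a file stem, sorted by stem."""
    images = {p.stem: p for p in Path(images_dir).glob("*.jpg")}
    masks = {p.stem: p for p in Path(classes_dir).glob("*.png")}
    # Keep only stems that exist on both sides.
    common = sorted(images.keys() & masks.keys())
    return [(images[stem], masks[stem]) for stem in common]
```

Calling paired_samples("data/images", "data/classes") should then list exactly the samples usable for training.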

Install

We recommend using pyenv:

pyenv virtualenv 3.6.0 piwise
pyenv activate piwise

Then install the requirements with pip install -r requirements.txt.

Usage

For the latest documentation, run:

python main.py --help

Supported values for the --model parameter are fcn8, fcn16, fcn32, unet, segnet1, segnet2, and pspnet.

Training

For visualization, start a Visdom server in a separate terminal tab:

python -m visdom.server -port 5000

Train the SegNet model for 30 epochs with CUDA support, visualization every 50 steps, and a checkpoint every 100 steps:

python main.py --cuda --model segnet2 train --datadir data \
    --num-epochs 30 --num-workers 4 --batch-size 4 \
    --steps-plot 50 --steps-save 100
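
The --steps-plot and --steps-save values describe periodic actions inside the training loop. A minimal sketch of that pattern, assuming steps are counted from 1 and that a value of 0 disables the action; both are assumptions, as is every name below, and the actual main.py may count differently.

```python
def is_due(step, every):
    """True when `step` is a positive multiple of `every`; 0 disables the action."""
    return every > 0 and step > 0 and step % every == 0


def schedule(num_steps, steps_plot, steps_save):
    """List the (step, action) events that fire during `num_steps` steps."""
    events = []
    for step in range(1, num_steps + 1):
        if is_due(step, steps_plot):
            events.append((step, "plot"))
        if is_due(step, steps_save):
            events.append((step, "save"))
    return events


print(schedule(100, steps_plot=50, steps_save=100))
# [(50, 'plot'), (100, 'plot'), (100, 'save')]
```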

Evaluation

To run semantic segmentation on foo.jpg with a trained checkpoint:

python main.py --model segnet2 --state segnet2-30-0 eval foo.jpg foo.png

The segmented class image can now be found at foo.png.
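
The train and eval invocations in this README both place global options before a subcommand. A minimal argparse sketch of such a layout, reconstructed only from the flags shown here; it is not the repository's main.py, and the option types, the `mode` destination, and the positional names `image` and `label` are assumptions.

```python
import argparse

MODELS = ["fcn8", "fcn16", "fcn32", "unet", "segnet1", "segnet2", "pspnet"]


def build_parser():
    parser = argparse.ArgumentParser(prog="main.py")
    # Global options come before the subcommand.
    parser.add_argument("--cuda", action="store_true")
    parser.add_argument("--model", choices=MODELS, required=True)
    parser.add_argument("--state")

    sub = parser.add_subparsers(dest="mode", required=True)

    train = sub.add_parser("train")
    train.add_argument("--datadir", required=True)
    train.add_argument("--num-epochs", type=int)
    train.add_argument("--num-workers", type=int)
    train.add_argument("--batch-size", type=int)
    train.add_argument("--steps-plot", type=int)
    train.add_argument("--steps-save", type=int)

    evaluate = sub.add_parser("eval")
    evaluate.add_argument("image")  # input picture, e.g. foo.jpg
    evaluate.add_argument("label")  # output class image, e.g. foo.png
    return parser


args = build_parser().parse_args(
    ["--model", "segnet2", "--state", "segnet2-30-0", "eval", "foo.jpg", "foo.png"]
)
print(args.mode, args.model, args.image, args.label)
```

For the authoritative list of options, rely on python main.py --help.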

Results

These are some results from SegNet after 40 epochs. Set

loss_weights[0] = 1 / 1

to handle the class imbalance gracefully.
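
Per-class loss weights are a common way to handle class imbalance in segmentation. A minimal pure-Python sketch of a weighted cross-entropy, assuming loss_weights is indexed by class id and that index 0 is the background class (as in the VOC label encoding). The function and the toy numbers are illustrative and are not the repository's loss code; the normalization by the sum of weights mirrors the weighted mean documented for PyTorch's CrossEntropyLoss.

```python
import math


def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-pixel negative log-likelihoods.

    logits:  list of per-pixel class scores, one list per pixel
    targets: list of true class ids, one per pixel
    weights: list of per-class weights
    """
    total, norm = 0.0, 0.0
    for pixel_logits, target in zip(logits, targets):
        # Numerically stable log-sum-exp over the class scores.
        peak = max(pixel_logits)
        log_sum = peak + math.log(sum(math.exp(z - peak) for z in pixel_logits))
        nll = log_sum - pixel_logits[target]
        total += weights[target] * nll
        norm += weights[target]
    return total / norm


# Two pixels, two classes: an easy background pixel and a harder foreground one.
logits = [[2.0, 0.0], [0.0, 0.0]]
targets = [0, 1]

uniform = weighted_cross_entropy(logits, targets, [1.0, 1.0])
downweighted = weighted_cross_entropy(logits, targets, [0.1, 1.0])
print(uniform < downweighted)  # True: the foreground pixel now dominates the loss
```

Lowering the background weight shifts the loss toward the rarer foreground classes, which is the effect the loss_weights setting aims for.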

Input	Output	Ground Truth

Owner

Bodo Kaiser
