Dilated Convolution for Semantic Image Segmentation

Last update: Dec 26, 2022

Multi-Scale Context Aggregation by Dilated Convolutions

Introduction

Properties of dilated convolution are discussed in our ICLR 2016 conference paper. This repository contains the network definitions and the trained models. You can use this code together with vanilla Caffe to segment images using the pre-trained models. If you want to train the models yourself, please check out the document for training.

If you are looking for dilation models with state-of-the-art performance and a Python implementation, please check out Dilated Residual Networks.

Citing

If you find the code or the models useful, please cite this paper:

@inproceedings{YuKoltun2016,
	author    = {Fisher Yu and Vladlen Koltun},
	title     = {Multi-Scale Context Aggregation by Dilated Convolutions},
	booktitle = {ICLR},
	year      = {2016},
}

License

The code and models are released under the MIT License (refer to the LICENSE file for details).

Installation

Caffe

Install Caffe and its Python interface. Make sure that the Caffe version is newer than commit 08c5df.

Python

The companion Python script is used to demonstrate the network definition and trained weights.

The required Python packages are numba, numpy, and opencv. The Anaconda Python distribution is recommended.

If you use Anaconda, install them with:

conda install numba numpy opencv
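
To confirm that the packages are visible to the interpreter you plan to use, a small check like the one below may help. It is only a sketch: the helper name `missing_modules` is hypothetical and not part of this repository, and it assumes the opencv package is imported under its usual module name `cv2`.

```python
import importlib.util

# Import names for the packages listed above; opencv is imported as "cv2".
REQUIRED_MODULES = ("numba", "numpy", "cv2")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

The check only looks the modules up and does not import them, so it does not verify that the Caffe Python interface itself is working.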

Running Demo

predict.py is the main script to test the pre-trained models on images. The basic usage is

python predict.py <dataset name> <image path>

Given the dataset name, the script finds the matching pre-trained model and network definition. Models trained on four datasets are currently supported: pascal_voc, camvid, kitti, and cityscapes. The steps for using the code are listed below:

Clone the code from GitHub

git clone git@github.com:fyu/dilation.git
cd dilation

Download pre-trained network
```
sh pretrained/download_pascal_voc.sh
```

Run the Pascal VOC model on GPU 0

python predict.py pascal_voc images/dog.jpg --gpu 0
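
If you want to call the demo from another Python program, for example to process several images, the command line above can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper names `build_predict_command` and `run_predict` are hypothetical, it assumes `python` on your PATH is the interpreter with Caffe installed, and it only uses the arguments shown in this README (dataset name, image path, and `--gpu`).

```python
import subprocess

# Dataset names listed in this README.
SUPPORTED_DATASETS = ("pascal_voc", "camvid", "kitti", "cityscapes")


def build_predict_command(dataset, image_path, gpu=None):
    """Build the argument list for predict.py without running it."""
    if dataset not in SUPPORTED_DATASETS:
        raise ValueError("unsupported dataset: %s" % dataset)
    command = ["python", "predict.py", dataset, image_path]
    if gpu is not None:
        # Mirrors the "--gpu 0" form used in the example above.
        command += ["--gpu", str(gpu)]
    return command


def run_predict(dataset, image_path, gpu=None):
    """Run predict.py from the repository root; raises if it exits with an error."""
    subprocess.run(build_predict_command(dataset, image_path, gpu), check=True)
```

For instance, `run_predict("pascal_voc", "images/dog.jpg", gpu=0)` would issue the same command as the last step, provided it is called from the `dilation` directory after the pre-trained model has been downloaded.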

Training

You are more than welcome to train our model on a new dataset. To do that, please refer to the document for training.

Implementation of Dilated Convolution

Besides Caffe support, dilated convolution is also implemented in other deep learning packages. For example,

Torch: SpatialDilatedConvolution
Lasagne: DilatedConv2DLayer
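
For readers who want to see the operation itself, here is a small NumPy sketch of a single-channel dilated convolution. It is illustrative only and is not the Caffe, Torch, or Lasagne implementation: the function name `dilated_conv2d` is hypothetical, there is no padding or stride, and, as is common in deep learning packages, the kernel is applied without flipping (cross-correlation).

```python
import numpy as np


def dilated_conv2d(image, kernel, dilation=1):
    """'Valid' 2D cross-correlation of a single-channel image with a dilated kernel."""
    kh, kw = kernel.shape
    # Extent covered by the kernel once its taps are spread apart.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    out_h = image.shape[0] - eff_h + 1
    out_w = image.shape[1] - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel")
    out = np.zeros((out_h, out_w), dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap reads the input at an offset scaled by the dilation.
            r, c = i * dilation, j * dilation
            out += kernel[i, j] * image[r:r + out_h, c:c + out_w]
    return out


if __name__ == "__main__":
    image = np.arange(49, dtype=np.float64).reshape(7, 7)
    kernel = np.ones((3, 3))
    print(dilated_conv2d(image, kernel, dilation=1).shape)  # (5, 5)
    print(dilated_conv2d(image, kernel, dilation=2).shape)  # (3, 3)
```

With dilation 2, the 3x3 kernel covers a 5x5 window while still using only nine weights, which is how dilation enlarges the context seen by each output without adding parameters.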

Owner

Fisher Yu
