A PyTorch-based real-time segmentation model for autonomous driving

Last update: Dec 22, 2022

Overview

CFPNet: Channel-Wise Feature Pyramid for Real-Time Semantic Segmentation

This project contains the PyTorch implementation of the proposed CFPNet (paper: arXiv:2103.12212).

Real-time semantic segmentation is playing an increasingly important role in computer vision, driven by the growing demand for mobile devices and autonomous driving. It is therefore important to achieve a good trade-off among performance, model size and inference speed. In this paper, we propose a Channel-wise Feature Pyramid (CFP) module to balance those factors. Based on the CFP module, we build CFPNet for real-time semantic segmentation, which applies a series of dilated convolution channels to extract effective features. Experiments on the Cityscapes and CamVid datasets show that the proposed CFPNet achieves an effective combination of those factors. On the Cityscapes test set, CFPNet achieves 70.1% class-wise mIoU with only 0.55 million parameters and 2.5 MB of memory. Inference speed reaches 30 FPS on a single RTX 2080Ti GPU (GPU usage 60%) for a 1024×2048-pixel image.
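To make the role of dilated convolutions concrete, here is a minimal sketch of how dilation widens the area a kernel covers without adding weights. The formula is the standard one for dilated convolutions. The 3×3 kernel and the dilation rates below are assumptions chosen only for illustration, not the values used in CFPNet.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span in pixels covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical dilation rates, for illustration only.
    for rate in (1, 2, 4, 8):
        span = effective_kernel_size(3, rate)
        # The kernel still has 3 x 3 = 9 weights whatever the dilation is.
        print(f"dilation {rate}: covers {span}x{span} pixels with 9 weights")
```

A branch with a larger dilation rate therefore sees more context at the same parameter cost, which is why stacking several rates can keep a model small.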

Installation

Environment: Python 3.6; PyTorch 1.0; CUDA 9.0; cuDNN v7
Install the required packages:

pip install opencv-python pillow numpy matplotlib

Clone this repository

git clone https://github.com/AngeLouCN/CFPNet

One GPU with 11 GB of memory is required.

Dataset

Download the two datasets, CamVid and Cityscapes, and place the files in the dataset folder with the following structure:

├── camvid
|    ├── train
|    ├── test
|    ├── val 
|    ├── trainannot
|    ├── testannot
|    ├── valannot
|    ├── camvid_trainval_list.txt
|    ├── camvid_train_list.txt
|    ├── camvid_test_list.txt
|    └── camvid_val_list.txt
├── cityscapes
|    ├── gtCoarse
|    ├── gtFine
|    ├── leftImg8bit
|    ├── cityscapes_trainval_list.txt
|    ├── cityscapes_train_list.txt
|    ├── cityscapes_test_list.txt
|    └── cityscapes_val_list.txt
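Before training, it can help to confirm that the folder matches the layout above. The following is a small sketch, not part of the repository: the helper name `missing_entries` is hypothetical, and the root folder name `dataset` is an assumption taken from the description above.

```python
from pathlib import Path

# Expected entries, copied from the directory tree above.
EXPECTED = {
    "camvid": [
        "train", "test", "val", "trainannot", "testannot", "valannot",
        "camvid_trainval_list.txt", "camvid_train_list.txt",
        "camvid_test_list.txt", "camvid_val_list.txt",
    ],
    "cityscapes": [
        "gtCoarse", "gtFine", "leftImg8bit",
        "cityscapes_trainval_list.txt", "cityscapes_train_list.txt",
        "cityscapes_test_list.txt", "cityscapes_val_list.txt",
    ],
}


def missing_entries(root):
    """Return the relative paths that are expected under root but absent."""
    root = Path(root)
    return [
        str(Path(name) / entry)
        for name, entries in EXPECTED.items()
        for entry in entries
        if not (root / name / entry).exists()
    ]


if __name__ == "__main__":
    # "dataset" is an assumed root folder name; adjust it to your checkout.
    for path in missing_entries("dataset"):
        print("missing:", path)
```

An empty output means every folder and list file named in the tree is present. The sketch does not check file contents.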

Training

Run python train.py -h to see the details of the optional arguments. In train.py, you can set the dataset, train type, number of epochs, batch size, etc.
Training on the Cityscapes train set:

python train.py --dataset cityscapes

Training on the CamVid train and val sets:

python train.py --dataset camvid --train_type trainval --max_epochs 1000 --lr 1e-3 --batch_size 16

During training, the mean IoU on the train and validation sets and the training loss are recorded every 50 epochs and plotted, so you can check whether training is proceeding normally.
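For readers unfamiliar with the metric, mean IoU can be sketched from a confusion matrix as follows. This is a generic illustration of the standard definition, not the repository's own metric code. The function name `mean_iou` is hypothetical, and skipping classes that never occur is an assumption.

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels whose true class is i and predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(confusion[r][c] for r in range(num_classes)) - true_pos
        union = true_pos + false_pos + false_neg
        if union > 0:  # assumption: ignore classes absent from labels and predictions
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Two classes, 8 pixels: each class has 3 correct pixels and 1 confused pixel.
    print(mean_iou([[3, 1], [1, 3]]))  # each class: 3 / (3 + 1 + 1) = 0.6
```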

[Figures: Val mIoU vs Epochs; Train loss vs Epochs]

Testing

After training, the checkpoint is saved in the checkpoint folder. You can then use test.py to predict the results:

python test.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Evaluation

For datasets that do not provide labels for the test set (e.g. Cityscapes), you can use predict.py to save all the output images, then submit them to the official webpage for evaluation.

python predict.py --dataset ${camvid, cityscapes} --checkpoint ${CHECKPOINT_FILE}

Inference Speed

Run eval_fps.py to measure the model's inference speed, passing the input image size as an argument, e.g. 1024,2048:

python eval_fps.py 1024,2048
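As a rough illustration of how such a measurement is typically structured, the sketch below parses a size string and times repeated calls after a warm-up phase. It is not the code of eval_fps.py. The helper names, the iteration counts and the height,width ordering of the size are assumptions, and a dummy workload stands in for the model.

```python
import time


def parse_size(text):
    """Turn a string such as "1024,2048" into (height, width); the order is assumed."""
    height, width = (int(part) for part in text.split(","))
    return height, width


def measure_fps(run_once, iterations=100, warmup=10):
    """Call run_once repeatedly and return the calls per second after a warm-up."""
    for _ in range(warmup):  # warm-up calls are excluded from the timing
        run_once()
    start = time.perf_counter()
    for _ in range(iterations):
        run_once()
    elapsed = time.perf_counter() - start
    return iterations / elapsed


if __name__ == "__main__":
    height, width = parse_size("1024,2048")
    # Dummy workload standing in for one forward pass of the model.
    # When timing a real GPU model, call torch.cuda.synchronize() before
    # reading the clock, because CUDA kernels run asynchronously.
    fps = measure_fps(lambda: sum(range(10_000)))
    print(f"{height}x{width}: {fps:.1f} FPS (dummy workload)")
```

The warm-up phase matters because the first few iterations usually include one-off costs such as memory allocation, which would otherwise lower the reported FPS.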

Results

Results for CFPNet-V1, CFPNet-V2 and CFPNet-V3:

Dataset	Model	mIoU
Cityscapes	CFPNet-V1	60.4%
Cityscapes	CFPNet-V2	66.5%
Cityscapes	CFPNet-V3	70.1%

Sample results (from top to bottom: original, CFPNet-V1, CFPNet-V2 and CFPNet-V3):

[Figures: Category_acc vs size; Class_acc vs size; Class_acc vs parameter; Class_acc vs speed]

Comparison

Results of Cityscapes

Results of CamVid

Citation

If you find our work helpful, please consider citing:

@article{lou2021cfpnet,
  title={CFPNet: Channel-wise Feature Pyramid for Real-Time Semantic Segmentation},
  author={Lou, Ange and Loew, Murray},
  journal={arXiv preprint arXiv:2103.12212},
  year={2021}
}
