Reimplementation of DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

Last update: Oct 21, 2022

Overview

DFANet

This repo is an unofficial PyTorch implementation of DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation.

Log

2019.4.16 After 483 epochs, training raises RuntimeError: value cannot be converted to type float without overflow: (9.85073e-06,-3.2007e-06). Following advice found on Stack Overflow, the error can be fixed by changing "self.scheduler.step()" to "self.scheduler.step(loss.cpu().data.numpy())" in train.py.
2019.4.24 A function has been written to load the pretrained model, which was trained on ImageNet-1k. The project for training the backbone can be downloaded from https://github.com/huaifeng1993/ILSVRC2012. Limited by my computing resources (a single RTX 2080), I trained the backbone on ILSVRC2012 for only 22 epochs, but it has a great impact on the results.
2019.5.23 It is hard to improve the performance of the model. Maybe the model's details differ from the original paper's, or the hyperparameters, or the training strategy, or something else.
2020.3.7 Rewrote some of the code in this repo to make it more modular (model/dfanet.py).
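
The 2019.4.16 entry passes the loss to the scheduler's step call. That call shape matches schedulers that monitor a metric, such as PyTorch's ReduceLROnPlateau. The sketch below assumes that kind of scheduler, since the log does not say which one train.py uses. The tiny linear model, the random data, and the hyperparameters are placeholders, and loss.item() is shown as a plain-float alternative to loss.cpu().data.numpy().

```python
import torch
from torch import nn, optim

torch.manual_seed(0)

# Placeholder model and data; the real project trains DFANet on Cityscapes.
model = nn.Linear(4, 2)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 2)

initial_lr = 0.1
optimizer = optim.SGD(model.parameters(), lr=initial_lr)

# Assumed scheduler type: it lowers the learning rate when the monitored
# value stops improving, so step() needs that value as an argument.
scheduler = optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=1
)

last_loss = None
for epoch in range(5):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    optimizer.step()

    # Convert the 0-dim loss tensor to a Python float before passing it on.
    last_loss = loss.item()
    scheduler.step(last_loss)

current_lr = optimizer.param_groups[0]["lr"]
print(f"last loss: {last_loss:.6f}, lr: {current_lr}")
```

If train.py uses a scheduler that does not monitor a metric, this sketch does not apply as written.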

Installation

pytorch==1.0.0
python==3.6
numpy
torchvision
matplotlib
opencv-python
tensorflow
tensorboardX
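
A quick way to confirm the environment is complete is to check that each package in the list can be found before training. The following is a minimal sketch using only the standard library. The helper name find_missing is made up for illustration and is not part of this repo. It checks presence only, not the pinned versions.

```python
import importlib.util

# Import names for the packages listed above.
# pytorch is imported as "torch" and opencv-python as "cv2".
REQUIRED_MODULES = [
    "torch",
    "numpy",
    "torchvision",
    "matplotlib",
    "cv2",
    "tensorflow",
    "tensorboardX",
]


def find_missing(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages were found.")
```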

Dataset and pretrained model

Download the Cityscapes dataset and unzip it into the data folder. Then run the command 'python utils/preprocess_data.py' to create the labels.

Train the network without a pretrained model

Modify your configuration in main.py.

Run the command 'python main.py'.

Curves on the Cityscapes set

Inference speed

platform	input size	batch size	inference time (ms)
rk3399	320*200	1	960
rk3399	200*100	1	589
rk3399	80*80	1	259
rk3399	72*72	1	161
2080	1024*1024	4	40
2080	1024*1024	1	16
2080	2048*1024	1	17
2080	2048*1024	2	39
2080	512*512	1	39
2080	512*512	16	44

Some of the experimental results were provided by @ShaoqingGong.
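
To compare rows with different batch sizes, the times in the table can be converted to throughput. The sketch below assumes each time is the latency of one forward pass over the whole batch, which the table does not state explicitly. The function name images_per_second is illustrative.

```python
# (platform, input size, batch size, inference time in ms), copied from the table above.
MEASUREMENTS = [
    ("rk3399", "320*200", 1, 960),
    ("rk3399", "200*100", 1, 589),
    ("rk3399", "80*80", 1, 259),
    ("rk3399", "72*72", 1, 161),
    ("2080", "1024*1024", 4, 40),
    ("2080", "1024*1024", 1, 16),
    ("2080", "2048*1024", 1, 17),
    ("2080", "2048*1024", 2, 39),
    ("2080", "512*512", 1, 39),
    ("2080", "512*512", 16, 44),
]


def images_per_second(batch_size, time_ms):
    """Throughput, assuming time_ms covers one pass over the whole batch."""
    return batch_size * 1000.0 / time_ms


for platform, size, batch, ms in MEASUREMENTS:
    rate = images_per_second(batch, ms)
    print(f"{platform:>6}  {size:>9}  batch {batch:>2}  {rate:8.2f} img/s")
```

Under that assumption, the 2080 row with batch size 4 at 1024*1024 works out to 100 images per second, against 62.5 for batch size 1.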

To do

Train the XceptionA backbone on ImageNet-1k.
Modify the network and improve the accuracy.
Debug and report the performance.
Schedule the learning rate.
...

Owner

shen hui xiang
