An Implementation of Fully Convolutional Networks in Tensorflow.

Overview

Update

An example on how to integrate this code into your own semantic segmentation pipeline can be found in my KittiSeg project repository.

tensorflow-fcn

This is a one-file Tensorflow implementation of Fully Convolutional Networks. The code can easily be integrated into your semantic segmentation pipeline. The network can be applied directly or finetuned to perform semantic segmentation using tensorflow training code.

Deconvolution layers are initialized as bilinear upsampling. Conv and FCN layer weights are initialized from VGG weights, which are read with numpy load, so neither Caffe nor Caffe-Tensorflow is required to run this. The .npy file for [VGG16] needs to be downloaded before using this network. You can find the file here: ftp://mi.eng.cam.ac.uk/pub/mttt2/models/vgg16.npy
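
Since the bilinear initialization is the one non-obvious piece here, below is a minimal sketch of how such an upsampling kernel is typically computed. The function name and details are illustrative, not quoted from this repository.

import numpy as np

def bilinear_upsample_weights(ksize, num_classes):
    # Build a [ksize, ksize, num_classes, num_classes] kernel that performs
    # bilinear upsampling when used as a transposed-convolution filter.
    factor = (ksize + 1) // 2
    center = factor - 1 if ksize % 2 == 1 else factor - 0.5
    og = np.ogrid[:ksize, :ksize]
    # 2D bilinear interpolation filter
    filt = (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)
    weights = np.zeros((ksize, ksize, num_classes, num_classes), dtype=np.float32)
    for c in range(num_classes):
        weights[:, :, c, c] = filt  # each class channel is upsampled independently
    return weights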

No Pascal VOC finetuning was applied to the weights. The model is meant to be finetuned on your own data. The model can be applied to an image directly (see test_fcn32_vgg.py) but the result will be rather coarse.

Requirements

In addition to tensorflow the following packages are required:

  • numpy
  • scipy
  • pillow
  • matplotlib

Those packages can be installed by running pip install -r requirements.txt or pip install numpy scipy pillow matplotlib.

Tensorflow 1.0rc

This code requires Tensorflow version >= 1.0rc to run. If you want to use an older version, you can try commit bf9400c6303826e1c25bf09a3b032e51cef57e3b. That commit has been tested with the pip versions 0.12, 0.11 and 0.10.

Tensorflow 1.0 comes with a large number of breaking API changes. If you are currently running an older tensorflow version, I would suggest creating a new virtualenv and installing 1.0rc using:

export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.0rc0-cp27-none-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL

The above commands install the Linux version with GPU support. For other versions follow the instructions here.

Usage

Run python test_fcn32_vgg.py to test the implementation.

Use this to build the VGG object for finetuning:

vgg = vgg16.Vgg16()
vgg.build(images, train=True, num_classes=num_classes, random_init_fc8=True)

Here images is a tensor of shape [None, h, w, 3], where h and w can have arbitrary size.

Trick: the tensor can be a placeholder, a variable or even a constant.

Be aware that num_classes influences the way score_fr (the original fc8 layer) is initialized. For finetuning I recommend using the option random_init_fc8=True.
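
For orientation, a minimal sketch of wiring this up with a placeholder and extracting per-pixel predictions might look as follows. The module name vgg16 is taken from the snippet above, num_classes is an example value, and vgg.up is the upsampled score output referenced in the Training section; treat the exact attribute names as assumptions.

import tensorflow as tf
import vgg16  # module providing Vgg16, as in the snippet above

num_classes = 20  # example value

# The input tensor can be a placeholder with arbitrary spatial size.
images = tf.placeholder(tf.float32, shape=[None, None, None, 3])

vgg = vgg16.Vgg16()
# random_init_fc8=True initializes score_fr randomly instead of reusing
# the VGG fc8 weights, which is recommended for finetuning.
vgg.build(images, train=True, num_classes=num_classes, random_init_fc8=True)

# Per-pixel class prediction from the upsampled scores.
prediction = tf.argmax(vgg.up, axis=3)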

Training

Example code for training can be found in the KittiSeg project repository.

Finetuning and training

For training, build the graph using vgg.build(images, train=True, num_classes=num_classes), where images is a queue yielding image batches. Use a softmax_cross_entropy loss function on top of the output of vgg.up. An implementation of the loss function can be found in loss.py.
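
As a rough sketch, such a loss could be written as below. This is in the spirit of loss.py but not quoted from it; logits is the output of vgg.up with shape [batch, h, w, num_classes] and labels is a one-hot tensor of the same shape (both assumptions of this example).

import tensorflow as tf

def loss(logits, labels, num_classes):
    # Softmax cross-entropy over per-pixel class scores.
    with tf.name_scope('loss'):
        logits = tf.reshape(logits, [-1, num_classes])
        labels = tf.reshape(labels, [-1, num_classes])
        cross_entropy = tf.nn.softmax_cross_entropy_with_logits(
            labels=labels, logits=logits)
        return tf.reduce_mean(cross_entropy, name='xentropy_mean')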

To train the graph you need an input producer and a training script. Have a look at TensorVision to see how to build those.

I had success finetuning the network using Adam Optimizer with a learning rate of 1e-6.
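
A corresponding training op could be set up roughly like this; total_loss is assumed to be the scalar returned by the loss function above, and the helper is a sketch rather than code from the repository.

import tensorflow as tf

def training(total_loss, learning_rate=1e-6):
    # Adam with a learning rate of 1e-6, as suggested above.
    optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate)
    return optimizer.minimize(total_loss)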

Content

Currently the following models are provided:

  • FCN32
  • FCN16
  • FCN8

Remark

The deconv layer of tensorflow allows an output shape to be provided, so the crop layer of the original implementation is not needed.
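
The sketch below illustrates the point with tf.nn.conv2d_transpose; the tensor names and shapes are illustrative, not the repository's exact code. Because output_shape is passed explicitly, the upsampled score map can be sized to match the input image directly.

import tensorflow as tf

num_classes = 20  # example value
images = tf.placeholder(tf.float32, [None, None, None, 3])
score_fr = tf.placeholder(tf.float32, [None, None, None, num_classes])  # coarse scores at stride 32

in_shape = tf.shape(images)
output_shape = tf.stack([in_shape[0], in_shape[1], in_shape[2], num_classes])

# 64x64 kernel for 32x upsampling; in practice initialized as bilinear upsampling.
deconv_filter = tf.Variable(tf.zeros([64, 64, num_classes, num_classes]))
upscore = tf.nn.conv2d_transpose(score_fr, deconv_filter,
                                 output_shape=output_shape,
                                 strides=[1, 32, 32, 1],
                                 padding='SAME')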

I have slightly altered the naming of the upscore layer.

Field of View

The receptive field (also known as field of view) of the provided model is:

( ( ( ( ( 7 ) * 2 + 6 ) * 2 + 6 ) * 2 + 6 ) * 2 + 4 ) * 2 + 4 = 404

Predecessors

Weights were generated using Caffe to Tensorflow. The VGG implementation is based on tensorflow-vgg16 and the numpy loading is based on tensorflow-vgg. You do not need any of the above cited code to run the model, nor do you need Caffe.

Install

Installing matplotlib from pip requires the following packages to be installed: libpng-dev, libjpeg8-dev, libfreetype6-dev and pkg-config. On Debian, Linux Mint and Ubuntu systems type:

sudo apt-get install libpng-dev libjpeg8-dev libfreetype6-dev pkg-config
pip install -r requirements.txt

TODO

  • Provide finetuned FCN weights.
  • Provide general training code.