An implementation of the research paper "Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network"

Overview

Retina Blood Vessels Segmentation

This is an implementation of the research paper "Retina Blood Vessel Segmentation Using A U-Net Based Convolutional Neural Network" written by Wang Xiancheng, Li Weia, et al.

Check out the standalone demo notebook and run segRetino inferences here.

Open In Colab

Inspiration

Various eye diseases can be diagnosed through the characterization of the retinal blood vessels. The characterization can be extracted by using proper imaging techniques and data analysis methods. In case of eye examination, one of the important tasks is the retinal image segmentation.The paper presents a network and training strategy that relies on the data augmentation to use the available annotated samples more efficiently, to segment retinal blood vessels using a UNET convolutional neural network.

Dataset

We have used the Digital Retinal Images for Vessel Extraction (DRIVE) dataset for retinal vessel segmentation. It consists of a total of JPEG 40 color fundus images; including 7 abnormal pathology cases. Each image resolution is 584x565 pixels with eight bits per color channel (3 channels), resized to 512x512 for our model.

Guidelines to download, setup and use the dataset

The DRIVE dataset may be downloaded here as two files named training.zip and test.zip.

Please write the following commands on your terminal to extract the file in the proper directory.

  $ mkdir drive
  $ unzip </path/to/training.zip> -d </path/to/drive>
  $ unzip </path/to/test.zip> -d </path/to/drive>

The resulting directory structure should be:

/path/to/drive
    -> train
        -> image
            -> 21_training_0.tif
            -> 22_training_0.tif
               ...
        -> mask
            -> 21_training_0.gif
            -> 22_training_0.gif
    -> test
        -> image
            -> 01_test_0.tif
            -> 02_test_0.tif
               ...
        -> mask
            -> 01_test_0.gif
            -> 02_test_0.gif

Model Components

The UNET CNN architecture may be divided into the Encoder, Bottleneck and Decoder blocks, followed by a final segmentation output layer.

  • Encoder: There are 4 Encoder blocks, each consisting of a convolutional block followed by a Spatial Max Pooling layer.
  • Bottleneck: The Bottleneck consists of a single convolutional block.
  • Decoder: There are 4 Decoder blocks, each consisting of a deconvolution operation, followed by a convolutional block, along with skip connections.

Note: The convolutional block consists of 2 conv2d operations each followed by a BatchNorm2d, finally followed by a ReLU activation.

model_arch

Implementation Details

  • Image preprocessing included augmentations like HorizontalFlip, VerticalFlip, Rotate.
  • Dataloader object was created for both training and validation data
  • Training process was carried out for 50 epochs, using the Adam Optimizer with a Learning Rate 1e-4.
  • Validation was carried out using Dice Loss and Intersection over Union Loss.

Installation and Quick Start

To use the repo and run inferences, please follow the guidelines below

  • Cloning the Repository:

      $ git clone https://github.com/srijarkoroy/segRetino
    
  • Entering the directory:

      $ cd segRetino/
    
  • Setting up the Python Environment with dependencies:

      $ pip install -r requirements.txt
    
  • Running the file for inference:

      $ python3 test.py
    

Running the test file downloads the pretrained weights of the UNET Model that we have trained on the DRIVE Dataset. However if you want to re-train the model please mention the path to your dataset on you local machine after augmentations, inside the train.py file, as:

train_x = sorted(glob(<path/to/augmented/train/image/folder/>))
train_y = sorted(glob(<path/to/augmented/mask/image/folder/>))

valid_x = sorted(glob(<path/to/test/image/folder/>))
valid_y = sorted(glob(<path/to/test/mask/folder/>))

Once the path has been mentioned, the model may be trained by running the command:

  $ python3 train.py

Note: If images have not been augmented, please see the instructions for augmentation here.

The test file saves two images in the mentioned paths, a masked image showing only the blood vessels, and a blend image showing the blood vessels within the retina. If you don't want to save the blend image, consider running the following code snippet:

# Creating the SegRetino object initialized with the test image path
seg = SegRetino('<path/to/test/img>')

# Running inference
seg.inference(set_weight_dir = 'unet.pth', path = '<path/to/save/masked/image>', blend=False, blend_path = None)

Check out the standalone demo notebook and run segRetino inferences here.

Note: Is is advisable to use a GPU for running the inferences since performing segmentation on 512x512 images with a heavy UNET architecture is expensive.

Results from Implementation

Original Image Masked Image Blend Image

Contributors

Contribution

Contributions are always welcome! Please check out this doc for Contribution Guidelines.

Owner
Srijarko Roy
AI Enthusiast!
Srijarko Roy
Interactive Visualization to empower domain experts to align ML model behaviors with their knowledge.

An interactive visualization system designed to helps domain experts responsibly edit Generalized Additive Models (GAMs). For more information, check

InterpretML 83 Jan 04, 2023
Deep Markov Factor Analysis (NeurIPS2021)

Deep Markov Factor Analysis (DMFA) Codes and experiments for deep Markov factor analysis (DMFA) model accepted for publication at NeurIPS2021: A. Farn

Sarah Ostadabbas 2 Dec 16, 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation, CVPR2022

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation Paper Links: TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentati

Hust Visual Learning Team 253 Dec 21, 2022
Generate indoor scenes with Transformers

SceneFormer: Indoor Scene Generation with Transformers Initial code release for the Sceneformer paper, contains models, train and test scripts for the

Chandan Yeshwanth 110 Dec 06, 2022
Detectron2 is FAIR's next-generation platform for object detection and segmentation.

Detectron2 is Facebook AI Research's next generation software system that implements state-of-the-art object detection algorithms. It is a ground-up r

Facebook Research 23.3k Jan 08, 2023
Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Code and data to accompany the camera-ready version of "Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation" in EMNLP 2021

Mozhdeh Gheini 16 Jul 16, 2022
Official Implementation of PCT

Official Implementation of PCT Prerequisites python == 3.8.5 Please make sure you have the following libraries installed: numpy torch=1.4.0 torchvisi

32 Nov 21, 2022
Springer Link Download Module for Python

♞ pupalink A simple Python module to search and download books from SpringerLink. 🧪 This project is still in an early stage of development. Expect br

Pupa Corp. 18 Nov 21, 2022
Multi-Modal Fingerprint Presentation Attack Detection: Evaluation On A New Dataset

PADISI USC Dataset This repository analyzes the PADISI-Finger dataset introduced in Multi-Modal Fingerprint Presentation Attack Detection: Evaluation

USC ISI VISTA Computer Vision 6 Feb 06, 2022
Code for Towards Streaming Perception (ECCV 2020) :car:

sAP — Code for Towards Streaming Perception ECCV Best Paper Honorable Mention Award Feb 2021: Announcing the Streaming Perception Challenge (CVPR 2021

Martin Li 85 Dec 22, 2022
A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers.

ViTGAN: Training GANs with Vision Transformers A PyTorch implementation of ViTGAN based on paper ViTGAN: Training GANs with Vision Transformers. Refer

Hong-Jia Chen 127 Dec 23, 2022
Official pytorch implementation of the IrwGAN for unaligned image-to-image translation

IrwGAN (ICCV2021) Unaligned Image-to-Image Translation by Learning to Reweight [Update] 12/15/2021 All dataset are released, trained models and genera

37 Nov 09, 2022
Neon: an add-on for Lightbulb making it easier to handle component interactions

Neon Neon is an add-on for Lightbulb making it easier to handle component interactions. Installation pip install git+https://github.com/neonjonn/light

Neon Jonn 9 Apr 29, 2022
RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems

RCD: Relation Map Driven Cognitive Diagnosis for Intelligent Education Systems This is our implementation for the paper: Weibo Gao, Qi Liu*, Zhenya Hu

BigData Lab @USTC 中科大大数据实验室 10 Oct 16, 2022
TransVTSpotter: End-to-end Video Text Spotter with Transformer

TransVTSpotter: End-to-end Video Text Spotter with Transformer Introduction A Multilingual, Open World Video Text Dataset and End-to-end Video Text Sp

weijiawu 66 Dec 26, 2022
Repository for publicly available deep learning models developed in Rosetta community

trRosetta2 This package contains deep learning models and related scripts used by Baker group in CASP14. Installation Linux/Mac clone the package git

81 Dec 29, 2022
LSTM-VAE Implementation and Relevant Evaluations

LSTM-VAE Implementation and Relevant Evaluations Before using any file in this repository, please create two directories under the root directory name

Lan Zhang 5 Oct 08, 2022
Home repository for the Regularized Greedy Forest (RGF) library. It includes original implementation from the paper and multithreaded one written in C++, along with various language-specific wrappers.

Regularized Greedy Forest Regularized Greedy Forest (RGF) is a tree ensemble machine learning method described in this paper. RGF can deliver better r

RGF-team 364 Dec 28, 2022
Learning to Draw: Emergent Communication through Sketching

Learning to Draw: Emergent Communication through Sketching This is the official code for the paper "Learning to Draw: Emergent Communication through S

19 Jul 22, 2022
MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space

Update (20 Jan 2020): MODALS on text data is avialable MODALS MODALS: Modality-agnostic Automated Data Augmentation in the Latent Space Table of Conte

38 Dec 15, 2022