Baseline and template code for node21 detection track

Overview

Nodule Detection Algorithm

This codebase implements a baseline model, Faster R-CNN, for the nodule detection track in NODE21. It contains all necessary files to build a docker image which can be submitted as an algorithm on the grand-challenge platform. Participants in the nodule detection track can use this codebase as a template to understand how to create their own algorithm for submission.

To serve this algorithm in a docker container compatible with the requirements of grand-challenge, we used evalutils, which provides methods to wrap your algorithm in Docker containers. It automatically generates template scripts for your container files and creates commands for building, testing, and exporting the algorithm container. We adapted this template code for our algorithm by following the general tutorial on how to create a grand-challenge algorithm.

We also explain this template repository, and how to set up your docker container in the video. Before diving into the details of this template code we recommend readers have the pre-requisites installed and have cloned this repository as described below:

Prerequisites

The code in this repository is based on docker and evalutils.

Windows Tip: For participants using Windows, it is highly recommended to install Windows Subsystem for Linux (WSL) to work with Docker on a Linux environment within Windows. Please make sure to install WSL 2 by following the instructions on the same page. The alternative is to work purely out of Ubuntu, or any other flavor of Linux. Also, note that the basic version of WSL 2 does not come with GPU support. Please watch the official tutorial by Microsoft on installing WSL 2 with GPU support.

Please clone the repository as follows:

git clone git@github.com:node21challenge/node21_detection_baseline.git

Table of Contents

An overview of the baseline algorithm
Configure the Docker file
Build, test and export your container
Submit your algorithm

An overview of the baseline algorithm

The baseline nodule detection algorithm is a Faster R-CNN model implemented with the PyTorch library. The main file executed by the docker container is process.py.

Input and Output Interfaces

The algorithm needs to perform nodule detection on a given chest X-ray image (CXR): it predicts a bounding box wherever a nodule is suspected and returns the bounding boxes with an associated likelihood for each one. The algorithm takes a CXR as input and outputs a nodules.json file. All algorithms submitted to the nodule detection track must comply with these input and output interfaces. The algorithm reads the input:

  • CXR at "/input/ .mha"

and writes the output to

  • nodules.json file at "/output/nodules.json".

The nodules.json file contains the predicted bounding box locations and associated nodule likelihoods (probabilities). This file is a dictionary and contains multiple 2D bounding box coordinates in a CIRRUS-compatible format. The coordinates are expected in millimeters when spacing information is available. We provide a function in process.py which converts the predictions of the Faster R-CNN model (2D pixel coordinates) to this format. An example json file is as follows:

{
    "type": "Multiple 2D bounding boxes",
    "boxes": [
        {
        "corners": [
            [ 92.66666412353516, 136.06668090820312, 0],
            [ 54.79999923706055, 136.06668090820312, 0],
            [ 54.79999923706055, 95.53333282470703, 0],
            [ 92.66666412353516, 95.53333282470703, 0]
        ],
        "probability": 0.6
        },
        {
        "corners": [
            [ 92.66666412353516, 136.06668090820312, 0],
            [ 54.79999923706055, 136.06668090820312, 0],
            [ 54.79999923706055, 95.53333282470703, 0],
            [ 92.66666412353516, 95.53333282470703, 0]
        ]}
    ],
    "version": { "major": 1, "minor": 0 }
}
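The conversion from pixel coordinates to this format can be sketched as follows. This is a minimal illustration, not the function from process.py: the name boxes_to_nodules_json and its parameters are hypothetical, and the corner order simply follows the example above (maximum x and y first).

```python
import json


def boxes_to_nodules_json(boxes, scores, slice_indices, spacing=(1.0, 1.0)):
    """Convert pixel-space boxes to a 'Multiple 2D bounding boxes' dictionary.

    boxes: list of (x_min, y_min, x_max, y_max) in pixels
    scores: list of nodule likelihoods, one per box
    slice_indices: index of the CXR in the stack for each box (z coordinate)
    spacing: (x, y) pixel spacing in millimeters; (1.0, 1.0) keeps pixel units
    """
    result = {
        "type": "Multiple 2D bounding boxes",
        "boxes": [],
        "version": {"major": 1, "minor": 0},
    }
    for (x_min, y_min, x_max, y_max), score, z in zip(boxes, scores, slice_indices):
        # Scale pixel coordinates to millimeters using the image spacing.
        x_min, x_max = x_min * spacing[0], x_max * spacing[0]
        y_min, y_max = y_min * spacing[1], y_max * spacing[1]
        # Corner order mirrors the example file shown above.
        corners = [
            [x_max, y_max, z],
            [x_min, y_max, z],
            [x_min, y_min, z],
            [x_max, y_min, z],
        ]
        result["boxes"].append({"corners": corners, "probability": float(score)})
    return result


example = boxes_to_nodules_json(
    boxes=[(10, 20, 30, 40)], scores=[0.6], slice_indices=[0], spacing=(0.5, 0.5)
)
print(json.dumps(example, indent=4))
```

The resulting dictionary can be written to /output/nodules.json with json.dump.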

The implementation of the algorithm inference in process.py is straightforward (and must be followed by participants creating their own algorithm): load the model in the init function of the class, and implement a function called predict to perform inference on a CXR image. The function predict is run by evalutils when the process function is called. Since we want to save the predictions produced by the predict function directly as a nodules.json file, we have overridden the function process_case of evalutils. We recommend that you copy this implementation into your file as well.

Operating on a 3D image (Stack of 2D CXR images)

To keep the NODE21 evaluation time-efficient, submitted algorithms are expected to operate on a 3D image which consists of multiple CXR images stacked together. The algorithm should go through the slices (CXR images) one by one and process them individually, as shown in predict. When outputting results, the third coordinate of each bounding box in the nodules.json file identifies the CXR within the stack. If the algorithm processes the first CXR image in the 3D volume, the z coordinate of its output should be 0; if it processes the third CXR image, it should be 2, and so on.
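The slice-by-slice loop can be sketched as below. This is an illustration under assumptions rather than the predict function of the baseline: predict_stack, detect_slice and dummy_detector are hypothetical names, and the volume is represented as a plain nested list so that the sketch has no dependencies.

```python
def predict_stack(volume, detect_slice):
    """Run a 2D detector on every slice of a stacked volume.

    volume: sequence of 2D images, indexed along the stacking axis
    detect_slice: hypothetical callable returning a list of
        ((x_min, y_min, x_max, y_max), score) pairs for one image
    """
    predictions = []
    for z, image in enumerate(volume):
        for box, score in detect_slice(image):
            # The slice index becomes the z coordinate in nodules.json.
            predictions.append({"box": box, "score": score, "slice": z})
    return predictions


def dummy_detector(image):
    # Stand-in for a real model: reports one box covering the whole image.
    height, width = len(image), len(image[0])
    return [((0, 0, width, height), 0.5)]


stack = [[[0] * 4 for _ in range(3)] for _ in range(3)]  # three 3x4 "images"
results = predict_stack(stack, dummy_detector)
print([r["slice"] for r in results])  # [0, 1, 2]
```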

Running the container in multiple phases:

A selection of NODE21 algorithms will be chosen, based on performance and diversity of methodology, for further experimentation and inclusion in a peer-reviewed article. The owners of these algorithms (maximum 3 per algorithm) will be co-authors on this publication.
For this reason, we request that the container submissions to NODE21 detection track should implement training functionality as well as testing. This should be implemented in the train function which receives the input (containing images and metadata.csv) and output directory as arguments. The input directory is expected to look like this:

Input_dir/
├── metadata.csv
├── Images
│   ├── 1.mha
│   ├── 2.mha
│   └── 3.mha

The algorithm should train a model by reading the images and associated label file (metadata.csv) from the input directory and it should save the model file to the output folder. The model file (model_retrained) should be saved to the output folder frequently since the containers will be executed in training mode with a pre-defined time-limit, and training could be stopped before the defined stopping condition is reached.
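Because training may be stopped at any moment, it helps to write each checkpoint in a way that never leaves a half-written file behind. The sketch below shows one possible approach (write to a temporary file, then rename); it is not taken from the baseline code. The helper names and the save_fn callback are hypothetical; with PyTorch, save_fn could be something like lambda p: torch.save(model.state_dict(), p).

```python
import os
import tempfile


def save_checkpoint(save_fn, output_dir, filename="model_retrained"):
    """Write a checkpoint so an interrupted run never leaves a partial file.

    save_fn: hypothetical callable that writes the model to the given path
    """
    os.makedirs(output_dir, exist_ok=True)
    final_path = os.path.join(output_dir, filename)
    fd, tmp_path = tempfile.mkstemp(dir=output_dir, suffix=".tmp")
    os.close(fd)
    try:
        save_fn(tmp_path)
        # Renaming within one directory replaces the old checkpoint in one step.
        os.replace(tmp_path, final_path)
    finally:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
    return final_path


def train(num_epochs, train_one_epoch, save_fn, output_dir):
    for epoch in range(num_epochs):
        train_one_epoch(epoch)
        save_checkpoint(save_fn, output_dir)  # overwrite after every epoch


def write_dummy(path):
    # Stand-in for a real model save.
    with open(path, "w") as f:
        f.write("weights")


with tempfile.TemporaryDirectory() as demo_dir:
    train(2, lambda epoch: None, write_dummy, demo_dir)
    print(os.listdir(demo_dir))  # ['model_retrained']
```

Saving once per epoch is only one choice; with long epochs, saving every fixed number of iterations may fit the time limit better.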

The algorithm should be able to run in four different phases, which differ in whether it trains or tests and in which model file it loads:

  1. no arguments given (test phase): Load the 'model' file, and test the model on a given image. This is the default mode.
  2. --train phase: Train the model from scratch given the folder with training images and metadata.csv. Save the model frequently as model_retrained.
  3. --retrain phase: Load the 'model' file, and retrain the model given the folder with training images and metadata.csv. Save the model frequently as model_retrained.
  4. --retest phase: Load the 'model_retrained' file, which was created during the training phase, and test it on a given image.

This may look complicated, but it is not, no worries! Once the training function is implemented, implementing these phases takes just a few lines of code (see the init function).
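The four phases can be sketched as a small dispatch on the command-line flags. This is an assumption-laden illustration, not the init function of the baseline: resolve_phase is a hypothetical helper, and treating the three flags as mutually exclusive is a choice made here for clarity.

```python
import argparse


def resolve_phase(argv=None):
    """Map the command-line flags to the model to load and the action to run."""
    parser = argparse.ArgumentParser()
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--train", action="store_true")
    group.add_argument("--retrain", action="store_true")
    group.add_argument("--retest", action="store_true")
    args = parser.parse_args(argv)

    if args.train:
        # Train from scratch; nothing is loaded.
        return {"load": None, "run": "train", "save": "model_retrained"}
    if args.retrain:
        # Continue training from the submitted model.
        return {"load": "model", "run": "train", "save": "model_retrained"}
    if args.retest:
        # Test with the model produced by a training phase.
        return {"load": "model_retrained", "run": "test", "save": None}
    # Default: test with the submitted model.
    return {"load": "model", "run": "test", "save": None}


print(resolve_phase(["--retrain"]))
```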

The algorithms submitted to NODE21 detection track will be run in default mode (test phase) by grand-challenge. All other phases will be used for further collaborative experiments for the peer-reviewed paper.

📌 NOTE: in case the selected solutions cannot be run in the training phase (or --retrain and --retest phases), the participants will be contacted one time only to fix their docker image. If the solution is not fixed on time or the participants are not responsive, we will have to exclude their algorithm and they will not be eligible for co-authorship in the overview paper.

💡 To test this algorithm locally without a docker container, you should set the execute_in_docker flag to False; this sets all paths to relative paths. You should set it back to True when you want to switch back to the docker container setting.
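The effect of such a flag can be sketched as follows. Only the flag name execute_in_docker and the container paths /input and /output come from this document; the helper resolve_paths and the local directories ./test and ./output are hypothetical placeholders.

```python
from pathlib import Path

execute_in_docker = True  # set to False to run outside the container


def resolve_paths(in_docker):
    """Return (input_dir, output_dir) for the container or for a local run."""
    if in_docker:
        # Absolute paths mounted by the container runtime.
        return Path("/input"), Path("/output")
    # Hypothetical relative layout for local runs.
    return Path("./test"), Path("./output")


input_dir, output_dir = resolve_paths(execute_in_docker)
print(input_dir, output_dir)
```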

Configure the Docker file

We recommend that you use our dockerfile as a template, and update it according to your algorithm requirements. There are three main components you need to define in your docker file in order to wrap your algorithm in a docker container:

  1. Choose the right base image (an official base image from the library you need, such as tensorflow or pytorch, is recommended):
FROM pytorch/pytorch:1.9.0-cuda11.1-cudnn8-runtime

📌 NOTE: You should use a base image that is compatible with CUDA 11.x since that is what will be used on the grand-challenge system.

  2. Copy all the files you need to run your model: model weights, requirements.txt, all the Python files you need, etc.
COPY --chown=algorithm:algorithm requirements.txt /opt/algorithm/
COPY --chown=algorithm:algorithm entrypoint.sh /opt/algorithm/
COPY --chown=algorithm:algorithm model /opt/algorithm/
COPY --chown=algorithm:algorithm resnet50-19c8e357.pth  /home/algorithm/.cache/torch/hub/checkpoints/resnet50-19c8e357.pth
COPY --chown=algorithm:algorithm training_utils /opt/algorithm/training_utils
  3. Install all the dependencies, defined in requirements.txt, in your dockerfile.
RUN python -m pip install --user -rrequirements.txt

Ensure that all of the dependencies with their versions are specified in requirements.txt:

evalutils==0.2.4
scikit-learn==0.20.2
scipy==1.2.1
--find-links https://download.pytorch.org/whl/torch_stable.html 
torchvision==0.10.0+cu111 
torchaudio==0.9.0
scikit-image==0.17.2

Build, test and export your container

  1. Switch to the correct algorithm folder at algorithms/noduledetection. To test if all dependencies are met, you can run the file build.bat (Windows) / build.sh (Linux) to build the docker container. Please note that the next step (testing the container) also runs a build, so this step is not necessary if you are certain that everything is set up correctly.

    build.sh/build.bat files will run the following command to build the docker for you:

    docker build -t noduledetector .
  2. To test the docker container to see if it works as expected, test.sh/test.bat will run the container on images provided in test/ folder, and it will check the results (nodules.json produced by your algorithm) against test/expected_output.json. Please update your test/expected_output.json according to your algorithm result when it is run on the test data.

    . ./test.sh

    If the test runs successfully you will see the message Tests successfully passed... at the end of the output.

    Once you have validated that the algorithm works as expected, you might want to simply run the algorithm on the test folder and check the nodules.json file for yourself. If you are on a native Linux system, you will need to create a results folder that the docker container can write to as follows (WSL users can skip this step). Note that $SCRIPTPATH was created in the previous test script.

    mkdir $SCRIPTPATH/results
    chmod 777 $SCRIPTPATH/results

    To write the output of the algorithm to the results folder, use the following command:

    docker run --rm --memory=11g -v $SCRIPTPATH/test:/input/ -v $SCRIPTPATH/results:/output/ noduledetector
  3. If you would like to run the algorithm in training mode (or any other mode), please make sure your training folder (which is mapped to /input) has 'metadata.csv' and an images/ folder as described above. If you are on a native Linux system, make sure that your output folder has 777 permissions as mentioned in the previous step. You can use the following command to start training (you may also need to add a flag such as --shm-size 8G to specify the shared memory that the container can use):

    docker run --rm --gpus all --memory=11g -v path_to_your_training_folder/:/input/ -v path_to_your_output_folder/:/output/ noduledetector --train
  4. Run export.sh/export.bat to save the docker image which runs the following command:

     docker save noduledetector | gzip -c > noduledetector.tar.gz
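When checking the nodules.json written in step 2 against test/expected_output.json by hand, a small comparison helper can be useful. The sketch below is not the check performed by test.sh: outputs_match and load_json are hypothetical helpers, and comparing with a numeric tolerance is an assumption made here so that tiny floating-point differences do not count as failures.

```python
import json
import math


def load_json(path):
    with open(path) as f:
        return json.load(f)


def outputs_match(actual, expected, tol=1e-4):
    """Return True if two nodules.json-style dictionaries agree within tol."""
    a_boxes, e_boxes = actual.get("boxes", []), expected.get("boxes", [])
    if len(a_boxes) != len(e_boxes):
        return False
    for a, e in zip(a_boxes, e_boxes):
        # Flatten corner coordinates and append the probability.
        a_vals = [v for corner in a["corners"] for v in corner] + [a["probability"]]
        e_vals = [v for corner in e["corners"] for v in corner] + [e["probability"]]
        if len(a_vals) != len(e_vals):
            return False
        if not all(math.isclose(x, y, abs_tol=tol) for x, y in zip(a_vals, e_vals)):
            return False
    return True


expected = {"boxes": [{"corners": [[92.6, 136.0, 0], [54.8, 95.5, 0]], "probability": 0.6}]}
close = {"boxes": [{"corners": [[92.6, 136.0, 0], [54.8, 95.5, 0]], "probability": 0.60001}]}
print(outputs_match(close, expected))  # True
```

For the files from the steps above, the call would be outputs_match(load_json("results/nodules.json"), load_json("test/expected_output.json")), with the paths adjusted to your setup.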

Submit your algorithm

Details of how to create an algorithm on grand-challenge and submit it to the node21 challenge will be added here soon.
Please make sure all steps described above work as expected before proceeding. Ensure also that you have an account on grand-challenge.org and that you are a verified user there.

Owner
node21challenge
Repositories associated with the grand challenge at https://node21.grand-challenge.org/