Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning"

Overview

Mixed supervision for surface-defect detection: from weakly to fully supervised learning [Computers in Industry 2021]

Official PyTorch implementation for "Mixed supervision for surface-defect detection: from weakly to fully supervised learning" published in journal Computers in Industry 2021.

The same code is also an offical implementation of the method used in "End-to-end training of a two-stage neural network for defect detection" published in International Conference on Pattern Recognition 2020.

Citation

Please cite our Computers in Industry 2021 paper when using this code:

@article{Bozic2021COMIND,
  author = {Bo{\v{z}}i{\v{c}}, Jakob and Tabernik, Domen and 
  Sko{\v{c}}aj, Danijel},
  journal = {Computers in Industry},
  title = {{Mixed supervision for surface-defect detection: from weakly to fully supervised learning}},
  year = {2021}
}

How to run:

Requirements

Code has been tested to work on:

  • Python 3.8
  • PyTorch 1.6, 1.8
  • CUDA 10.0, 10.1
  • using additional packages as listed in requirements.txt

Datasets

You will need to download the datasets yourself. For DAGM and Severstal Steel Defect Dataset you will also need a Kaggle account.

  • DAGM available here.
  • KolektorSDD available here.
  • KolektorSDD2 available here.
  • Severstal Steel Defect Dataset available here.

For details about data structure refer to README.md in datasets folder.

Cross-validation splits, train/test splits and weakly/fully labeled splits for all datasets are located in splits directory of this repository, alongside the instructions on how to use them.

Using on other data

Refer to README.md in datasets for instructions on how to use the method on other datasets.

Demo - fully supervised learning

To run fully supervised learning and evaluation on all four datasets run:

./DEMO.sh
# or by specifying multiple GPU ids 
./DEMO.sh 0 1 2

Results will be written to ./results folder.

Replicating paper results

To replicate the results published in the paper run:

./EXPERIMENTS_COMIND.sh
# or by specifying multiple GPU ids 
./EXPERIMENTS_COMIND.sh 0 1 2

To replicate the results from ICPR 2020 paper:

@misc{Bozic2020ICPR,
    title={End-to-end training of a two-stage neural network for defect detection},
    author={Jakob Božič and Domen Tabernik and Danijel Skočaj},
    year={2020},
    eprint={2007.07676},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

run:

./EXPERIMENTS_ICPR.sh
# or by specifying multiple GPU ids 
./EXPERIMENTS_ICPR.sh 0 1 2

Results will be written to ./results-comind and ./results-icpr folders.

Usage of training/evaluation code

The following python files are used to train/evaluate the model:

  • train_net.py Main entry for training and evaluation
  • models.py Model file for network
  • data/dataset_catalog.py Contains currently supported datasets

In order to train and evaluate a network you can also use EXPERIMENTS_ROOT.sh, which contains several functions that will make training and evaluation easier for you. For more details see the file EXPERIMENTS_ROOT.sh.

Running code

Simplest way to train and evaluate a network is to use EXPERIMENTS_ROOT.sh, you can see examples of use in EXPERIMENTS_ICPR.sh and in EXPERIMENTS_COMIND.sh

If you wish to do it the other way you can do it by running train_net.py and passing the parameters as keyword arguments. Bellow is an example of how to train a model for a single fold of KSDD dataset.

python -u train_net.py  \
    --GPU=0 \
    --DATASET=KSDD \
    --RUN_NAME=RUN_NAME \
    --DATASET_PATH=/path/to/dataset \
    --RESULTS_PATH=/path/to/save/results \
    --SAVE_IMAGES=True \
    --DILATE=7 \
    --EPOCHS=50 \
    --LEARNING_RATE=1.0 \
    --DELTA_CLS_LOSS=0.01 \
    --BATCH_SIZE=1 \
    --WEIGHTED_SEG_LOSS=True \
    --WEIGHTED_SEG_LOSS_P=2 \
    --WEIGHTED_SEG_LOSS_MAX=1 \
    --DYN_BALANCED_LOSS=True \
    --GRADIENT_ADJUSTMENT=True \
    --FREQUENCY_SAMPLING=True \
    --TRAIN_NUM=33 \
    --NUM_SEGMENTED=33 \
    --FOLD=0

Some of the datasets do not require you to specify --TRAIN_NUM or --FOLD- After training, each model is also evaluated.

For KSDD you need to combine the results of evaluation from all three folds, you can do this by using join_folds_results.py:

python -u join_folds_results.py \
    --RUN_NAME=SAMPLE_RUN \
    --RESULTS_PATH=/path/to/save/results \
    --DATASET=KSDD 

You can use read_results.py to generate a table of results f0r all runs for selected dataset.
Note: The model is sensitive to random initialization and data shuffles during the training and will lead to different performance with different runs unless --REPRODUCIBLE_RUN is set.

Owner
ViCoS Lab
ViCoS Lab
This is a GUI for scrapping PDFs with the help of optical character recognition making easier than ever to scrape PDFs.

pdf-scraper-with-ocr With this tool I am aiming to facilitate the work of those who need to scrape PDFs either by hand or using tools that doesn't imp

Jacobo José Guijarro Villalba 75 Oct 21, 2022
fishington.io bot with OpenCV and NumPy

fishington.io-bot fishington.io bot with using OpenCV and NumPy bot can continue to fishing fully automatically how to use Open cmd in fishington.io-b

Bahadır Araz 77 Jan 02, 2023
PianoVisuals - Create background videos synced with piano music using opencv

Steps Record piano video Use Neural Network to do body segmentation (video matti

Solbiati Alessandro 4 Jan 24, 2022
WACV 2022 Paper - Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching

Is An Image Worth Five Sentences? A New Look into Semantics for Image-Text Matching Code based on our WACV 2022 Accepted Paper: https://arxiv.org/pdf/

Andres 13 Dec 17, 2022
Resizing Canny Countour In Python

Resizing_Canny_Countour Install Visual Studio Code , https://code.visualstudio.com/download Select Python and install with terminal( pip install openc

Walter Ng 1 Nov 07, 2021
Python tool that takes the OCR.space JSON output as input and draws a text overlay on top of the image.

OCR.space OCR Result Checker = Draw OCR overlay on top of image Python tool that takes the OCR.space JSON output as input, and draws an overlay on to

a9t9 4 Oct 18, 2022
huoyijie 1.2k Dec 29, 2022
Um RPG de texto orientado a objetos.

RPG de texto Um RPG de texto orientado a objetos, sem história. Um RPG (Role-playing game) baseado em texto em que você pode viajar para alguns locais

Vinicius 3 Oct 05, 2022
Drowsiness Detection and Alert System

A countless number of people drive on the highway day and night. Taxi drivers, bus drivers, truck drivers, and people traveling long-distance suffer from lack of sleep.

Astitva Veer Garg 4 Aug 01, 2022
A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

Attention-based OCR Visual attention-based OCR model for image recognition with additional tools for creating TFRecords datasets and exporting the tra

Ed Medvedev 933 Dec 29, 2022
MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition

MORAN: A Multi-Object Rectified Attention Network for Scene Text Recognition Python 2.7 Python 3.6 MORAN is a network with rectification mechanism for

Canjie Luo 595 Dec 27, 2022
This is the code for our paper DAAIN: Detection of Anomalous and AdversarialInput using Normalizing Flows

Merantix-Labs: DAAIN This is the code for our paper DAAIN: Detection of Anomalous and Adversarial Input using Normalizing Flows which can be found at

Merantix 14 Oct 12, 2022
A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports

A buffered and threaded wrapper for the OpenCV VideoCapture object. Can speed up video decoding significantly. Supports "with"-syntax.

Patrice Matz 0 Oct 30, 2021
Generates a message from the infamous Jerma Impostor image

Generate your very own jerma sus imposter message. Modes: Default Mode: Only supports the characters " ", !, a, b, c, d, e, h, i, m, n, o, p, q, r, s,

Giorno420 1 Oct 27, 2022
👄 The most accurate natural language detection library for Java and the JVM, suitable for long and short text alike

Quick Info this library tries to solve language detection of very short words and phrases, even shorter than tweets makes use of both statistical and

Peter M. Stahl 532 Dec 28, 2022
Deskew is a command line tool for deskewing scanned text documents. It uses Hough transform to detect "text lines" in the image. As an output, you get an image rotated so that the lines are horizontal.

Deskew by Marek Mauder https://galfar.vevb.net/deskew https://github.com/galfar/deskew v1.30 2019-06-07 Overview Deskew is a command line tool for des

Marek Mauder 127 Dec 03, 2022
Isearch (OSINT) 🔎 Face recognition reverse image search on Instagram profile feed photos.

isearch is an OSINT tool on Instagram. Offers a face recognition reverse image search on Instagram profile feed photos.

Malek salem 20 Oct 25, 2022
Program created with opencv that allows you to automatically count your repetitions on several fitness exercises.

Virtual partner of gym Description Program created with opencv that allows you to automatically count your repetitions on several fitness exercises li

1 Jan 04, 2022
Implementation of our paper 'PixelLink: Detecting Scene Text via Instance Segmentation' in AAAI2018

Code for the AAAI18 paper PixelLink: Detecting Scene Text via Instance Segmentation, by Dan Deng, Haifeng Liu, Xuelong Li, and Deng Cai. Contributions

758 Dec 22, 2022