Industrial knn-based anomaly detection for images. Visit streamlit link to check out the demo.

Overview

Industrial KNN-based Anomaly Detection

⭐ Now has streamlit support! ⭐ Run $ streamlit run streamlit_app.py

This repo aims to reproduce the results of the following KNN-based anomaly detection methods:

  1. SPADE (Cohen et al. 2021) - knn in z-space and distance to feature maps spade schematic
  2. PaDiM* (Defard et al. 2020) - distance to multivariate Gaussian of feature maps padim schematic
  3. PatchCore (Roth et al. 2021) - knn distance to avgpooled feature maps patchcore schematic

* actually does not have any knn mechanism, but shares many things implementation-wise.


Install

$ pipenv install -r requirements.txt

Note: I used torch cu11 wheels.

Usage

CLI:

$ python indad/run.py METHOD [--dataset DATASET]

Results can be found under ./results/.

Code example:

from indad.model import SPADE

model = SPADE(k=5, backbone_name="resnet18")

# feed healthy dataset
model.fit(...)

# get predictions
img_lvl_anom_score, pxl_lvl_anom_score = model.predict(...)

Custom datasets

πŸ‘οΈ

Check out one of the downloaded MVTec datasets. Naming of images should correspond among folders. Right now there is no support for no ground truth pixel masks.

πŸ“‚datasets
 β”— πŸ“‚your_custom_dataset
  ┣ πŸ“‚ ground_truth/defective
  ┃ ┣ πŸ“‚ defect_type_1
  ┃ β”— πŸ“‚ defect_type_2
  ┣ πŸ“‚ test
  ┃ ┣ πŸ“‚ defect_type_1
  ┃ ┣ πŸ“‚ defect_type_2
  ┃ β”— πŸ“‚ good
  β”— πŸ“‚ train/good
$ python indad/run.py METHOD --dataset your_custom_dataset

Results

πŸ“ = paper, πŸ‘‡ = this repo

Image-level

class SPADE πŸ“ SPADE πŸ‘‡ PaDiM πŸ“ PaDiM πŸ‘‡ PatchCore πŸ“ PatchCore πŸ‘‡
bottle - 98.3 98.3 99.9 100.0 100.0
cable - 88.1 96.7 87.8 99.5 96.2
capsule - 80.4 98.5 87.6 98.1 95.3
carpet - 62.5 99.1 99.5 98.7 98.7
grid - 25.6 97.3 95.5 98.2 93.0
hazelnut - 92.8 98.2 86.1 100.0 100.0
leather - 85.6 99.2 100.0 100.0 100.0
metal_nut - 78.6 97.2 97.6 100.0 98.3
pill - 78.8 95.7 92.7 96.6 92.8
screw - 66.1 98.5 79.6 98.1 96.7
tile - 96.4 94.1 99.5 98.7 99.0
toothbrush - 83.9 98.8 94.7 100.0 98.1
transistor - 89.4 97.5 95.0 100.0 99.7
wood - 85.3 94.7 99.4 99.2 98.8
zipper - 97.1 98.5 93.8 99.4 98.4
averages 85.5 80.6 97.5 93.9 99.1 97.7

Pixel-level

class SPADE πŸ“ SPADE πŸ‘‡ PaDiM πŸ“ PaDiM πŸ‘‡ PatchCore πŸ“ PatchCore πŸ‘‡
bottle 97.5 97.7 94.8 97.6 98.6 97.8
cable 93.7 94.4 88.8 95.5 98.5 97.4
capsule 97.6 98.7 93.5 98.1 98.9 98.3
carpet 87.4 99.0 96.2 98.7 99.1 98.3
grid 88.5 96.4 94.6 96.4 98.7 96.7
hazelnut 98.4 98.4 92.6 97.3 98.7 98.1
leather 97.2 99.1 97.8 98.6 99.3 98.4
metal_nut 99.0 96.1 85.6 95.8 98.4 96.2
pill 99.1 93.5 92.7 94.4 97.6 98.7
screw 98.1 98.9 94.4 97.5 99.4 98.4
tile 96.5 93.1 86.0 92.6 95.9 94.0
toothbrush 98.9 98.9 93.1 98.5 98.7 98.1
transistor 97.9 95.8 84.5 96.9 96.4 97.5
wood 94.1 94.5 91.1 92.9 95.1 91.9
zipper 96.5 98.3 95.9 97.0 98.9 97.6
averages 96.9 96.6 92.1 96.5 98.1 97.2

PatchCore-10 was used.

Hyperparams

The following parameters were used to calculate the results. They more or less correspond to the parameters used in the papers.

spade:
  backbone: wide_resnet50_2
  k: 50
padim:
  backbone: wide_resnet50_2
  d_reduced: 250
  epsilon: 0.04
patchcore:
  backbone: wide_resnet50_2
  f_coreset: 0.1
  n_reweight: 3

Progress

  • Datasets
  • Code skeleton
  • Config files
  • CLI
  • Logging
  • SPADE
  • PADIM
  • PatchCore
  • Add custom dataset option
  • Add dataset progress bar
  • Add schematics
  • Unit tests

Design considerations

  • Data is processed in single images to avoid batch statistics interference.
  • I decided to implement greedy kcenter from scratch and there is room for improvement.
  • torch.nn.AdaptiveAvgPool2d for feature map resizing, torch.nn.functional.interpolate for score map resizing.
  • GPU is used for backbones and coreset selection. GPU coreset selection currently runs at:
    • 400-500 it/s @ float32 (RTX3080)
    • 1000+ it/s @ float16 (RTX3080)

Acknowledgements

  • hcw-00 for tipping sklearn.random_projection.SparseRandomProjection

References

SPADE:

@misc{cohen2021subimage,
      title={Sub-Image Anomaly Detection with Deep Pyramid Correspondences}, 
      author={Niv Cohen and Yedid Hoshen},
      year={2021},
      eprint={2005.02357},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PaDiM:

@misc{defard2020padim,
      title={PaDiM: a Patch Distribution Modeling Framework for Anomaly Detection and Localization}, 
      author={Thomas Defard and Aleksandr Setkov and Angelique Loesch and Romaric Audigier},
      year={2020},
      eprint={2011.08785},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

PatchCore:

@misc{roth2021total,
      title={Towards Total Recall in Industrial Anomaly Detection}, 
      author={Karsten Roth and Latha Pemula and Joaquin Zepeda and Bernhard SchΓΆlkopf and Thomas Brox and Peter Gehler},
      year={2021},
      eprint={2106.08265},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
Owner
aventau
Into graphics and modelling. Computer Vision / Machine Learning Engineer.
aventau
Official Pytorch implementation for "End2End Occluded Face Recognition by Masking Corrupted Features, TPAMI 2021"

End2End Occluded Face Recognition by Masking Corrupted Features This is the Pytorch implementation of our TPAMI 2021 paper End2End Occluded Face Recog

Haibo Qiu 25 Oct 31, 2022
Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices

Face-Mesh Face Mesh is a face geometry solution that estimates 468 3D face landmarks in real-time even on mobile devices. It employs machine learning

Farnam Javadi 9 Dec 21, 2022
Code for the paper "Benchmarking and Analyzing Point Cloud Classification under Corruptions"

ModelNet-C Code for the paper "Benchmarking and Analyzing Point Cloud Classification under Corruptions". For the latest updates, see: sites.google.com

Jiawei Ren 45 Dec 28, 2022
A PyTorch Implementation of Gated Graph Sequence Neural Networks (GGNN)

A PyTorch Implementation of GGNN This is a PyTorch implementation of the Gated Graph Sequence Neural Networks (GGNN) as described in the paper Gated G

Ching-Yao Chuang 427 Dec 13, 2022
a simple, efficient, and intuitive text editor

Oxygen beta a simple, efficient, and intuitive text editor Overview oxygen is a simple, efficient, and intuitive text editor designed as more featured

Aarush Gupta 1 Feb 23, 2022
πŸ’‘ Learnergy is a Python library for energy-based machine learning models.

Learnergy: Energy-based Machine Learners Welcome to Learnergy. Did you ever reach a bottleneck in your computational experiments? Are you tired of imp

Gustavo Rosa 57 Nov 17, 2022
Scientific Computation Methods in C and Python (Open for Hacktoberfest 2021)

Sci - cpy README is a stub. Do expand it. Objective This repository is meant to be a ready reference for scientific computation methods. Do ⭐ it if yo

Sandip Dutta 7 Oct 12, 2022
Hypersearch weight debugging and losses tutorial

tutorial Activate tensorboard option Running TensorBoard remotely When working on a remote server, you can use SSH tunneling to forward the port of th

1 Dec 11, 2021
Code for paper "Learning to Reweight Examples for Robust Deep Learning"

learning-to-reweight-examples Code for paper Learning to Reweight Examples for Robust Deep Learning. [arxiv] Environment We tested the code on tensorf

Uber Research 261 Jan 01, 2023
Multiband spectro-radiometric satellite image analysis with K-means cluster algorithm

Multi-band Spectro Radiomertric Image Analysis with K-means Cluster Algorithm Overview Multi-band Spectro Radiomertric images are images comprising of

Chibueze Henry 6 Mar 16, 2022
Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

RNN-for-Joint-NLU Pytorch implementation of "Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling"

Kim SungDong 194 Dec 28, 2022
The official implementation of NeurIPS 2021 paper: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

The official implementation of NeurIPS 2021 paper: Finding Optimal Tangent Points for Reducing Distortions of Hard-label Attacks

machen 11 Nov 27, 2022
This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

This is the codebase for Diffusion Models Beat GANS on Image Synthesis.

OpenAI 3k Dec 26, 2022
Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution

Deep Learning: Architectures & Methods Project: Deep Learning for Audio Super-Resolution Figure: Example visualization of the method and baseline as a

Oliver Hahn 16 Dec 23, 2022
Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning And private Server services

Tensorflow solution of NER task Using BiLSTM-CRF model with Google BERT Fine-tuning

MaCan 4.2k Dec 29, 2022
Synthesizing Long-Term 3D Human Motion and Interaction in 3D in CVPR2021

Long-term-Motion-in-3D-Scenes This is an implementation of the CVPR'21 paper "Synthesizing Long-Term 3D Human Motion and Interaction in 3D". Please ch

Jiashun Wang 76 Dec 13, 2022
This application is the basic of automated online-class-joiner(for YΔ±ldΔ±zEdu) within the right time. Gets the ZOOM link by scheduled date and time.

This application is the basic of automated online-class-joiner(for YΔ±ldΔ±zEdu) within the right time. Gets the ZOOM link by scheduled date and time.

215355 1 Dec 16, 2021
MiniSom is a minimalistic implementation of the Self Organizing Maps

MiniSom Self Organizing Maps MiniSom is a minimalistic and Numpy based implementation of the Self Organizing Maps (SOM). SOM is a type of Artificial N

Giuseppe Vettigli 1.2k Jan 03, 2023
This repository is the offical Pytorch implementation of ContextPose: Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021).

Context Modeling in 3D Human Pose Estimation: A Unified Perspective (CVPR 2021) Introduction This repository is the offical Pytorch implementation of

37 Nov 21, 2022
A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network The official code of VisionLAN (ICCV2021). VisionLAN successfully a

81 Dec 12, 2022