Anomaly Detection Preprocessing Module

Overview

Anomaly Detection

Anomaly detection for time-series data


1. Anomaly Detection Using Kernel Density Estimation

  • Takes as input train and test data in CSV format, each containing timestamp information, located at train_data_path and test_data_path
  • Fits a kernel density estimation model on the train data to estimate the distribution of normal data
  • Derives an anomaly score for each time point of the test data from the estimated distribution and saves it to save_root_path as a CSV file and a plot (a minimal sketch of this step follows the command below)
python kde.py --train_data_path='./data/nasa_bearing_train.csv' \
              --test_data_path='./data/nasa_bearing_test.csv' \
              --save_root_path='./result/kde'
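
The kde.py script itself is not reproduced in this README; the snippet below is only a minimal sketch of the fitting and scoring step it describes, assuming scikit-learn's KernelDensity and CSV files whose first column is the timestamp. The bandwidth value and the output file name are illustrative assumptions.

# Hypothetical sketch only; kde.py's actual implementation may differ.
import os
import pandas as pd
from sklearn.neighbors import KernelDensity

# Assumed CSV layout: first column holds the timestamp, remaining columns are features.
train = pd.read_csv('./data/nasa_bearing_train.csv', index_col=0)
test = pd.read_csv('./data/nasa_bearing_test.csv', index_col=0)

# Fit the density of the (assumed normal) training data.
kde = KernelDensity(kernel='gaussian', bandwidth=0.2).fit(train.values)

# A low log-likelihood under the fitted density marks an unusual point,
# so the negated log-likelihood serves as the anomaly score.
scores = pd.Series(-kde.score_samples(test.values), index=test.index, name='anomaly_score')

os.makedirs('./result/kde', exist_ok=True)
scores.to_csv('./result/kde/anomaly_score.csv')

Negating the log-likelihood means that test points falling in low-density regions of the training distribution receive higher anomaly scores.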



2. Anomaly Detection Using Local Outlier Factor

  • Takes as input train and test data in CSV format, each containing timestamp information, located at train_data_path and test_data_path
  • Fits a Local Outlier Factor model on the train data to estimate the density of normal data based on n_neighbors neighbors
  • Derives an anomaly score for each time point of the test data from the estimated density and saves it to save_root_path as a CSV file and a plot (a minimal sketch of this step follows the command below)
python lof.py --train_data_path='./data/nasa_bearing_train.csv' \
              --test_data_path='./data/nasa_bearing_test.csv' \
              --save_root_path='./result/lof' \
              --n_neighbors=5
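
lof.py is likewise not shown here; the following minimal sketch, assuming scikit-learn's LocalOutlierFactor, illustrates how a model fitted on the train data can score unseen test points. The novelty=True flag and the output file name are assumptions for illustration.

# Hypothetical sketch only; lof.py's actual implementation may differ.
import os
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor

train = pd.read_csv('./data/nasa_bearing_train.csv', index_col=0)
test = pd.read_csv('./data/nasa_bearing_test.csv', index_col=0)

# novelty=True keeps the training neighborhoods so unseen test points can be scored.
lof = LocalOutlierFactor(n_neighbors=5, novelty=True).fit(train.values)

# score_samples returns values that are higher for inliers; negate so higher = more anomalous.
scores = pd.Series(-lof.score_samples(test.values), index=test.index, name='anomaly_score')

os.makedirs('./result/lof', exist_ok=True)
scores.to_csv('./result/lof/anomaly_score.csv')

With novelty=True the estimator retains the density estimated from the n_neighbors training neighbors, so test points are scored against the normal data rather than against each other.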



3. Anomaly Detection Using Isolation Forest

  • Takes as input train and test data in CSV format, each containing timestamp information, located at train_data_path and test_data_path
  • Fits an isolation forest model on the train data
  • Uses the train data as the reference set to derive an anomaly score for each time point of the test data and saves it to save_root_path as a CSV file and a plot (a minimal sketch of this step follows the command below)
python iforest.py --train_data_path='./data/nasa_bearing_train.csv' \
                  --test_data_path='./data/nasa_bearing_test.csv' \
                  --save_root_path='./result/iforest'
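
As with the previous scripts, iforest.py is not reproduced here; the sketch below, assuming scikit-learn's IsolationForest, shows one plausible scoring step. The n_estimators and random_state values and the output file name are illustrative assumptions.

# Hypothetical sketch only; iforest.py's actual implementation may differ.
import os
import pandas as pd
from sklearn.ensemble import IsolationForest

train = pd.read_csv('./data/nasa_bearing_train.csv', index_col=0)
test = pd.read_csv('./data/nasa_bearing_test.csv', index_col=0)

# The forest is fitted on the train data, which acts as the reference set.
iforest = IsolationForest(n_estimators=100, random_state=0).fit(train.values)

# Points isolated with fewer splits get lower score_samples values; negate for an anomaly score.
scores = pd.Series(-iforest.score_samples(test.values), index=test.index, name='anomaly_score')

os.makedirs('./result/iforest', exist_ok=True)
scores.to_csv('./result/iforest/anomaly_score.csv')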



4. Anomaly Detection Using Spectral Residual

  • Detects anomalies within each window based on the configured window size and score window size (a minimal sketch of the transform follows the command below)
  • The score window size must be set larger than the window size
python spectral.py --window=24 \
                   --score_window=100
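
spectral.py is not shown here either; the function below is only a minimal sketch of the standard spectral residual transform (FFT, log-amplitude smoothing, saliency map, local scoring). Mapping --window to the amplitude-smoothing filter and --score_window to the local scoring window is an assumption about the script's parameters.

# Hypothetical sketch only; spectral.py's actual implementation may differ.
import numpy as np

def spectral_residual_saliency(values, window=24, score_window=100):
    """Return an anomaly score per time point via the spectral residual method."""
    values = np.asarray(values, dtype=float)
    freq = np.fft.fft(values)
    amplitude = np.abs(freq)
    log_amplitude = np.log(amplitude + 1e-8)

    # Spectral residual: log amplitude minus its moving average over `window` frequencies.
    avg_log_amplitude = np.convolve(log_amplitude, np.ones(window) / window, mode='same')
    residual = log_amplitude - avg_log_amplitude

    # Back-transform the residual, keeping the original phase, to obtain the saliency map.
    saliency = np.abs(np.fft.ifft(np.exp(residual) * freq / (amplitude + 1e-8)))

    # Score each point against the average saliency in its local `score_window` neighborhood.
    local_avg = np.convolve(saliency, np.ones(score_window) / score_window, mode='same')
    return (saliency - local_avg) / (local_avg + 1e-8)

# Illustrative call with the parameters from the command above:
# scores = spectral_residual_saliency(series, window=24, score_window=100)

Because each point is scored relative to the average saliency over score_window neighbors, a score window larger than the smoothing window gives a more stable local baseline, which is why the score window size must exceed the window size.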
Owner: CLUST-consortium (CLUST Project)