FEMDA: Robust classification with Flexible Discriminant Analysis in heterogeneous data

Last update: Sep 06, 2022

Overview

Flexible EM-Inspired Discriminant Analysis is a robust supervised classification algorithm that performs well in noisy and contaminated datasets.

Authors

Andrew Wang, University of Cambridge, Cambridge, UK

Pierre Houdouin, CentraleSupélec, Paris, France

Installation

pip install -i https://test.pypi.org/simple/ femda

Get started

>>> from sklearn.datasets import load_iris
>>> from femda import FEMDA
>>> X, y = load_iris(return_X_y=True)
>>> clf = FEMDA()
>>> clf.fit(X, y)
FEMDA()
>>> clf.score(X, y)
0.9666666666666667
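The score above is measured on the same data the classifier was trained on. As a rough sketch of a held-out estimate, the snippet below runs 5-fold stratified cross-validation on the same dataset. It assumes that `FEMDA` can be cloned like any other scikit-learn estimator (the text here only shows `fit`, `score`, `predict` and pipeline use), and it falls back to scikit-learn's `QuadraticDiscriminantAnalysis` as a stand-in when `femda` is not installed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import StratifiedKFold, cross_val_score

try:
    from femda import FEMDA as Classifier
except ImportError:
    # Stand-in (assumption of this sketch) so the example runs without femda.
    from sklearn.discriminant_analysis import (
        QuadraticDiscriminantAnalysis as Classifier,
    )

X, y = load_iris(return_X_y=True)

# Explicit stratified, shuffled folds: iris is sorted by class, so plain
# unshuffled folds could leave a class out of the training split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(Classifier(), X, y, cv=cv)

print(f"mean held-out accuracy: {scores.mean():.3f}")
```

The mean of `scores` is usually a more honest figure to report than the training accuracy.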

Using a specific dataset...

>>> import femda.experiments.preprocessing as pre
>>> X_train, y_train, X_test, y_test = pre.statlog(r"root\datasets\\")
>>> FEMDA().fit(X_train, y_train).score(X_test, y_test)
...

Using a sklearn.pipeline.Pipeline...

>>> from sklearn.datasets import load_digits
>>> from sklearn.pipeline import make_pipeline
>>> from sklearn.decomposition import PCA
>>> X, y = load_digits(return_X_y=True)
>>> pipe = make_pipeline(PCA(n_components=5), FEMDA()).fit(X, y)
>>> pipe.predict(X)
...

Run all experiments presented in the paper

>>> from femda.experiments import run_experiments
>>> run_experiments()
...

See for more.

Abstract

Linear and Quadratic Discriminant Analysis are well-known classical methods, but they degrade sharply under non-Gaussian class distributions and are not robust to contaminated data. In this paper, we present a new discriminant-analysis-style classification algorithm that directly models noise and diverse distribution shapes, and can therefore handle a wide range of datasets.

Each data point is modelled by its own arbitrary Elliptically Symmetrical (ES) distribution and its own arbitrary scale parameter, which directly models highly heterogeneous, non-i.i.d. datasets. We show that maximum-likelihood parameter estimation and classification are simple and fast under this model.
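A short derivation sketch suggests why classification stays simple under such a model. The notation is ours and is an assumption, not taken from the text above: a point $x_i \in \mathbb{R}^m$ in class $k$ has location $\mu_k$, scatter $\Sigma_k$, its own scale $\tau_i > 0$, its own density generator $g_i$ and normalising constant $C_i$, and $\sup_{t>0} t^{m/2} g_i(t)$ is finite.

```latex
\begin{aligned}
f_i(x_i \mid k) &= C_i\,|\Sigma_k|^{-1/2}\,\tau_i^{-m/2}\,
  g_i\!\left(\frac{d_k^2(x_i)}{\tau_i}\right),
  \qquad d_k^2(x) = (x-\mu_k)^\top \Sigma_k^{-1} (x-\mu_k), \\
\max_{\tau_i > 0} f_i(x_i \mid k) &= C_i\,|\Sigma_k|^{-1/2}\,
  \bigl(d_k^2(x_i)\bigr)^{-m/2}\,\sup_{t>0} t^{m/2} g_i(t)
  \qquad \bigl(t = d_k^2(x_i)/\tau_i\bigr), \\
\hat{k}(x_i) &= \arg\min_k \left[\frac{1}{m}\log|\Sigma_k| + \log d_k^2(x_i)\right].
\end{aligned}
```

Once the per-point scale is maximised out, the factor involving $g_i$ and $C_i$ is the same for every class, so under these assumptions the decision depends only on $\mu_k$ and $\Sigma_k$ and not on the shape of the distribution.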

We highlight the flexibility of the model to a wide range of Elliptically Symmetrical distribution shapes and varying levels of contamination in synthetic datasets. Then, we show that our algorithm outperforms other robust methods on contaminated datasets from Computer Vision and NLP.
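To make the model more concrete, here is a minimal NumPy sketch of a classifier in this spirit. It is not the code of the `femda` package, and the function names `tyler_scatter`, `fit` and `predict` are hypothetical. It assumes that each class is summarised by a location and a scatter matrix, that the scatter is estimated with a Tyler-type fixed point in which every sample is weighted by the inverse of its squared Mahalanobis distance, and that a point is assigned to the class minimising `log|Sigma| / m + log d^2`. Fixing the location to the coordinate-wise median is a simplification made for this sketch.

```python
import numpy as np


def tyler_scatter(X, mu, n_iter=50, eps=1e-12):
    """Tyler-type fixed-point scatter estimate around a fixed location mu."""
    n, m = X.shape
    diff = X - mu
    sigma = np.eye(m)
    for _ in range(n_iter):
        # Squared Mahalanobis distance of every sample under the current scatter.
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
        w = 1.0 / np.maximum(d2, eps)  # distant samples get small weights
        sigma = (m / n) * (diff * w[:, None]).T @ diff
        sigma *= m / np.trace(sigma)  # the scale is arbitrary; pin trace to m
    return sigma


def fit(X, y):
    """Return {label: (mu, sigma)}; the median location is a simplification."""
    params = {}
    for label in np.unique(y):
        Xk = X[y == label]
        mu = np.median(Xk, axis=0)
        params[label] = (mu, tyler_scatter(Xk, mu))
    return params


def predict(X, params, eps=1e-12):
    """Assign each row to the class minimising log|Sigma|/m + log d^2."""
    labels = np.array(list(params))
    m = X.shape[1]
    scores = []
    for mu, sigma in params.values():
        diff = X - mu
        d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(sigma), diff)
        _, logdet = np.linalg.slogdet(sigma)
        scores.append(logdet / m + np.log(np.maximum(d2, eps)))
    return labels[np.argmin(np.array(scores), axis=0)]


# Toy data: two classes whose samples each carry their own random scale.
rng = np.random.default_rng(0)
n = 200
tau = rng.uniform(0.5, 3.0, size=(2 * n, 1))
centres = np.repeat(np.array([[0.0, 0.0], [8.0, 8.0]]), n, axis=0)
X = centres + tau * rng.standard_normal((2 * n, 2))
y = np.repeat([0, 1], n)

accuracy = np.mean(predict(X, fit(X, y)) == y)
print(f"training accuracy: {accuracy:.3f}")
```

The score is unchanged when a class scatter matrix is multiplied by a constant, which is why the sketch can pin the trace of each estimate without affecting the predictions.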
