Relevance Vector Machine implementation using the scikit-learn API.

Last update: Nov 18, 2022

Overview

scikit-rvm

scikit-rvm is a Python module implementing the Relevance Vector Machine (RVM) machine learning technique using the scikit-learn API.

Quickstart

With NumPy, SciPy and scikit-learn available in your environment, install with:

pip install https://github.com/JamesRitchie/scikit-rvm/archive/master.zip

Regression is done with the RVR class:

>>> from skrvm import RVR
>>> X = [[0, 0], [2, 2]]
>>> y = [0.5, 2.5]
>>> clf = RVR(kernel='linear')
>>> clf.fit(X, y)
RVR(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='linear', n_iter=3000,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.predict([[1, 1]])
array([ 1.49995187])

Classification is done with the RVC class:

>>> from skrvm import RVC
>>> from sklearn.datasets import load_iris
>>> iris = load_iris()
>>> clf = RVC()
>>> clf.fit(iris.data, iris.target)
RVC(alpha=1e-06, beta=1e-06, beta_fixed=False, bias_used=True, coef0=0.0,
coef1=None, degree=3, kernel='rbf', n_iter=3000, n_iter_posterior=50,
threshold_alpha=1000000000.0, tol=0.001, verbose=False)
>>> clf.score(iris.data, iris.target)
0.97999999999999998

Theory

The RVM is a sparse Bayesian analogue to the Support Vector Machine, with a number of advantages:

It provides probabilistic estimates, as opposed to the SVM's point estimates.
It typically provides a sparser solution than the SVM, whose number of support vectors tends to grow linearly with the size of the training set.
It does not need a complexity parameter to be selected in order to avoid overfitting.

However, it is more expensive to train than the SVM, although prediction is typically faster because fewer basis functions are retained, and no cross-validation runs are required to select a complexity parameter.

The RVM's original creator Mike Tipping provides a selection of papers offering detailed insight into the formulation of the RVM (and sparse Bayesian learning in general) on a dedicated page, along with a Matlab implementation.

Most of this implementation was written working from Section 7.2 of Christopher M. Bishop's Pattern Recognition and Machine Learning.
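
To make the ideas above concrete, here is a minimal NumPy sketch of RVM regression using the iterative re-estimation scheme described in that section. It is not the code used by scikit-rvm: the names rbf_kernel, fit_rvr and predict_rvr are hypothetical, the bias term is omitted, and the starting values, kernel width and pruning threshold are assumptions chosen for illustration. It shows the two properties listed above: basis functions are pruned as their prior precision grows (sparsity), and predictions come with a variance (probabilistic estimates).

```python
import numpy as np


def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and the rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


def fit_rvr(X, t, gamma=0.5, n_iter=500, alpha_max=1e6, tol=1e-3):
    """Sketch of RVM regression; returns a dict describing the sparse model."""
    X = np.asarray(X, dtype=float)
    t = np.asarray(t, dtype=float)
    n = len(t)
    Phi = rbf_kernel(X, X, gamma)   # one basis function per training point, no bias
    active = np.arange(n)           # indices of basis functions still in the model
    alpha = np.ones(n)              # per-weight prior precisions (assumed start value)
    beta = 1.0 / np.var(t)          # noise precision (assumed start value)

    for _ in range(n_iter):
        # Gaussian posterior over the weights: covariance Sigma and mean m.
        Sigma = np.linalg.inv(np.diag(alpha) + beta * Phi.T @ Phi)
        m = beta * Sigma @ Phi.T @ t

        # g[i] in (0, 1] measures how well the data determine weight i.
        g = 1.0 - alpha * np.diag(Sigma)
        with np.errstate(divide="ignore", invalid="ignore"):
            # Non-positive g (a numerical artefact) is treated as "prune".
            alpha_new = np.where(g > 0, g / m ** 2, np.inf)
        resid = t - Phi @ m
        beta = (n - np.clip(g, 0.0, 1.0).sum()) / (resid @ resid)

        # Drop basis functions whose precision has grown past the threshold.
        keep = alpha_new < alpha_max
        if not keep.any():
            break
        converged = np.max(np.abs(alpha_new[keep] - alpha[keep])) < tol
        alpha, active, Phi = alpha_new[keep], active[keep], Phi[:, keep]
        if converged:
            break

    # Final posterior for the surviving basis functions (the relevance vectors).
    Sigma = np.linalg.inv(np.diag(alpha) + beta * Phi.T @ Phi)
    m = beta * Sigma @ Phi.T @ t
    return {"relevance_vectors": X[active], "m": m, "Sigma": Sigma,
            "beta": beta, "gamma": gamma}


def predict_rvr(model, X_new):
    """Predictive mean and variance at the rows of X_new."""
    K = rbf_kernel(np.asarray(X_new, dtype=float),
                   model["relevance_vectors"], model["gamma"])
    mean = K @ model["m"]
    # Noise variance plus the uncertainty contributed by the weights.
    var = 1.0 / model["beta"] + np.sum((K @ model["Sigma"]) * K, axis=1)
    return mean, var


# Toy data: a noisy sine curve.
rng = np.random.default_rng(0)
X = np.linspace(-3.0, 3.0, 40)[:, None]
t = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(40)

model = fit_rvr(X, t)
mean, var = predict_rvr(model, X)
print(len(model["relevance_vectors"]), "relevance vectors out of", len(t))
```

The sketch inverts the full posterior precision matrix on every iteration, which is simple but costs cubic time in the number of active basis functions. The threshold alpha_max here is deliberately lower than the threshold_alpha default shown in the Quickstart output, to keep the matrix inversion well conditioned in this small example.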

Contributors

Future Improvements

Implement the fast Sequential Sparse Bayesian Learning Algorithm outlined in Section 7.2.3 of Pattern Recognition and Machine Learning.
Handle ill-conditioning errors more gracefully.
Implement more kernel choices.
Create more detailed examples with IPython notebooks.

Owner

James Ritchie
