onelearn: Online learning in Python

Last update: Nov 06, 2022

Overview


onelearn stands for ONE-shot LEARNing. It is a small Python package for online learning. It provides:

online (or one-shot) learning algorithms: each sample is processed once, so only a single pass is performed on the data
multi-class classification and regression algorithms
for now, only ensemble methods, namely Random Forests
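
To make the single-pass idea concrete, here is a minimal sketch in plain Python. The class RunningMeanRegressor and its methods are hypothetical and are not part of onelearn. It does not implement Random Forests or AMF; it only illustrates how a model can be updated one sample at a time without storing or revisiting past data.

```python
class RunningMeanRegressor:
    """Toy online regressor: predicts the mean of all labels seen so far."""

    def __init__(self):
        self.n_samples = 0
        self.mean = 0.0

    def partial_fit(self, y):
        # Incremental mean update: no past sample is stored or revisited
        self.n_samples += 1
        self.mean += (y - self.mean) / self.n_samples
        return self

    def predict(self):
        return self.mean


model = RunningMeanRegressor()
for label in [2.0, 4.0, 6.0]:
    model.partial_fit(label)  # each sample is processed exactly once
```

After the loop, model.predict() returns the mean of the three labels, computed in a single pass.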

Installation

The easiest way to install onelearn is with pip:

pip install onelearn

You can also install the latest development version directly from GitHub with:

pip install git+https://github.com/onelearn/onelearn.git

References

@article{mourtada2019amf,
  title={AMF: Aggregated Mondrian Forests for Online Learning},
  author={Mourtada, Jaouad and Ga{\"\i}ffas, St{\'e}phane and Scornet, Erwan},
  journal={arXiv preprint arXiv:1906.10529},
  year={2019}
}

Comments

Unable to pickle AMFClassifier.
I would like to save the AMFClassifier, but am unable to pickle it. I have also tried to use dill or joblib, but they also don't seem to work.

Is there another way to export the AMFClassifier, so that I can save it and load it in another kernel?

Below is a snippet of code that reproduces the error. Note that the error only occurs when pickling after partial_fit has been called. When the AMFClassifier has not been fit yet, pickling works without problems, but exporting an empty model is pretty useless.

Any help or tips are much appreciated.

from onelearn import AMFClassifier
import dill as pickle
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

amf = AMFClassifier(n_classes=3)

# Pickling works while the model has not been fit yet
dump = pickle.dumps(amf)
amf = pickle.loads(dump)

amf.partial_fit(X, y)

# Pickling raises an error once partial_fit has been called
dump = pickle.dumps(amf)
amf = pickle.loads(dump)
opened by w-feijen 1
Move the experiments of the paper into an experiments folder
Update the documentation

Explain that we must clone the repo

Also move the short experiments to an examples folder and build a sphinx gallery with it
enhancement
opened by stephanegaiffas 1
Add some extra tests
Test that batch versus online training leads to the exact same forest

Test the behavior of reserve_samples, with several calls to partial_fit to check that memory is correctly allocated and

tests
opened by stephanegaiffas 1
What if predict_proba receives a single sample

get_amf_decision_online
    amf.partial_fit(X_train[iteration - 1], y_train[iteration - 1])
  File "/Users/stephanegaiffas/Code/onelearn/onelearn/forest.py", line 259, in partial_fit
    n_samples, n_features = X.shape

opened by stephanegaiffas 1
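
The traceback in the issue above stops at a line that unpacks X.shape into two values, which cannot work when X is a single one-dimensional sample. The following caller-side sketch uses only NumPy and made-up data. It shows why the unpacking fails and how reshaping the sample to one row avoids it. Whether onelearn should handle this case internally is the open question of the issue and is not settled here.

```python
import numpy as np

X_train = np.arange(12, dtype=float).reshape(3, 4)  # 3 samples, 4 features

single = X_train[0]  # shape (4,): a one-dimensional array
try:
    n_samples, n_features = single.shape
    unpack_failed = False
except ValueError:
    # A 1D shape holds one value, so unpacking into two names fails
    unpack_failed = True

# Reshape to a single-row 2D array before passing it to the model
single_2d = single.reshape(1, -1)
n_samples, n_features = single_2d.shape
```

With the reshaped array, n_samples is 1 and n_features is 4, which matches what a two-value unpacking of X.shape expects.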
Improve coverage

A problem is that @jit functions don't work with coverage. A workaround is to disable JIT compilation using the NUMBA_DISABLE_JIT environment variable, but this breaks the code that uses @jitclass and the .class_type.instance_type attributes.
enhancement bug fix

opened by stephanegaiffas 1
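
The workaround from the coverage issue above can be sketched as follows. The helper names coverage_env and run_tests_with_coverage are hypothetical, and the pytest command line assumes that pytest and pytest-cov are installed. The variable is set for a fresh subprocess so that it is in place before Numba is imported. As the issue notes, this may still break code relying on @jitclass.

```python
import os
import subprocess
import sys


def coverage_env(base_env=None):
    """Return a copy of the environment with Numba's JIT disabled."""
    env = dict(os.environ if base_env is None else base_env)
    env["NUMBA_DISABLE_JIT"] = "1"
    return env


def run_tests_with_coverage():
    # Hypothetical command line; assumes pytest and pytest-cov are installed
    cmd = [sys.executable, "-m", "pytest", "--cov=onelearn"]
    return subprocess.run(cmd, env=coverage_env(), check=False)
```

Calling run_tests_with_coverage() would launch the test suite in a child process with JIT compilation turned off, leaving the parent environment unchanged.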

Releases (v0.3)

v0.3 (Sep 29, 2021)
This release adds the following improvement:

AMFClassifier and AMFRegressor can be serialized to files (using pickle internally) with the save and load methods
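
The release note does not show the signatures of save and load, so the sketch below does not use onelearn at all. It is a generic illustration, with a hypothetical TinyModel class, of one common way to build save and load on top of pickle when an object holds a member that cannot be pickled directly: write only plain data to the file and rebuild the rest on loading. It is not a description of how onelearn implements these methods.

```python
import os
import pickle
import tempfile
import threading


class TinyModel:
    """Hypothetical model holding one member that pickle cannot handle."""

    def __init__(self, weights):
        self.weights = list(weights)
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def save(self, path):
        # Export only plain Python data, leaving the lock out
        with open(path, "wb") as f:
            pickle.dump({"weights": self.weights}, f)

    @classmethod
    def load(cls, path):
        # Rebuild the object, including a fresh lock, from the plain data
        with open(path, "rb") as f:
            state = pickle.load(f)
        return cls(state["weights"])


model = TinyModel([0.1, 0.2])
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
model.save(path)
restored = TinyModel.load(path)
```

Pickling the TinyModel instance directly fails because of the lock, while the save and load round trip restores the weights.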

v0.2.0 (Apr 6, 2020)
This release adds the following improvements:

SampleCollection pre-allocates more samples instead of the bare minimum for faster computation

The playground can be launched from the library

Documentation on Read the Docs

Faster computations and a lot of code cleaning

Unit tests for Python 3.6-3.8


GitHub Repository https://onelearn.readthedocs.io

