TensorFlow implementation of an arbitrary order Factorization Machine

Last update: Dec 21, 2022

Overview

This is a TensorFlow implementation of an arbitrary order (>=2) Factorization Machine, based on the paper "Factorization Machines with libFM".

It supports:

dense and sparse inputs
different (gradient-based) optimization methods
classification/regression via different loss functions (logistic and MSE are implemented)
logging via TensorBoard
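
As a rough illustration of the two losses named in the list above, here is a minimal sketch in plain Python. It is not taken from tffm: the function names are hypothetical, and the {-1, +1} label convention for the logistic loss is an assumption made for this example only.

```python
import math


def logistic_loss(score, label):
    """Logistic loss log(1 + exp(-label * score)).

    Assumption for this sketch: label is -1 or +1.
    """
    margin = label * score
    # Numerically stable evaluation: never call exp() on a large positive value.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def mse_loss(score, target):
    """Squared error for a single regression example."""
    return (score - target) ** 2
```

The logistic loss would be used for classification and the squared error for regression; averaging either over a batch gives the training objective.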

The inference time is linear with respect to the number of features.
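
To see why linear-time inference is possible, consider the second-order case only. The pairwise interaction term can be rewritten with the well-known identity sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]. The sketch below is plain Python and independent of tffm; the function names are hypothetical, and it does not cover the higher-order terms this library supports.

```python
def fm_pairwise_naive(x, V):
    """O(n^2 * k): sum over i < j of <V[i], V[j]> * x[i] * x[j]."""
    n = len(x)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


def fm_pairwise_linear(x, V):
    """O(n * k): the same quantity, computed one factor dimension at a time."""
    rank = len(V[0])
    total = 0.0
    for f in range(rank):
        s = 0.0      # running sum of V[i][f] * x[i]
        s_sq = 0.0   # running sum of (V[i][f] * x[i]) ** 2
        for xi, vi in zip(x, V):
            t = vi[f] * xi
            s += t
            s_sq += t * t
        total += s * s - s_sq
    return 0.5 * total


# Toy data: 4 features, rank 2 (values chosen arbitrarily for illustration).
x = [1.0, 2.0, 0.0, 3.0]
V = [[0.1, 0.2], [0.3, -0.1], [0.5, 0.5], [-0.2, 0.4]]
assert abs(fm_pairwise_naive(x, V) - fm_pairwise_linear(x, V)) < 1e-9
```

Both functions return the same value; the second one touches each feature once per factor dimension, so its cost grows linearly with the number of features.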

Tested on Python 3.5, but should work on Python 2.7.

This implementation is quite similar to the one described in the paper by Blondel et al. [https://arxiv.org/abs/1607.07195], but was developed independently and prior to the first appearance of the paper.

Dependencies

Installation

The stable version can be installed via pip install tffm.

Usage

The interface is similar to scikit-learn models. To train a 6th-order FM model with rank=10 for 100 epochs with learning_rate=0.01, use the following sample:

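import tensorflow as tf  # tf.train.AdamOptimizer is a TensorFlow 1.x API
from tffm import TFFMClassifier

# X_tr, y_tr: training features and labels, prepared by the caller
model = TFFMClassifier(
    order=6,
    rank=10,
    optimizer=tf.train.AdamOptimizer(learning_rate=0.01),
    n_epochs=100,
    batch_size=-1,
    init_std=0.001,
    input_type='dense'
)
model.fit(X_tr, y_tr, show_progress=True)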

See example.ipynb and gpu_benchmark.ipynb for more details.

It's highly recommended to read tffm/core.py for help.

Testing

Just run python test.py in the terminal. nosetests works too, but you must pass the --logging-level=WARNING flag to avoid printing insane amounts of TensorFlow logs to the screen.

Citation

If you use this software in academic research, please cite it using the following BibTeX:

@misc{trofimov2016,
author = {Mikhail Trofimov and Alexander Novikov},
title = {tffm: TensorFlow implementation of an arbitrary order Factorization Machine},
year = {2016},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/geffy/tffm}},
}
