Compare MLOps Platforms. Breakdowns of SageMaker, VertexAI, AzureML, Dataiku, Databricks, h2o, kubeflow, mlflow...

Last update: Jan 02, 2023

Overview

MLOps Platforms

MLOps is an especially confusing landscape with hundreds of tools available. This project helps to navigate the space of MLOps platforms.

Understanding MLOps platforms is complex. Platforms have their own specializations, and there is no clear line between a tool (with a narrow focus) and a platform (which supports many ML lifecycle activities). The diagram from the Thoughtworks Guide to MLOps Platforms illustrates how some platforms specialize in particular areas (bottom) while others aim to cover the whole lifecycle with equal focus (top).

Even platforms that have a similar scope have different concepts and strategies, making them hard to compare directly. This repository provides resources for evaluating MLOps platforms.

If you're wondering what process to use to evaluate MLOps platforms, see the Thoughtworks Guide. If you know how to evaluate MLOps platforms and want materials, read on.

Comparison Matrix Format

The matrix is an open format: rows of categories, with links to vendor documentation inside the cells to highlight features. This lets vendors do things their own way and helps readers find the detail they need.

Comparison Matrix

We suggest clicking through to the master spreadsheet in Google Sheets.

If you can't access or don't like Google Sheets, there is a translation of the matrix into GitHub markdown.
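As a rough illustration of how a spreadsheet of categories and documentation links can map onto a GitHub markdown table, here is a minimal sketch. It is not the project's actual generation script: the function name `matrix_to_markdown`, the CSV export step, and the sample rows (with `example.com` placeholder links) are all assumptions made for the example.

```python
import csv
import io


def matrix_to_markdown(csv_text: str) -> str:
    """Render a CSV export of a comparison matrix as a GitHub markdown table.

    Hypothetical helper: the first CSV row is treated as the header
    (category column followed by one column per platform).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, *body = rows
    width = len(header)

    def fmt(row):
        # Pad or trim each row to the header width, and escape pipes
        # so cell text cannot break the table layout.
        cells = (row + [""] * width)[:width]
        return "| " + " | ".join(c.strip().replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join(["---"] * width) + " |"]
    lines.extend(fmt(r) for r in body)
    return "\n".join(lines)


# Made-up sample data: platform names and URLs are placeholders.
SAMPLE = """Category,Platform A,Platform B
Experiment tracking,[Docs](https://example.com/a/tracking),[Docs](https://example.com/b/tracking)
Model serving,[Docs](https://example.com/a/serving),
"""

if __name__ == "__main__":
    print(matrix_to_markdown(SAMPLE))
```

An empty cell, as in the last sample row, simply renders as a blank table cell, which matches the idea that not every platform has documentation for every category.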

Platform Profiles

These profiles are concise, marketing-free introductions to the key concepts of MLOps platforms. They provide just enough context to make sense of the features in the matrix.

AWS Sagemaker, Google Vertex and Microsoft Azure ML
Databricks, Dataiku
h2o and KNIME
kubeflow and mlflow

Contributions

Everyone is welcome to contribute, including vendors. Language should be neutral - marketing language will not be accepted.

Changes are welcome by pull request or issue. Please create a copy of the spreadsheet, link to or upload your copy, and explain which parts have changed. Please follow the existing format, or raise an issue in advance to suggest changes to the format. On approval, a maintainer will update the master spreadsheet used to generate the markdown.

Disclaimer

We do our best to keep this information accurate and up to date but cannot provide guarantees. References to documentation are provided throughout so readers can check for themselves. If you spot anything inaccurate, please raise an issue or pull request (see the Contributions section).

Owner

Thoughtworks
