This is the code repository for Interpretable Machine Learning with Python, published by Packt.

Overview

Interpretable Machine Learning with Python

Interpretable Machine Learning with Pythone

This is the code repository for Interpretable Machine Learning with Python, published by Packt.

Learn to build interpretable high-performance models with hands-on real-world examples

What is this book about?

Do you want to understand your models and mitigate the risks associated with poor predictions using practical machine learning (ML) interpretation? Interpretable Machine Learning with Python can help you overcome these challenges, using interpretation methods to build fairer and safer ML models.

This book covers the following exciting features:

  • Recognize the importance of interpretability in business
  • Study models that are intrinsically interpretable such as linear models, decision trees, and Naïve Bayes
  • Become well-versed in interpreting models with model-agnostic methods
  • Visualize how an image classifier works and what it learns
  • Understand how to mitigate the influence of bias in datasets

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders. For example, Chapter02.

The code will look like the following:

base_classifier = KerasClassifier(model=base_model,\
                                  clip_values=(min_, max_))
y_test_mdsample_prob = np.max(y_test_prob[sampl_md_idxs],\
                                                       axis=1)
y_test_smsample_prob = np.max(y_test_prob[sampl_sm_idxs],\
                                                       axis=1)

Following is what you need for this book: This book is for data scientists, machine learning developers, and data stewards who have an increasingly critical responsibility to explain how the AI systems they develop work, their impact on decision making, and how they identify and manage bias. Working knowledge of machine learning and the Python programming language is expected.

With the following software and hardware list you can run all code files present in the book (Chapter 1-14).

Software and Hardware List

You can install the software required in any operating system by first installing Jupyter Notebook or Jupyter Lab with the most recent version of Python, or install Anaconda which can install everything at once. While hardware requirements for Jupyter are relatively modest, we recommend a machine with at least 4 cores of 2Ghz and 8Gb of RAM.

Alternatively, to installing the software locally, you can run the code in the cloud using Google Colab or another cloud notebook service.

Either way, the following packages are required to run the code in all the chapters (Google Colab has all the packages denoted with a ^):

Chapter Software required OS required
1 - 13 ^ Python 3.6+ Windows, Mac OS X, and Linux (Any)
1 - 13 ^ matplotlib 3.2.2+ Windows, Mac OS X, and Linux (Any)
1 - 13 ^ scikit-learn 0.22.2+ Windows, Mac OS X, and Linux (Any)
1 - 12 ^ pandas 1.1.5+ Windows, Mac OS X, and Linux (Any)
2 - 13 machine-learning-datasets 0.01.16+ Windows, Mac OS X, and Linux (Any)
2 - 13 ^ numpy 1.19.5+ Windows, Mac OS X, and Linux (Any)
3 - 13 ^ seaborn 0.11.1+ Windows, Mac OS X, and Linux (Any)
3 - 13 ^ tensorflow 2.4.1+ Windows, Mac OS X, and Linux (Any)
5 - 12 shap 0.38.1+ Windows, Mac OS X, and Linux (Any)
1, 5, 10, 12 ^ scipy 1.4.1+ Windows, Mac OS X, and Linux (Any)
5, 10-12 ^ xgboost 0.90+ Windows, Mac OS X, and Linux (Any)
6, 11, 12 ^ lightgbm 2.2.3+ Windows, Mac OS X, and Linux (Any)
7 - 9 alibi 0.5.5+ Windows, Mac OS X, and Linux (Any)
10 - 13 ^ tqdm 4.41.1+ Windows, Mac OS X, and Linux (Any)
2, 9 ^ statsmodels 0.10.2+ Windows, Mac OS X, and Linux (Any)
3, 5 rulefit 0.3.1+ Windows, Mac OS X, and Linux (Any)
6, 8 lime 0.2.0.1+ Windows, Mac OS X, and Linux (Any)
7, 12 catboost 0.24.4+ Windows, Mac OS X, and Linux (Any)
8, 9 ^ Keras 2.4.3+ Windows, Mac OS X, and Linux (Any)
11, 12 ^ pydot 1.3.0+ Windows, Mac OS X, and Linux (Any)
11, 12 xai 0.0.4+ Windows, Mac OS X, and Linux (Any)
1 ^ beautifulsoup4 4.6.3+ Windows, Mac OS X, and Linux (Any)
1 ^ requests 2.23.0+ Windows, Mac OS X, and Linux (Any)
3 cvae 0.0.3+ Windows, Mac OS X, and Linux (Any)
3 interpret 0.2.2+ Windows, Mac OS X, and Linux (Any)
3 ^ six 1.15.0+ Windows, Mac OS X, and Linux (Any)
3 skope-rules 1.0.1+ Windows, Mac OS X, and Linux (Any)
4 PDPbox 0.2.0+ Windows, Mac OS X, and Linux (Any)
4 pycebox 0.0.1+ Windows, Mac OS X, and Linux (Any)
5 alepython 0.1+ Windows, Mac OS X, and Linux (Any)
5 tensorflow-docs 0.0.02+ Windows, Mac OS X, and Linux (Any)
6 ^ nltk 3.2.5+ Windows, Mac OS X, and Linux (Any)
7 witwidget 1.7.0+ Windows, Mac OS X, and Linux (Any)
8 ^ opencv-python 4.1.2.30+ Windows, Mac OS X, and Linux (Any)
8 ^ scikit-image 0.16.2+ Windows, Mac OS X, and Linux (Any)
8 tf-explain 0.2.1+ Windows, Mac OS X, and Linux (Any)
8 tf-keras-vis 0.5.5+ Windows, Mac OS X, and Linux (Any)
9 SALib 1.3.12+ Windows, Mac OS X, and Linux (Any)
9 distython 0.0.3+ Windows, Mac OS X, and Linux (Any)
10 ^ mlxtend 0.14.0+ Windows, Mac OS X, and Linux (Any)
10 sklearn-genetic 0.3.0+ Windows, Mac OS X, and Linux (Any)
11 aif360==0.3.0 Windows, Mac OS X, and Linux (Any)
11 BlackBoxAuditing==0.1.54 Windows, Mac OS X, and Linux (Any)
11 dowhy 0.5.1+ Windows, Mac OS X, and Linux (Any)
11 econml 0.9.0+ Windows, Mac OS X, and Linux (Any)
11 ^ networkx 2.5+ Windows, Mac OS X, and Linux (Any)
12 bayesian-optimization 1.2.0+ Windows, Mac OS X, and Linux (Any)
12 ^ graphviz 0.10.1+ Windows, Mac OS X, and Linux (Any)
12 tensorflow-lattice 2.0.7+ Windows, Mac OS X, and Linux (Any)
13 adversarial-robustness-toolbox 1.5.0+ Windows, Mac OS X, and Linux (Any)

NOTE: the library machine-learning-datasets is the official name of what in the book is referred to as mldatasets. Due to naming conflicts, it had to be changed.

The exact versions of each library, as tested, can be found in the requirements.txt file and installed like this should you have a dedicated environment for them:

> pip install -r requirements.txt

You might get some conflicts specifically with libraries cvae, alepython, pdpbox and xai. If this is the case, try:

> pip install --no-deps -r requirements.txt

Alternatively, you can install libraries one chapter at a time inside of a local Jupyter environment using cells with !pip install or run all the code in Google Colab with the following links:

Remember to make sure you click on the menu item "File > Save a copy in Drive" as soon you open each link to ensure that your notebook is saved as you run it. Also, notebooks denoted with plus sign (+) are relatively compute-intensive, and will take an extremely long time to run on Google Colab but if you must go to "Runtime > Change runtime type" and select "High-RAM" for runtime shape. Otherwise, a better cloud enviornment or local environment is preferable.

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Summary

The book does much more than explain technical topics, but here's a summary of the chapters:

Chapters topics

Related products

Get to Know the Authors

Serg Masís has been at the confluence of the internet, application development, and analytics for the last two decades. Currently, he's a Climate and Agronomic Data Scientist at Syngenta, a leading agribusiness company with a mission to improve global food security. Before that role, he co-founded a startup, incubated by Harvard Innovation Labs, that combined the power of cloud computing and machine learning with principles in decision-making science to expose users to new places and events. Whether it pertains to leisure activities, plant diseases, or customer lifetime value, Serg is passionate about providing the often-missing link between data and decision-making — and machine learning interpretation helps bridge this gap more robustly.

Owner
Packt
Providing books, eBooks, video tutorials, and articles for IT developers, administrators, and users.
Packt
Generate music from midi files using BPE and markov model

Generate music from midi files using BPE and markov model

Aditya Khadilkar 37 Oct 24, 2022
PyCaret is an open-source, low-code machine learning library in Python that automates machine learning workflows.

An open-source, low-code machine learning library in Python 🚀 Version 2.3.5 out now! Check out the release notes here. Official • Docs • Install • Tu

PyCaret 6.7k Jan 08, 2023
A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Demand-Forecasting Business Problem A chain of stores, 10 different stores and 50 different requests a 3-month demand forecast for its product.

Ayşe Nur Türkaslan 3 Mar 06, 2022
Azure MLOps (v2) solution accelerators.

Azure MLOps (v2) solution accelerator Welcome to the MLOps (v2) solution accelerator repository! This project is intended to serve as the starting poi

Microsoft Azure 233 Jan 01, 2023
Both social media sentiment and stock market data are crucial for stock price prediction

Relating-Social-Media-to-Stock-Movement-Public - We explore the application of Machine Learning for predicting the return of the stock by using the information of stock returns. A trading strategy ba

Vishal Singh Parmar 15 Oct 29, 2022
A high performance and generic framework for distributed DNN training

BytePS BytePS is a high performance and general distributed training framework. It supports TensorFlow, Keras, PyTorch, and MXNet, and can run on eith

Bytedance Inc. 3.3k Dec 28, 2022
Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models. Solve a variety of tasks with pre-trained models or finetune them in

Backprop 227 Dec 10, 2022
Responsible Machine Learning with Python

Examples of techniques for training interpretable ML models, explaining ML models, and debugging ML models for accuracy, discrimination, and security.

ph_ 624 Jan 06, 2023
A Python package for time series classification

pyts: a Python package for time series classification pyts is a Python package for time series classification. It aims to make time series classificat

Johann Faouzi 1.4k Jan 01, 2023
Python Research Framework

Python Research Framework

EleutherAI 106 Dec 13, 2022
End to End toy example of MLOps

churn_model MLOps Toy Example End to End You might find below links useful Connect VSCode to Git MLFlow Port Heroku App Project Organization ├── LICEN

Ashish Tele 6 Feb 06, 2022
Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máquina.

Estatistica para Ciência de Dados e Machine Learning Arquivos do curso online sobre a estatística voltada para ciência de dados e aprendizado de máqui

Renan Barbosa 1 Jan 10, 2022
ETNA is an easy-to-use time series forecasting framework.

ETNA is an easy-to-use time series forecasting framework. It includes built in toolkits for time series preprocessing, feature generation, a variety of predictive models with unified interface - from

Tinkoff.AI 674 Jan 07, 2023
This machine-learning algorithm takes in data from the last 60 days and tries to predict tomorrow's price of any crypto you ask it.

Crypto-Currency-Predictor This machine-learning algorithm takes in data from the last 60 days and tries to predict tomorrow's price of any crypto you

Hazim Arafa 6 Dec 04, 2022
A Collection of Conference & School Notes in Machine Learning 🦄📝🎉

Machine Learning Conference & Summer School Notes. 🦄📝🎉

558 Dec 28, 2022
A simple application that calculates the probability distribution of a normal distribution

probability-density-function General info An application that calculates the probability density and cumulative distribution of a normal distribution

1 Oct 25, 2022
BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models.

Model Serving Made Easy BentoML is a flexible, high-performance framework for serving, managing, and deploying machine learning models. Supports multi

BentoML 4.4k Jan 04, 2023
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.

AI Fairness 360 (AIF360) The AI Fairness 360 toolkit is an extensible open-source library containg techniques developed by the research community to h

1.9k Jan 06, 2023
CD) in machine learning projectsImplementing continuous integration & delivery (CI/CD) in machine learning projects

CML with cloud compute This repository contains a sample project using CML with Terraform (via the cml-runner function) to launch an AWS EC2 instance

Iterative 19 Oct 03, 2022
An open-source library of algorithms to analyse time series in GPU and CPU.

An open-source library of algorithms to analyse time series in GPU and CPU.

Shapelets 216 Dec 30, 2022