Uncertainty Toolbox

Website, Tutorials, and Docs

A python toolbox for predictive uncertainty quantification, calibration, metrics, and visualization.
Also: a glossary of useful terms and a collection of relevant papers and references.

 
Many machine learning methods return predictions along with uncertainties of some form, such as distributions or confidence intervals. This raises the questions: How do we determine which predictive uncertainties are best? What does it mean to produce a best or ideal uncertainty? Are our uncertainties accurate and well calibrated?

Uncertainty Toolbox provides standard metrics to quantify and compare predictive uncertainty estimates, gives intuition for these metrics, produces visualizations of these metrics/uncertainties, and implements simple "re-calibration" procedures to improve these uncertainties. This toolbox currently focuses on regression tasks.

Toolbox Contents

Uncertainty Toolbox contains:

  • Glossary of terms related to predictive uncertainty quantification.
  • Metrics for assessing quality of predictive uncertainty estimates.
  • Visualizations for predictive uncertainty estimates and metrics.
  • Recalibration methods for improving the calibration of a predictor.
  • Paper list: publications and references on relevant methods and metrics.

Installation

Uncertainty Toolbox requires Python 3.6+. For a lightweight installation of the package only, run:

pip install git+https://github.com/uncertainty-toolbox/uncertainty-toolbox

For a full installation with examples and tests, run:

git clone https://github.com/uncertainty-toolbox/uncertainty-toolbox.git
cd uncertainty-toolbox
pip install -e .

To verify correct installation, you can run the test suite via:

source shell/run_all_tests.sh

Quick Start

import uncertainty_toolbox as uct

# Load an example dataset of 100 predictions, uncertainties, and ground truth values
predictions, predictions_std, y, x = uct.data.synthetic_sine_heteroscedastic(100)

# Compute all uncertainty metrics
metrics = uct.metrics.get_all_metrics(predictions, predictions_std, y)

This example computes metrics for a vector of predicted values (predictions) and associated uncertainties (predictions_std, a vector of standard deviations), taken with respect to a corresponding set of ground truth values y.

Colab notebook: You can also take a look at this Colab notebook, which walks through a use case of Uncertainty Toolbox.

Metrics

Uncertainty Toolbox provides a number of metrics to quantify and compare predictive uncertainty estimates. For example, the get_all_metrics function will return:

  1. average calibration: mean absolute calibration error, root mean squared calibration error, miscalibration area.
  2. adversarial group calibration: mean absolute adversarial group calibration error, root mean squared adversarial group calibration error.
  3. sharpness: expected standard deviation.
  4. proper scoring rules: negative log-likelihood, continuous ranked probability score, check score, interval score.
  5. accuracy: mean absolute error, root mean squared error, median absolute error, coefficient of determination, correlation.
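As a quick sketch, here is how a few of these metrics could be computed individually. The module and function names below are assumptions based on the toolbox's file layout (e.g. metrics_calibration.py and metrics_scoring_rule.py); check the API docs for the exact signatures.

import uncertainty_toolbox as uct

# Synthetic data, as in the Quick Start above
predictions, predictions_std, y, x = uct.data.synthetic_sine_heteroscedastic(100)

# Average calibration and sharpness (names assumed from metrics_calibration.py)
mace = uct.metrics_calibration.mean_absolute_calibration_error(predictions, predictions_std, y)
ma = uct.metrics_calibration.miscalibration_area(predictions, predictions_std, y)
sharpness = uct.metrics_calibration.sharpness(predictions_std)

# A proper scoring rule (name assumed from metrics_scoring_rule.py)
nll = uct.metrics_scoring_rule.nll_gaussian(predictions, predictions_std, y)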

Visualizations

The following plots are a few of the visualizations provided by Uncertainty Toolbox. See this example for code to reproduce these plots.
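As a rough sketch of the plotting API (plot_calibration and plot_xy live in viz.py, per the comments further below; the exact signatures here are assumptions):

import matplotlib.pyplot as plt
import uncertainty_toolbox as uct

predictions, predictions_std, y, x = uct.data.synthetic_sine_heteroscedastic(100)

# Average calibration curve: observed vs. expected proportions in the intervals
uct.viz.plot_calibration(predictions, predictions_std, y)
plt.show()

# Predictions with uncertainty bands, plotted against the inputs
uct.viz.plot_xy(predictions, predictions_std, y, x)
plt.show()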

[Plots of three cases: overconfident (too little uncertainty), underconfident (too much uncertainty), and well calibrated]

And here are a few of the calibration metrics for the above three cases:

                    MACE       RMSCE      MA
Overconfident       0.19429    0.21753    0.19625
Underconfident      0.20692    0.23003    0.20901
Well calibrated     0.00862    0.01040    0.00865

(MACE: mean absolute calibration error, RMSCE: root mean squared calibration error, MA: miscalibration area)

Recalibration

The following plots show the results of a recalibration procedure provided by Uncertainty Toolbox, which transforms a set of predictive uncertainties to improve average calibration. The algorithm is based on isotonic regression, as proposed by Kuleshov et al.

See this example for code to reproduce these plots.
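A minimal sketch of this workflow follows; get_std_recalibrator is an assumed helper in the recalibration module, and in practice the recalibrator should be fit on held-out calibration data rather than the data being evaluated.

import uncertainty_toolbox as uct

predictions, predictions_std, y, x = uct.data.synthetic_sine_heteroscedastic(100)

# Fit an isotonic-regression-based recalibrator, then rescale the predictive stds
std_recalibrator = uct.recalibration.get_std_recalibrator(predictions, predictions_std, y)
predictions_std_recal = std_recalibrator(predictions_std)

# Average calibration should improve after recalibration
mace_before = uct.metrics_calibration.mean_absolute_calibration_error(predictions, predictions_std, y)
mace_after = uct.metrics_calibration.mean_absolute_calibration_error(predictions, predictions_std_recal, y)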

Recalibrating overconfident predictions

                        MACE       RMSCE      MA
Before recalibration    0.19429    0.21753    0.19625
After recalibration     0.01124    0.02591    0.01117

Recalibrating underconfident predictions

                        MACE       RMSCE      MA
Before recalibration    0.20692    0.23003    0.20901
After recalibration     0.00157    0.00205    0.00132

Contributing

We welcome and greatly appreciate contributions from the community! Please see our contributing guidelines for details on how to help out.

Citation

If you found this toolbox helpful, please cite the following paper:

@article{chung2021uncertainty,
  title={Uncertainty Toolbox: an Open-Source Library for Assessing, Visualizing, and Improving Uncertainty Quantification},
  author={Chung, Youngseog and Char, Ian and Guo, Han and Schneider, Jeff and Neiswanger, Willie},
  journal={arXiv preprint arXiv:2109.10254},
  year={2021}
}

Additionally, here are papers that led to the development of the toolbox:

@article{chung2020beyond,
  title={Beyond Pinball Loss: Quantile Methods for Calibrated Uncertainty Quantification},
  author={Chung, Youngseog and Neiswanger, Willie and Char, Ian and Schneider, Jeff},
  journal={arXiv preprint arXiv:2011.09588},
  year={2020}
}

@article{tran2020methods,
  title={Methods for comparing uncertainty quantifications for material property predictions},
  author={Tran, Kevin and Neiswanger, Willie and Yoon, Junwoong and Zhang, Qingyang and Xing, Eric and Ulissi, Zachary W},
  journal={Machine Learning: Science and Technology},
  volume={1},
  number={2},
  pages={025006},
  year={2020},
  publisher={IOP Publishing}
}

Acknowledgments

Development of Uncertainty Toolbox is supported by the following organizations.

[organization logos]

Comments
  • Use flit, conda-souschef, and grayskull to make PyPI/Anaconda uploads straightforward.

    #59 #58 Right now, I have a version of uncertainty_toolbox uploaded to PyPI and Anaconda.

    pip install uncertainty_toolbox
    
    conda install -c sgbaird uncertainty_toolbox
    

    The basic instructions (after some one-time setup) are to install flit (e.g. conda install flit), update the version number in uncertainty_toolbox/__init__.py and run:

    flit publish
    

    to upload a new version to PyPI. I probably need to add other people's usernames to PyPI so I'm not the only one that can upload new versions.

    For the Anaconda upload, install conda-souschef and grayskull, run a slightly customized script, and build it within the scratch folder.

    conda install conda-souschef grayskull
    python run_grayskull.py
    cd scratch
    conda build .
    

    For this, you probably have to set a few things with conda first, such as automatic uploads when building, credentials, and configuring it to look in certain channels. These are all one-time setup instructions. I also have some GitHub workflow code in mat_discover that can take care of the uploads (and testing) automatically when you make a new release. Just need to change a couple lines and add credentials to GitHub secrets.

    opened by sgbaird 7
  • Added more UTs for util.py for dev branch

    The newly added unit tests (UTs) cover:

    • function calls with no arguments, for both assert_is_flat_same_shape() and assert_is_positive()

    • new UTs for the assert_is_positive() function

    • a UT with a 2D ndarray, based on the assert_is_positive() description

    @willieneis @YoungseogChung @IanChar @HanGuo97

    Test execution output: [screenshot]

    opened by marcemq 5
  • Using this package on machine learning results

    Hi,

    Thanks for making this package available to us! I have a simple question:

    I have a data set split into train/validation sets. A regressor (or classifier) is trained, and then by using the validation set I get an estimate Y_prediction. From the same validation set, I have the Y_true. So, how would you suggest computing the standard deviation on predictions from a regressor (or classifier)?

    Many thanks,

    Ivan

    opened by ivan-marroquin 4
  • Installation on google colab

    Hello, I have tried to install the package in Google Colab according to your instructions, but it does not work. Would you help me?

    Thanks for the great package

    opened by dara1400 4
  • uncertainty quantification of a neural network

    Hi, I have a trained neural network. How can I use the Uncertainty Toolbox to quantify the uncertainty of the neural network? How should I calculate predictions_std (a vector of standard deviations)?

    opened by admodadmod 3
  • Added more UTs for util.py

    The newly added unit tests (UTs) cover:

    • function calls with no arguments, for both assert_is_flat_same_shape() and assert_is_positive()
    • new UTs for the assert_is_positive() function
    • a UT with a 2D ndarray, based on the assert_is_positive() description

    @willieneis @YoungseogChung @IanChar @HanGuo97

    Test execution output: [screenshot]

    opened by marcemq 2
  • Convert from symmetric confidence intervals to standard deviation

    I'm incorporating some code into another repository, and wanted to check with some people who (very likely) have a more thorough statistics background than I do.

    Conversion from standard deviation to confidence intervals takes place in: https://github.com/uncertainty-toolbox/uncertainty-toolbox/blob/b2f342f6606d1d667bf9583919a663adf8643efe/uncertainty_toolbox/metrics_scoring_rule.py#L187

    What is the inverse conversion from symmetric confidence intervals back to standard deviation? (e.g. using scipy.stats.norm, but it doesn't have to be; a sketch follows this comment)

    question 
    opened by sgbaird 2
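    For reference, a minimal sketch of the inverse conversion, assuming a Gaussian predictive distribution (interval_to_std is a hypothetical helper, not part of the toolbox):

    from scipy.stats import norm

    def interval_to_std(lower, upper, coverage=0.95):
        # For a central Gaussian interval with the given coverage, the
        # half-width equals z * std, where z = norm.ppf((1 + coverage) / 2).
        z = norm.ppf((1.0 + coverage) / 2.0)
        return (upper - lower) / (2.0 * z)

    print(interval_to_std(-1.96, 1.96))  # 95% interval of half-width 1.96 -> std ~= 1.0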
  • "layman" definition of interval score using width and coverage

    I'm trying to figure out how to describe the math behind the interval score in a simple way. Based on reading the paper and related literature, I get that it incorporates width and coverage. Is the idea that it penalizes the following two scenarios:

    • wide intervals
    • when many predictions fall outside of the intervals

    In other words, the interval should be as narrow as possible while having the interval "cross the parity line" as frequently as possible, correct? How does this play out in the math? (See the sketch after this comment.)

    opened by sgbaird 2
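    For intuition, here is a sketch of the standard interval score (Gneiting and Raftery) for a central (1 - alpha) interval [lower, upper]: it adds the interval width to a (2/alpha)-scaled penalty for each observation that falls outside the interval. This is illustrative, not the toolbox's implementation.

    import numpy as np

    def interval_score(lower, upper, y, alpha=0.05):
        # The width term rewards sharpness; the penalty terms reward coverage.
        width = upper - lower
        below = (2.0 / alpha) * np.maximum(lower - y, 0.0)  # y fell below the interval
        above = (2.0 / alpha) * np.maximum(y - upper, 0.0)  # y fell above the interval
        return np.mean(width + below + above)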
  • MACE vs ECE

    Hello, thanks for sharing your code!

    I noticed that you implement the metric "MACE", while I was more familiar with the term "ECE". Is there a difference between the two?

    opened by tfjgeorge 2
  • Quantile Regression

    Hello!

    Thanks for this convenient tool! I was wondering how you would suggest using it for quantile regression. I see that a lot of functions take in a predicted mean and std. However, I thought it would also be nice if there were additional support for quantiles, and not only std + mean. Is there a way of hacking my way through what exists now so that I can use the quantile intervals I get instead?

    opened by velezbeltran 1
  • [Feature Request] Option for bivariate distribution plots in lieu of scatter plots

    When dealing with "large" datasets of over 100 data points, scatter plots can become cluttered. Additionally, it's difficult to really distinguish points that are on top of each other. So a cluster of 1000 data points could look the same as a cluster of 100, but they mean very different things.

    Seaborn's bivariate displots can get around this issue by showing a density of points rather than each individual point. It'd be nice to have the option to make such plots for, say, plot_xy or plot_residuals_vs_stdevs.

    I acknowledge this would be a messy request, but it'd be a feature I'd use.

    opened by KevinTran-TRI 1
  • Make requirements even more lightweight.

    Some packages listed in the regular requirements are not actually needed there right now.

    1. Shapely is no longer needed. The imports in several files first need to be removed, and plot_calibration (in viz.py) needs to use the new calibration calculation code.
    2. Black and pytest can be moved to the dev requirements instead of the regular requirements.
    opened by IanChar 0
  • Pytorch GPU Acceleration

    @tailintalent found that many of the evaluation metrics can be greatly sped up with GPU support via PyTorch. This code is currently in an experimental branch called "torch". We will keep it separate from the main package for now. In order to be merged into main, we will need the following:

    • The code relying on torch needs to be separated out, because we do not want to make torch a mandatory install. This should be done in the most robust way possible, ideally by pinpointing the exact operations where torch provides a speedup rather than making a torch version of the pre-existing code.
    • Set up an optional installation that allows users to select the torch-enabled version of the toolbox.
    • Add options to the functions that allow users to select the torch version of a metric rather than the numpy version (see the sketch after this comment).
    enhancement 
    opened by IanChar 1
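    A minimal sketch of the optional-dependency pattern described above (hypothetical code, not the toolbox's): import torch lazily and fall back to numpy when it is unavailable.

    import numpy as np

    try:
        import torch
        HAS_TORCH = True
    except ImportError:
        HAS_TORCH = False

    def sum_of_squares(values, backend="numpy"):
        # Hypothetical metric kernel: dispatch to torch only when requested.
        if backend == "torch":
            if not HAS_TORCH:
                raise ImportError("torch backend requested but torch is not installed")
            t = torch.as_tensor(values)
            return float((t * t).sum())
        arr = np.asarray(values)
        return float((arr * arr).sum())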
  • Should the axes on the calibration curves be switched?

    I've had at least two different folks come to me very confused about the shapes of calibration curves. After a few minutes of discussion, it turned out the confusion arose from how we plot: observed values are on the y-axis and predicted values are on the x-axis. This is the opposite of conventional parity plots, where observations are on the x-axis and predictions are on the y-axis.

    What do folks think about flipping the axes? I know this would differ from the original Kuleshov paper's graphs (and also the paper that @willieneis and I worked on together), but part of me would rather plot what folks expect, to minimize future confusion.

    opened by KevinTran-TRI 2
  • `Could not find lib geos_c.dll or load any of its variants []` when trying to import uncertainty_toolbox (shapely issue)

    After running:

    pip install git+https://github.com/uncertainty-toolbox/uncertainty-toolbox
    

    on Windows 10, VS Code, inside of conda environment, I get:

    Traceback (most recent call last):
      File "C:\Users\sterg\Documents\GitHub\sparks-baird\VickersHardnessPrediction\hv_prediction.py", line 21, in <module>
        import uncertainty_toolbox as uct
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\uncertainty_toolbox\__init__.py", line 9, in <module>
        from .metrics import (
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\uncertainty_toolbox\metrics.py", line 9, in <module>
        from uncertainty_toolbox.metrics_calibration import (
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\uncertainty_toolbox\metrics_calibration.py", line 10, in <module>
        from shapely.geometry import Polygon, LineString
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\shapely\geometry\__init__.py", line 4, in <module>
        from .base import CAP_STYLE, JOIN_STYLE
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\shapely\geometry\base.py", line 20, in <module>
        from shapely.coords import CoordinateSequence
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\shapely\coords.py", line 10, in <module>
        from shapely.geos import lgeos
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\shapely\geos.py", line 178, in <module>
        _lgeos = load_dll("geos_c.dll")
      File "C:\Users\sterg\miniconda3\envs\vickers-hardness\Lib\site-packages\shapely\geos.py", line 54, in load_dll
        raise OSError(
    OSError: Could not find lib geos_c.dll or load any of its variants [].
    

    I fixed it by installing shapely from conda (https://stackoverflow.com/questions/56813083/oserror-could-not-find-geos-c-dll-or-load-any-of-its-variants):

    conda install shapely
    

    For some reason (not sure if this would be the same for everyone), I needed to uninstall and reinstall it one more time:

    conda uninstall shapely
    conda install shapely
    

    and it seems to work OK now.

    opened by sgbaird 0
Releases (v0.1.0)
  • v0.1.0 (Sep 22, 2021)

    Initial release v0.1.0 for Uncertainty Toolbox.

    Highlights

    • Metrics for assessing quality of predictive uncertainty estimates.
    • Visualizations for predictive uncertainty estimates and metrics.
    • Recalibration methods for improving the calibration of a predictor.
    • Website with a tutorial on how to use Uncertainty Toolbox.
    • Documentation and API reference for Uncertainty Toolbox.
    • Glossary of terms related to predictive uncertainty quantification.
    • Publications and references on relevant methods and metrics.