Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations.

Related tags

Data Analysiselicited
Overview

Elicited

Helper tools to construct probability distributions built from expert elicited data for use in monte carlo simulations.

Credit to Brett Hoover, packaging by @magoo

Usage

pip install elicited
import elicited as e

elicited is just a helper tool when using numpy and scipy, so you'll need these in your code.

import numpy as np
from scipy.stats import poisson, zipf, beta, pareto, lognorm

Lognormal

See Occurance and Applications for examples of lognormal distributions in nature.

Expert: Most customers hold around $20K (mode) but I could imagine a customer with $2.5M (max)

mode = 20000
max = 2500000

mean, stdv = e.elicitLogNormal(mode, max)
asset_values = lognorm(s=stdv, scale=np.exp(mean))
asset_values.rvs(100)

Pareto

The 80/20 rule. See Occurance and Applications

Expert: The legal costs of an incident could be devastating. Typically costs are almost zero (val_min) but a black swan could be $100M (val_max).

b = e.elicitPareto(val_min, val_max)
p = pareto(b, loc=val_min-1., scale=1.))

PERT

See PERT Distribution

Expert: Our customers have anywhere from $500-$6000 (val_min / val_max), but it's most typically around $4500 (val_mod)

PERT_a, PERT_b = e.elicitPERT(val_min, val_mod, val_max)
pert = beta(PERT_a, PERT_b, loc=val_min, scale=val_max-val_min)

Zipf's

See Applications

Expert: If we get sued, there will only be a few litigants (nMin). Very rarely it could be 30 or more litigants (nMax), maybe once every thousand cases (pMax) it would be more.

nMin = 1
nMax = 30
pMax = 1/1000

Zs = e.elicitZipf(nMin, nMax, pMax, report=True)

litigants = zipf(Zs, nMin-1)

litigants.rvs(100)

Reference: Other Useful Elicitations

Listed as a courtesy, these distributions are simple enough to elicit data into directly without a helper function.

Uniform

A "zero knowledge" distribution where all values within the range have equal probability of appearing. Similar to random.randint(a, b)

Expert: The crowd will be between 50 (min) and 500 (max) due to fire code restrictions and the existing residents in the building.

from scipy.stats import uniform

min = 50
max = 500

range = max - min

crowd_size = uniform(min, range)
crowd_size.rvs(100)

Poisson

Expert: About 3000 Customers (average) add a credit card to their account every quarter.

from scipy.stats import poisson
average = 3000
upsells = poisson(average)
upsells.rvs(100)
Owner
Ryan McGeehan
Founder / Advisor @ HackerOne Former Director of Security @ Coinbase Former Director of Security @ Facebook
Ryan McGeehan
Desafio proposto pela IGTI em seu bootcamp de Cloud Data Engineer

Desafio Modulo 4 - Cloud Data Engineer Bootcamp - IGTI Objetivos Criar infraestrutura como código Utuilizando um cluster Kubernetes na Azure Ingestão

Otacilio Filho 4 Jan 23, 2022
A Python package for modular causal inference analysis and model evaluations

Causal Inference 360 A Python package for inferring causal effects from observational data. Description Causal inference analysis enables estimating t

International Business Machines 506 Dec 19, 2022
MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

MIR Cheatsheet - Survival Guidebook for MIR Researchers in the Lab

SeungHeonDoh 3 Jul 02, 2022
Python package to transfer data in a fast, reliable, and packetized form.

pySerialTransfer Python package to transfer data in a fast, reliable, and packetized form.

PB2 101 Dec 07, 2022
In this tutorial, raster models of soil depth and soil water holding capacity for the United States will be sampled at random geographic coordinates within the state of Colorado.

Raster_Sampling_Demo (Resulting graph of this demo) Background Sampling values of a raster at specific geographic coordinates can be done with a numbe

2 Dec 13, 2022
The official pytorch implementation of ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias

ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias Introduction | Updates | Usage | Results&Pretrained Models | Statement | Intr

104 Nov 27, 2022
Data-sets from the survey and analysis

bachelor-thesis "Umfragewerte.xlsx" contains the orginal survey results. "umfrage_alle.csv" contains the survey results but one participant is cancele

1 Jan 26, 2022
ToeholdTools is a Python package and desktop app designed to facilitate analyzing and designing toehold switches, created as part of the 2021 iGEM competition.

ToeholdTools Category Status Repository Package Build Quality A library for the analysis of toehold switch riboregulators created by the iGEM team Cit

0 Dec 01, 2021
ASTR 302: Python for Astronomy (Winter '22)

ASTR 302, Winter 2022, University of Washington: Python for Astronomy Mario Jurić Location When: 2:30-3:50, Monday & Wednesday, Winter quarter 2022 Wh

UW ASTR 302: Python for Astronomy 4 Jan 12, 2022
Investigating EV charging data

Investigating EV charging data Introduction: Got an opportunity to work with a home monitoring technology company over the last 6 months whose goal wa

Yash 2 Apr 07, 2022
Retail-Sim is python package to easily create synthetic dataset of retaile store.

Retailer's Sale Data Simulation Retail-Sim is python package to easily create synthetic dataset of retaile store. Simulation Model Simulator consists

Corca AI 7 Sep 30, 2022
Evaluation of a Monocular Eye Tracking Set-Up

Evaluation of a Monocular Eye Tracking Set-Up As part of my master thesis, I implemented a new state-of-the-art model that is based on the work of Che

Pascal 19 Dec 17, 2022
A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.

TennisBusinessIntelligenceProject - A project consists in a set of assignements corresponding to a BI process: data integration, construction of an OLAP cube, qurying of a OPLAP cube and reporting.

carlo paladino 1 Jan 02, 2022
A powerful data analysis package based on mathematical step functions. Strongly aligned with pandas.

The leading use-case for the staircase package is for the creation and analysis of step functions. Pretty exciting huh. But don't hit the close button

48 Dec 21, 2022
Tkinter Izhikevich Neuron Model With Python

TKINTER IZHIKEVICH NEURON MODEL WITH PYTHON Hodgkin-Huxley Model It is a mathematical model for the generation and transmission of action potentials i

Rabia KOÇ 8 Jul 16, 2022
yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data.

The yt Project yt is an open-source, permissively-licensed Python library for analyzing and visualizing volumetric data. yt supports structured, varia

The yt project 367 Dec 25, 2022
Analyzing Covid-19 Outbreaks in Ontario

My group and I took Covid-19 outbreak statistics from ontario, and analyzed them to find different patterns and future predictions for the virus

Vishwaajeeth Kamalakkannan 0 Jan 20, 2022
ETL pipeline on movie data using Python and postgreSQL

Movies-ETL ETL pipeline on movie data using Python and postgreSQL Overview This project consisted on a automated Extraction, Transformation and Load p

Juan Nicolas Serrano 0 Jul 07, 2021
Data collection, enhancement, and metrics calculation.

l3_data_collection Data collection, enhancement, and metrics calculation. Summary Repository containing code for QuantDAO's JDT data collection task.

Ruiwyn 3 Dec 23, 2022
Top 50 best selling books on amazon

It's a dashboard that shows the detailed information about each book in the top 50 best selling books on amazon over the last ten years

Nahla Tarek 1 Nov 18, 2021