A Python Package for Convex Regression and Frontier Estimation

Last update: Jan 08, 2023

Overview

pyStoNED

pyStoNED is a Python package that provides functions for estimating multivariate convex regression, convex quantile regression, convex expectile regression, isotonic regression, stochastic nonparametric envelopment of data, and related methods. It also facilitates efficiency measurement using the conventional Data Envelopment Analysis (DEA) and Free Disposal Hull (FDH) approaches. The pyStoNED package allows practitioners to estimate these models in an open-access environment under the GPL-3.0 License.

Installation

The pyStoNED package is available on PyPI, and the latest development version can be installed from the GitHub repository pyStoNED. Please feel free to download and test it. We welcome bug reports and feedback.

PyPI

pip install pystoned

GitHub

pip install -U git+https://github.com/ds2010/pyStoNED

Authors

Sheng Dai, Ph.D. candidate, Aalto University School of Business.
Yu-Hsueh Fang, Computer Engineer, Institute of Manufacturing Information and Systems, National Cheng Kung University.
Chia-Yen Lee, Professor, College of Management, National Taiwan University.
Timo Kuosmanen, Professor, Aalto University School of Business.

Citation

If you use pyStoNED for published work, we encourage you to cite the following paper and other related works. We appreciate it.

Dai S, Fang YH, Lee CY, Kuosmanen T. (2021). pyStoNED: A Python Package for Convex Regression and Frontier Estimation. arXiv preprint arXiv:2109.12962.

Comments

StoNED and Plot2d/3d: cannot plot the StoNED frontier

Hi @JulianATA, I found that we cannot plot the StoNED frontier with plot2d, although it should work. Please check the following error.

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-cfd06442dd17> in <module>
      2 rd = StoNED.StoNED(model)
      3 model_new = rd.get_frontier(RED_MOM)
----> 4 plot2d(model_new, x_select=0, label_name="StoNED frontier", fig_name="stoned_2d")

C:\Anaconda3\lib\site-packages\pystoned\plot.py in plot2d(model, x_select, label_name, fig_name)
     15         fig_name (String, optional): The name of figure to save. Defaults to None.
     16     """
---> 17     x = np.array(model.x).T[x_select]
     18     y = np.array(model.y).T
     19     if y.ndim != 1:

AttributeError: 'numpy.ndarray' object has no attribute 'x'

I tried adding the following line to StoNED: https://github.com/ds2010/pyStoNED/blob/b673006ff8fe7152125f42702173f9ce49d1d83e/pystoned/StoNED.py#L17

But it still does not work. Could you please help fix it? Many thanks in advance!

Sheng

opened by ds2010 8

StoNED: cannot get unconditional expected inefficiency

Hi @JulianATA, it seems that there is a bug in StoNED.py when calculating the unconditional expected inefficiency. Please check the following error and fix it. Thanks in advance!

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-4-8d44572d25fb> in <module>
      1 # retrive the unconditional expected inefficiency \mu
      2 rd = StoNED.StoNED(model)
----> 3 print(model.get_unconditional_expected_inefficiency('KDE'))

AttributeError: 'CNLS' object has no attribute 'get_unconditional_expected_inefficiency'

opened by ds2010 8

Refactor basic DEA and FDH by class

This PR provides the refactored basic DEA and FDH classes. You can test them with the following code:

DEA

import pandas as pd
import numpy as np

# import the package pystoned
from pystoned import DEA

# import Finnish electricity distribution firms data
url = 'https://raw.githubusercontent.com/ds2010/pyStoNED-Tutorials/master/Data/firms.csv'
df = pd.read_csv(url, error_bad_lines=False)

# output
y = df['Energy']

# inputs
x1 = df['OPEX']
x1 = np.asmatrix(x1).T
x2 = df['CAPEX']
x2 = np.asmatrix(x2).T
x = np.concatenate((x1, x2), axis=1)

model = DEA.DEA(y,x,"oo","vrs")
model.optimize(False)
model.display_theta()

FDH

import pandas as pd
import numpy as np

# import the package pystoned
from pystoned import FDH

# import Finnish electricity distribution firms data
url = 'https://raw.githubusercontent.com/ds2010/pyStoNED-Tutorials/master/Data/firms.csv'
df = pd.read_csv(url, error_bad_lines=False)

# output
y = df['Energy']

# inputs
x1 = df['OPEX']
x1 = np.asmatrix(x1).T
x2 = df['CAPEX']
x2 = np.asmatrix(x2).T
x = np.concatenate((x1, x2), axis=1)

model = FDH.FDH(y,x,"oo")
model.optimize(False)
model.display_theta()

The results are identical to those of the original code.
StoNED + CNLSDDF has been implemented; however, it is somewhat complicated.
- To reduce the complexity of StoNED + CNLSDDF, I am currently working on the get_frontier function.
- The get_frontier function can make the implementation of StoNED/StoNED+DDF more consistent.

opened by Fangop 7

feat(StoNEZD): Implement StoNEZD by classes

This PR provides a small refactor of CNLSZ and an implementation of StoNEZD. First, thanks to the major refactoring of the CNLSZ class, this PR uses inheritance to reduce duplicated code and keep the package consistent. Second, the StoNEZD model has been implemented. It is now simple to implement this kind of advanced model, since we have so many basic models to build on.

The test code is provided below:

CNLSZ

from pystoned import CNLSZ

import pandas as pd
import numpy as np

# import Finnish electricity distribution firms data
url = 'https://raw.githubusercontent.com/ds2010/pyStoNED-Tutorials/master/Data/firms.csv'
df = pd.read_csv(url, error_bad_lines=False)
df.head(5)

# output (total cost)
y  = df['TOTEX']

# inputs 
x1  = df['Energy']
x1  = np.asmatrix(x1).T
x2  = df['Length']
x2  = np.asmatrix(x2).T
x3  = df['Customers']
x3  = np.asmatrix(x3).T
x   = np.concatenate((x1, x2, x3), axis=1)

# Z variables
z = df['PerUndGr']

# define the model settings
cet = "mult"
fun = "cost"
rts = "crs"

model = CNLSZ.CNLSZ(y, x, z, cet, fun, rts)
model.optimize()

model.display_residual()

StoNEDZ

# import package pystoned
from pystoned import StoNEDZ

import pandas as pd
import numpy as np

# import Finnish electricity distribution firms data
url = 'https://raw.githubusercontent.com/ds2010/pyStoNED-Tutorials/master/Data/firms.csv'
df = pd.read_csv(url, error_bad_lines=False)
df.head(5)

# output (total cost)
y  = df['TOTEX']

# inputs 
x1  = df['Energy']
x1  = np.asmatrix(x1).T
x2  = df['Length']
x2  = np.asmatrix(x2).T
x3  = df['Customers']
x3  = np.asmatrix(x3).T
x   = np.concatenate((x1, x2, x3), axis=1)

# Z variables
z = df['PerUndGr']

# define the model settings
cet = "mult"
fun = "cost"
rts = "crs"

model = StoNEDZ.StoNEDZ(y, x, z, cet, fun, rts)
model.optimize()

model.display_residual()

print(model.get_technical_inefficiency("MOM"))

The models based on CNLS/StoNED are now available!

The implementation of get_frontier will come in the next PR.
CNLSG and CNLSZG look fine; they may need only a small modification to structure the files.
We could implement some well-known models for users, such as StoNED+DDF.
Some other user functions could be provided, such as marginal productivity.

The models that have not yet been refactored are listed below:

Free disposal hull
DEA/DEADDF

opened by Fangop 7

feat(CNLS/CNLSDDF): Implementation of get_frontier.

The get_frontier function returns the values of the estimated frontier (the y values) from CNLS/CNLSDDF. Here are some thoughts on how to implement get_frontier better. Please help me check whether my reasoning contains any logical errors.

Since true y value = estimated y value + residual for additive models, we may implement the frontier as follows:

CNLS

In the following, y refers to the true y value and frontier refers to the estimated y value.

Additive

frontier = y - residual

Multiplicative

frontier = y / exp(residual) - 1

CNLSDDF

In the following, y refers to the true y value and frontier refers to the estimated y value.

frontier list = y list - residual list
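
The relations above can be sketched in plain Python as follows. This is only an illustration of the formulas as written in this proposal: the function names are hypothetical and not part of pyStoNED, and the `- 1` offset in the multiplicative case is copied from the text as stated rather than derived here.

```python
import math


def frontier_additive(y, residual):
    """Additive case: frontier = y - residual, element by element."""
    return [yi - ei for yi, ei in zip(y, residual)]


def frontier_multiplicative(y, residual):
    """Multiplicative case as written above: frontier = y / exp(residual) - 1."""
    return [yi / math.exp(ei) - 1.0 for yi, ei in zip(y, residual)]


if __name__ == "__main__":
    # Toy numbers chosen only to exercise the functions.
    y = [10.0, 20.0]
    residual = [1.0, -2.0]
    print(frontier_additive(y, residual))
    print(frontier_multiplicative(y, residual))
```

The same additive helper covers the CNLSDDF case described above, since it is also an element-wise difference between the y list and the residual list.
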
opened by Fangop 4

feat(pyStoned): Implement data checking

This draft PR provides data checking for pyStoNED. Both the basic models and the directional distance function based models are included. However, it is tricky to test every possible kind of input, so this is just a draft PR.

Please help me check whether the messages provide clear information.

I am working on testing all the models and trying to adapt the DEA and FDH models to these checks, so please do not merge this PR yet.
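
As a rough illustration of what such input checks might look like, the sketch below verifies that x and y have the same number of observations, that every row of x has the same number of inputs, and that no value is missing. The function name `check_inputs` and the messages are hypothetical; they are not the checks implemented in this PR.

```python
import math


def check_inputs(y, x):
    """Return (y, x) as lists of floats after basic sanity checks.

    Raises ValueError with a descriptive message when a check fails.
    """
    y = [float(v) for v in y]
    # Accept a single input given as a flat list by wrapping each value in a row.
    x = [list(row) if isinstance(row, (list, tuple)) else [row] for row in x]
    x = [[float(v) for v in row] for row in x]

    if len(y) == 0:
        raise ValueError("y is empty.")
    if len(x) != len(y):
        raise ValueError(
            f"Number of observations differs: len(y)={len(y)}, len(x)={len(x)}.")
    if len({len(row) for row in x}) != 1:
        raise ValueError("All rows of x must have the same number of inputs.")
    if any(math.isnan(v) for v in y) or any(math.isnan(v) for row in x for v in row):
        raise ValueError("y and x must not contain missing values (NaN).")
    return y, x
```

Calling `check_inputs([1, 2], [[1, 2], [3, 4]])` returns the data unchanged as floats, while a mismatch in the number of observations raises a `ValueError` that names both lengths.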

opened by Fangop 3

feat(CNLS/tools): Implement basic error/exception system

Here is a draft of an exception system.

Situations in which the program should stop and inform the user are defined as exceptions in Python (see Built-in Exceptions).

Errors are one kind of exception. pyStoNED introduces at least two types of exceptions here.

Basic exception:

Additive CNLS with CRS

The additive CNLS model with CRS does not exist (nor is it needed). Hence, creating an additive CNLS model with CRS should raise an exception: it is not an error, but an exception to the existing models.

The following code should halt and raise an exception:

# import packages
from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_CRS
from pystoned.dataset import load_Finnish_electricity_firm

# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                      y_select=['TOTEX'])

# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None,
                    cet = CET_ADDI, fun = FUN_PROD, rts = RTS_CRS)

Please help me check the reasoning above and polish the exception message.

Retrieving variables without optimization

The user should optimize the model before retrieving and printing any variables. If not, the program will halt and tell the user to optimize the model first.

The following code should halt and raise an exception:

# import packages
from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                      y_select=['TOTEX'])

# define the CNLS model (deliberately without solving it)
model = CNLS.CNLS(y=data.y, x=data.x, z=None,
                    cet = CET_ADDI, fun = FUN_PROD, rts = RTS_VRS)


model.display_alpha()
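
A guard of this kind could be sketched as follows. The class names `NotOptimizedError` and `ToyModel` are hypothetical and exist only for illustration; this is not how pyStoNED implements the check, and the "solution" stored by `optimize()` is a made-up placeholder.

```python
class NotOptimizedError(Exception):
    """Raised when results are requested before the model has been solved."""


class ToyModel:
    """Stand-in for an estimation model; it only tracks whether it was solved."""

    def __init__(self):
        self.optimized = False
        self.alpha = None

    def optimize(self):
        # Placeholder for the real solve step; the value is made up.
        self.alpha = [0.0]
        self.optimized = True

    def get_alpha(self):
        # Refuse to return results until optimize() has been called.
        if not self.optimized:
            raise NotOptimizedError(
                "The model has not been optimized; call optimize() first.")
        return self.alpha


if __name__ == "__main__":
    model = ToyModel()
    try:
        model.get_alpha()
    except NotOptimizedError as exc:
        print(exc)
```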

Value error:

Construct model with unknown parameters

The user should construct a model with the constant labels in pystoned.constant. If an arbitrary string is given, the program will halt and inform the user that the model parameter is not defined.

This example constructs a model with an arbitrary string as cet, causing the value error.

from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                      y_select=['TOTEX'])

# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None,
                    cet = "Not an CET label", fun = FUN_PROD, rts = RTS_VRS)

Note: This does not affect the default setting.
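
A label check along these lines might look like the sketch below. The label values and the helper name `check_cet` are assumptions made for illustration; the actual constants live in pystoned.constant and may have different values.

```python
# Hypothetical label values standing in for CET_ADDI and CET_MULT.
VALID_CET = ("addi", "mult")


def check_cet(cet):
    """Return cet unchanged if it is a known label, otherwise raise ValueError."""
    if cet not in VALID_CET:
        raise ValueError(
            f"Undefined cet label: {cet!r}. Expected one of {VALID_CET}.")
    return cet


if __name__ == "__main__":
    try:
        check_cet("Not an CET label")
    except ValueError as exc:
        print(exc)
```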

Invalid email address

When using remote optimization, the user may pass an incorrect string (one that is neither an email address nor the OPT_LOCAL label).

This should lead to a halt and inform the user.

# import packages
from pystoned import CNLS
from pystoned.constant import CET_ADDI, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                      y_select=['TOTEX'])

# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None,
                    cet = CET_ADDI, fun = FUN_PROD, rts = RTS_VRS)

model.optimize(email="NotAnEmailAddress")
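
One possible shape for this check is sketched below. The value assigned to `OPT_LOCAL`, the helper name `check_email`, and the deliberately loose pattern are all assumptions for illustration, not pyStoNED's implementation.

```python
import re

# Hypothetical value standing in for pystoned.constant.OPT_LOCAL.
OPT_LOCAL = "local"

# Deliberately loose pattern: something@something.something, without whitespace.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_email(email):
    """Accept the local-solver label or a plausible email address."""
    if email == OPT_LOCAL:
        return email
    if not isinstance(email, str) or _EMAIL_PATTERN.match(email) is None:
        raise ValueError(
            f"{email!r} is neither the local-solver label nor a valid email address.")
    return email


if __name__ == "__main__":
    try:
        check_email("NotAnEmailAddress")
    except ValueError as exc:
        print(exc)
```

The pattern only rejects obviously malformed strings; it does not verify that the address exists.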

Optimizing a multiplicative model without specifying a solver

When using local optimization, the user should specify the solver for optimization.

This should lead to a halt and inform the user to choose an installed solver.

# import packages
from pystoned import CNLS
from pystoned.constant import CET_MULT, FUN_PROD, OPT_LOCAL, RTS_VRS
from pystoned.dataset import load_Finnish_electricity_firm

# import Finnish electricity distribution firms data
data = load_Finnish_electricity_firm(x_select=['Energy', 'Length', 'Customers'],
                                      y_select=['TOTEX'])

# define and solve the CNLS model
model = CNLS.CNLS(y=data.y, x=data.x, z=None,
                    cet = CET_MULT, fun = FUN_PROD, rts = RTS_VRS)

model.optimize(email=OPT_LOCAL)

These modifications are a draft for discussing the error/exception types, the situations in which they arise, and the messages. Hence only CNLS and the utils/tools module are modified as examples. Any other exceptions and errors can be included and discussed!

Thanks!

Note: This part may not need to be included in the documentation, since the documentation should show the right way to use the program, whereas this is about preventing incorrect use.

opened by Fangop 3

Solver Binding Error

Hello. Great work. I have been looking for something like this for a while.

I am trying to run some examples, but I am facing some issues with the bindings to the solver. Error message:

"No Python bindings available for <class 'pyomo.solvers.plugins.solvers.mosek_direct.MOSEKDirect'> solver plugin"

Any hints on how to solve this?
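
This message usually indicates that Pyomo could not import the solver's Python package. A quick way to check whether the `mosek` module is importable in the current environment is sketched below. The helper name `has_python_module` is hypothetical, and installing MOSEK (for example with `pip install mosek`) additionally requires a valid license.

```python
import importlib.util


def has_python_module(name):
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Prints False when the MOSEK Python package is not installed
    # in the interpreter that runs pyStoNED.
    print("mosek importable:", has_python_module("mosek"))
```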

opened by fmobrj 3

CNLSG: returns an error when using the local solver

Hi @JulianATA, it seems that there is another bug, in line 122 of CNLSG. I used CNLSG to estimate the multiplicative cost function with the local solver MINOS, but it returns the following error:

File "/home/dais2/anaconda3/lib/python3.8/site-packages/pystoned/CNLSG.py", line 122, in __convergence_test
    self.Active2[i, j] = - alpha[i] - np.sum(beta[i, :] * x[i, :]) + \
TypeError: bad operand type for unary -: 'NoneType'.

Interestingly, when I use 'NEOS' to solve the same model, there is no error, and I receive the final estimation results. Further, there is no problem when we estimate the additive production function using the local solver MOSEK.

Could you please help to check and fix it? Many thanks! For your convenience, please see the following example:

Example

import numpy as np
import pandas as pd
from pystoned import CNLSG
from pystoned.constant import CET_MULT, FUN_COST, OPT_LOCAL, RTS_VRS

url = 'https://raw.githubusercontent.com/ds2010/pyStoNED/master/pystoned/data/electricityFirms.csv'
df = pd.read_csv(url, error_bad_lines=False)

# output
y = df['TOTEX']

# inputs
x1 = df['Energy']
x1 = np.asmatrix(x1).T
x2 = df['Length']
x2 = np.asmatrix(x2).T
x3 = df['Customers']
x3 = np.asmatrix(x3).T
x = np.concatenate((x1, x2, x3), axis=1)

model = CNLSG.CNLSG(y, x, z=None, cet=CET_MULT, fun=FUN_COST, rts=RTS_VRS)
model.optimize(OPT_LOCAL)
model.display_beta()

opened by ds2010 3

feat(dataset): Implement dataset support

Hi, I recently realized that the example data we use for testing pystoned could itself be a feature.

This is inspired by sklearn, which provides toy datasets to help users understand the usage and features of its models. The toy datasets helped make sklearn widely used all over the world, since they make it easy for beginners to use and understand.

This PR reduces the complexity of using the datasets.

Original:

import pandas as pd
import numpy as np

url = 'https://raw.githubusercontent.com/ds2010/pyStoNED-Tutorials/master/Data/firms.csv'
df = pd.read_csv(url, error_bad_lines=False)
df.head(5)

# output
y = df['Energy']

# inputs
x1 = df['OPEX']
x1 = np.asmatrix(x1).T
x2 = df['CAPEX']
x2 = np.asmatrix(x2).T
x = np.concatenate((x1, x2), axis=1)

This PR:

from pystoned import dataset
x, y = dataset.firm(['OPEX', 'CAPEX'], 'Energy')

This PR is not yet finished.

Please give me some information about the datasets, in order to:

make sure the datasets are used in a sensible way

give the user a brief introduction to each dataset

etc.

Thanks for your review. Do not merge yet!
opened by Fangop 3

API documentation

The new PR #23 (Autodoc) works well locally but not on ReadTheDocs. You can check the CNLS API on the website generated by ReadTheDocs: it is empty. However, if we build the Sphinx documentation locally using make html, the docstrings appear in the HTML file. See the following screenshot.

I failed to fix it. Since the website is automatically generated by ReadTheDocs, @JulianATA, could you please help me fix it? Thanks in advance!

opened by ds2010 2

Releases(v0.5.8)

v0.5.8 (Apr 30, 2022): pystoned-0.5.8.tar.gz (66.92 KB)
v0.5.0 (Aug 16, 2021): pystoned-0.5.0.tar.gz (24.54 KB)
v0.4.0 (Dec 1, 2020): pystoned-0.4.0.tar.gz (18.45 KB)
v0.3.0 (Jun 14, 2020): pystoned-0.3.0.tar.gz (17.32 KB)
v0.2.0 (Apr 22, 2020): pystoned-0.2.0.tar.gz (6.32 KB)

Owner

Sheng Dai

Ph.D. student in Management Science at Aalto University School of Business. My research area is productivity and efficiency analysis.

GitHub Repository https://pystoned.readthedocs.io
