Forecast dynamically at scale with this unique package. pip install scalecast

Overview

🌄 Scalecast: Dynamic Forecasting at Scale

About

This package takes a scalable approach to time series forecasting in Python, using common scikit-learn and statsmodels models as well as Facebook Prophet, Microsoft LightGBM, and LinkedIn Silverkite. Use your own regressors, load the object with its own seasonal, auto-regressive, and other regressors, or combine all of the above. All forecasting is dynamic by default, so auto-regressive terms can be used without leaking data into the test set, which sets it apart from other time-series libraries. Dynamic model testing can be disabled to speed up model evaluation. Differencing to achieve stationarity is built into the library, and metrics can be compared across the time series' original level or its first or second difference. The library was written to make it easy to apply many forecasts to the same series and compare them fairly.

import pandas as pd
import pandas_datareader as pdr
from scalecast import GridGenerator
from scalecast.Forecaster import Forecaster

models = ('mlr','knn','svr','xgboost','elasticnet','mlp','prophet')
df = pdr.get_data_fred('HOUSTNSA',start='2009-01-01',end='2021-06-01')
GridGenerator.get_example_grids()
f = Forecaster(y=df.HOUSTNSA,current_dates=df.index) # to initialize, specify y and current_dates (must be arrays of the same length)
f.set_test_length(12) # specify a test length for your models - do this before eda
f.generate_future_dates(24) # this will create future dates that are on the same interval as the current dates and it will also set the forecast length
f.add_ar_terms(4) # add AR terms before differencing
f.add_AR_terms((2,12)) # seasonal AR terms
f.integrate() # automatically decides if the y term and all ar terms should be differenced to make the series stationary
f.add_seasonal_regressors('month',raw=False,sincos=True) # uses pandas attributes: raw=True creates integers (default), sincos=True creates wave functions
f.add_seasonal_regressors('year')
f.add_covid19_regressor() # dates are flexible, default is from when disney world closed to when US CDC lifted mask recommendations
f.add_time_trend()
f.set_validation_length(6) # length, different than test_length, to tune the hyperparameters 
f.tune_test_forecast(models)
f.plot(order_by='LevelTestSetMAPE',level=True) # plots the forecast
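
The example above adds AR terms and relies on dynamic forecasting, so it may help to see the idea in isolation. The following is a minimal standard-library sketch, not scalecast's implementation: it fits an AR(1) model by least squares and contrasts a dynamic forecast (each prediction feeds the next lag) with a non-dynamic test (each lag is an observed value). The function names and the toy series are assumptions made for illustration.

```python
from statistics import mean

def fit_ar1(y):
    """Fit y[t] = a + b * y[t-1] by ordinary least squares."""
    x, target = y[:-1], y[1:]
    x_bar, t_bar = mean(x), mean(target)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (ti - t_bar) for xi, ti in zip(x, target))
    b = sxy / sxx
    a = t_bar - b * x_bar
    return a, b

def dynamic_forecast(a, b, last_value, steps):
    """Multi-step forecast: each prediction becomes the next step's lag."""
    preds, lag = [], last_value
    for _ in range(steps):
        lag = a + b * lag
        preds.append(lag)
    return preds

def one_step_forecast(a, b, last_train_value, test):
    """Non-dynamic test: every lag is an observed value, including test-set actuals."""
    lags = [last_train_value] + list(test[:-1])
    return [a + b * lag for lag in lags]

# Toy data (hypothetical), split into a training set and a 3-step test set
series = [10.0, 12.0, 11.0, 13.0, 12.5, 14.0, 13.5, 15.0, 14.5, 16.0]
train, test = series[:7], series[7:]

a, b = fit_ar1(train)
dyn = dynamic_forecast(a, b, train[-1], len(test))
static = one_step_forecast(a, b, train[-1], test)

print("dynamic:    ", dyn)
print("non-dynamic:", static)
```

Both forecasts agree on the first step, because the only lag available is the last training value. From the second step on, the non-dynamic version reads test-set actuals, which is the kind of leakage dynamic testing avoids.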

Why switch to Scalecast?

  • Much simpler to set up than a tensorflow neural network
  • Extends scikit-learn regression modeling concepts to be useful for time-series forecasting
    • propagates lagged y terms dynamically
    • differences and undifferences series with ease to model stationary series only
  • Allows comparison of many different modeling concepts, including ARIMA, MLR, MLP, and Prophet so you never have to be in doubt about which model is right for your series
  • Your results and accuracy metrics can always be level, even if you need to difference the series to model it effectively
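
To make the last point concrete, here is a small standard-library sketch of modeling on first differences and then reporting accuracy on the original level. It is an illustration of the general technique, not scalecast's code; the helper names, the naive "mean difference" model, and the numbers are all assumptions.

```python
from itertools import accumulate

def difference(y):
    """First difference: y[t] - y[t-1]."""
    return [b - a for a, b in zip(y, y[1:])]

def undifference(diffs, anchor):
    """Rebuild levels from differences, starting from the last known level."""
    return list(accumulate(diffs, initial=anchor))[1:]

def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Toy level series (hypothetical), with the last two points held out
levels = [100.0, 104.0, 103.0, 108.0, 112.0, 111.0]
train, test = levels[:4], levels[4:]

# Model the differenced series; here the "model" is just the mean difference
diffs = difference(train)
mean_diff = sum(diffs) / len(diffs)
diff_forecast = [mean_diff] * len(test)

# Revert the forecast to the original level before scoring it
level_forecast = undifference(diff_forecast, anchor=train[-1])

print("level forecast:", level_forecast)
print("level MAPE:", mape(test, level_forecast))
```

Scoring after undifferencing keeps metrics comparable between models that were fit on the level and models that were fit on a differenced series.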

Installation

  1. pip install scalecast
    • installs the base package and most dependencies
  2. pip install fbprophet
    • only necessary if you plan to forecast with Facebook prophet models
    • to resolve a common installation issue, see this Stack Overflow post
  3. pip install greykite
    • only necessary if you plan to forecast with LinkedIn Silverkite
  4. If using notebook functions:
    • pip install tqdm
    • pip install ipython
    • pip install ipywidgets
    • jupyter nbextension enable --py widgetsnbextension
    • if using Jupyter Lab: jupyter labextension install @jupyter-widgets/jupyterlab-manager
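
After installing, a quick way to see which of the optional packages listed above are importable is a check like the one below. This is a convenience sketch using only the standard library; the grouping dictionary and function name are assumptions for illustration, and the import names (for example `IPython` for the `ipython` package) are the ones these packages are normally imported under.

```python
from importlib.util import find_spec

# Optional features and the top-level modules they need (grouping is illustrative)
OPTIONAL_PACKAGES = {
    "prophet models": ("fbprophet",),
    "Silverkite models": ("greykite",),
    "notebook functions": ("tqdm", "IPython", "ipywidgets"),
}

def check_optional_dependencies(packages=OPTIONAL_PACKAGES):
    """Return, for each feature, the list of modules that cannot be found."""
    report = {}
    for feature, modules in packages.items():
        report[feature] = [m for m in modules if find_spec(m) is None]
    return report

for feature, missing in check_optional_dependencies().items():
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{feature}: {status}")
```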

Documentation

  • 📋 Examples: Get straight to the process
  • Towards Data Science Series: Read the 3-part series
  • 📓 Binder Notebook: Play with an example in your browser
  • 🛠️ Change Log: See what's changed
  • 📚 Documentation Markdown Files: Review all high-level concepts in the library

Contribute

The following contributions are needed (contact [email protected])

  1. Documentation moved to a proper website with better organization
  2. Confidence intervals for all models (need to be consistently derived and able to toggle on/off or at different levels: 95%, 80%, etc.)
  3. Error/issue reporting
Comments
  • ModuleNotFoundError: No module named 'src.scalecast'

    Failing to install on intel macOS Monterey 12.6. Python 3.10.7 pip 22.2.2

    Tried multiple versions (latest, 0.14.7, 0.13.11); all produce the same result:

    Collecting scalecast
      Using cached SCALECAST-0.15.1.tar.gz (403 kB)
      Preparing metadata (setup.py) ... error
      error: subprocess-exited-with-error

      × python setup.py egg_info did not run successfully.
      │ exit code: 1
      ╰─> [6 lines of output]
          Traceback (most recent call last):
            File "", line 2, in
            File "", line 34, in
            File "/private/var/folders/0f/m_0sfrcn7c56zl0026b1k62c0000gq/T/pip-install-k0tc2z79/scalecast_d8f7a8ae59984787adfbedc8a557540a/setup.py", line 4, in
              from src.scalecast.init import version as version
          ModuleNotFoundError: No module named 'src.scalecast'
          [end of output]

    bug 
    opened by callmegar 7
  • Is it possible to pickle only trained model and call it back for test and/or forecast without any training process in another .py folder?

    I am trying to pickle a trained xgboost model and call it in another .py folder to only do forecast without any training process. Is it possible to do that with auto_forecast() function, if not how can I do that? (code is below)

    """### XGBoost"""

    xgboost_grid = {
        'max_depth': [15],
        'tree_method': ['gpu_hist'],  # wrapped in a list so every grid key maps to a list of candidates
    }
    for i, f in enumerate(forecasts):
        print('Forecast', i)
        f.set_estimator('xgboost')
        f.ingest_grid(xgboost_grid)
        f.tune()
        f.auto_forecast()

    question 
    opened by batuhansahincanel 5
  • Possible Bug: f.forecast gives Index error

    Problem

    This looks to be a very promising module. The theta example given in the tutorial runs without error, but when I tried to implement theta forecasting with this module on my own data, I got an IndexError during validation.

    How to fix the IndexError?

    I created a brand new virtual environment (venv) with Python 3.9 and installed darts and scalecast.

    Reproducible Example

    import numpy as np
    import pandas as pd
    from scalecast.Forecaster import Forecaster
    
    col_date = 'BillingDate'
    col_val = 'TotWAC'
    
    # data
    url = "https://github.com/bhishanpdl/Shared/blob/master/data/data_scalecast/df_train.csv"
    dfs = pd.read_html(url)
    
    df_train = dfs[0].iloc[:,1:]
    df_train[col_date] = pd.to_datetime(df_train[col_date])
    
    y = df_train[col_val].to_list()
    current_dates = df_train[col_date].to_list()
    
    f = Forecaster(y=y,current_dates=current_dates)
    
    f.set_test_length(.2)
    f.generate_future_dates(90)
    f.set_validation_metric('mape')
    
    from darts.utils.utils import SeasonalityMode, TrendMode, ModelMode
    
    theta_grid = {
        'theta':[0.5,1,1.5,2,2.5,3],
        'model_mode':[
            ModelMode.ADDITIVE,
            ModelMode.MULTIPLICATIVE
        ],
        'season_mode':[
            SeasonalityMode.MULTIPLICATIVE,
            SeasonalityMode.ADDITIVE
        ],
        'trend_mode':[
            TrendMode.EXPONENTIAL,
            TrendMode.LINEAR
        ],
    }
    
    f.set_estimator('theta')
    f.ingest_grid(theta_grid)
    f.cross_validate(k=3)
    f.auto_forecast()
    

    Error

    ---------------------------------------------------------------------------
    IndexError                                Traceback (most recent call last)
    Input In [9], in <cell line: 44>()
         42 f.set_estimator('theta')
         43 f.ingest_grid(theta_grid)
    ---> 44 f.cross_validate(k=3)
         45 f.auto_forecast()
    
    File \venv\py39darts\lib\site-packages\scalecast\Forecaster.py:3422, in Forecaster.cross_validate(self, k, rolling, dynamic_tuning)
       3420 self.grid = grid_evaluated.iloc[:, :-3]
       3421 self.dynamic_tuning = f2.dynamic_tuning
    -> 3422 self._find_best_params(grid_evaluated)
       3423 self.grid_evaluated = grid_evaluated_cv.reset_index(drop=True)
       3424 self.grid = orig_grid
    
    File \venv\py39darts\lib\site-packages\scalecast\Forecaster.py:3434, in Forecaster._find_best_params(self, grid_evaluated)
       3429     best_params_idx = self.grid.loc[
       3430         grid_evaluated["metric_value"]
       3431         == grid_evaluated["metric_value"].max()
       3432     ].index.to_list()[0]
       3433 else:
    -> 3434     best_params_idx = self.grid.loc[
       3435         grid_evaluated["metric_value"]
       3436         == grid_evaluated["metric_value"].min()
       3437     ].index.to_list()[0]
       3438 self.best_params = {
       3439     k: v[best_params_idx]
       3440     for k, v in self.grid.to_dict(orient="series").items()
       3441 }
       3442 self.best_params = {
       3443     k: (
       3444         v
       (...)
       3452     for k, v in self.best_params.items()
       3453 }
    
    IndexError: list index out of range
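
One way the `.index.to_list()[0]` pattern in the traceback can raise this error is when no row equals the minimum metric value, for instance if every evaluated score is NaN (NaN never compares equal to itself, and the minimum of an all-NaN pandas column is also NaN). Whether that is what happened here is not confirmed by the report. The sketch below is a pure-Python analogue of that mechanism, with hypothetical helper names, plus a defensive variant.

```python
import math

def best_index(metric_values):
    """Mimics the traceback's pattern: take the first row equal to the minimum."""
    best = min(metric_values)
    return [i for i, v in enumerate(metric_values) if v == best][0]

def best_index_safe(metric_values):
    """Ignore non-finite scores; return None when nothing usable remains."""
    finite = [(v, i) for i, v in enumerate(metric_values) if math.isfinite(v)]
    return min(finite)[1] if finite else None

# Normal case: the lowest score wins
print(best_index([0.12, 0.08, 0.30]))

# All scores are NaN: the equality filter matches nothing, so [0] fails
try:
    best_index([math.nan, math.nan])
except IndexError as err:
    print("IndexError:", err)

# The defensive variant reports "no valid candidate" instead of crashing
print(best_index_safe([math.nan, math.nan]))
print(best_index_safe([math.nan, 0.2, 0.1]))
```

If this is the cause, inspecting the evaluated grid for missing or infinite metric values (MAPE, for example, is undefined when an actual value is zero) would be a reasonable first check.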
    

    System Info

    $ pip freeze
    absl-py==1.2.0
    aiohttp==3.8.1
    aiosignal==1.2.0
    argon2-cffi==21.3.0
    argon2-cffi-bindings==21.2.0
    asttokens==2.0.8
    astunparse==1.6.3
    async-timeout==4.0.2
    attrs==22.1.0
    autopep8==1.7.0
    backcall==0.2.0
    beautifulsoup4==4.11.1
    bleach==5.0.1
    cachetools==5.2.0
    catboost==1.0.6
    certifi==2022.6.15
    cffi==1.15.1
    charset-normalizer==2.1.1
    cmdstanpy==1.0.5
    colorama==0.4.5
    convertdate==2.4.0
    cycler==0.11.0
    Cython==0.29.32
    darts==0.21.0
    debugpy==1.6.3
    decorator==5.1.1
    defusedxml==0.7.1
    eli5==0.13.0
    entrypoints==0.4
    ephem==4.1.3
    et-xmlfile==1.1.0
    executing==0.10.0
    fastjsonschema==2.16.1
    flatbuffers==1.12
    fonttools==4.36.0
    frozenlist==1.3.1
    fsspec==2022.7.1
    gast==0.4.0
    google-auth==2.11.0
    google-auth-oauthlib==0.4.6
    google-pasta==0.2.0
    graphviz==0.20.1
    greenlet==1.1.2
    grpcio==1.47.0
    h5py==3.7.0
    hijri-converter==2.2.4
    holidays==0.15
    html5lib==1.1
    idna==3.3
    importlib-metadata==4.12.0
    ipykernel==6.15.1
    ipython==8.4.0
    ipython-genutils==0.2.0
    ipywidgets==8.0.1
    jedi==0.18.1
    Jinja2==3.1.2
    joblib==1.1.0
    jsonschema==4.14.0
    jupyter==1.0.0
    jupyter-client==7.3.4
    jupyter-console==6.4.4
    jupyter-contrib-core==0.4.0
    jupyter-contrib-nbextensions==0.5.1
    jupyter-core==4.11.1
    jupyter-highlight-selected-word==0.2.0
    jupyter-latex-envs==1.4.6
    jupyter-nbextensions-configurator==0.5.0
    jupyterlab-pygments==0.2.2
    jupyterlab-widgets==3.0.2
    keras==2.9.0
    Keras-Preprocessing==1.1.2
    kiwisolver==1.4.4
    korean-lunar-calendar==0.2.1
    libclang==14.0.6
    lightgbm==3.3.2
    llvmlite==0.39.0
    LunarCalendar==0.0.9
    lxml==4.9.1
    Markdown==3.4.1
    MarkupSafe==2.1.1
    matplotlib==3.5.3
    matplotlib-inline==0.1.6
    mistune==2.0.4
    multidict==6.0.2
    nbclient==0.6.6
    nbconvert==7.0.0
    nbformat==5.4.0
    nest-asyncio==1.5.5
    nfoursid==1.0.1
    notebook==6.4.12
    numba==0.56.0
    numpy==1.22.4
    oauthlib==3.2.0
    openpyxl==3.0.10
    opt-einsum==3.3.0
    packaging==21.3
    pandas==1.4.3
    pandas-datareader==0.10.0
    pandocfilters==1.5.0
    parso==0.8.3
    patsy==0.5.2
    pickleshare==0.7.5
    Pillow==9.2.0
    plotly==5.10.0
    pmdarima==2.0.0
    prometheus-client==0.14.1
    prompt-toolkit==3.0.30
    prophet==1.1
    protobuf==3.19.4
    psutil==5.9.1
    pure-eval==0.2.2
    pyasn1==0.4.8
    pyasn1-modules==0.2.8
    pycodestyle==2.9.1
    pycparser==2.21
    pyDeprecate==0.3.2
    Pygments==2.13.0
    PyMeeus==0.5.11
    pyodbc==4.0.34
    pyparsing==3.0.9
    pyrsistent==0.18.1
    python-dateutil==2.8.2
    pytorch-lightning==1.7.2
    pytz==2022.2.1
    pywin32==304
    pywinpty==2.0.7
    PyYAML==6.0
    pyzmq==23.2.1
    qtconsole==5.3.1
    QtPy==2.2.0
    requests==2.28.1
    requests-oauthlib==1.3.1
    rsa==4.9
    SCALECAST==0.13.11
    scikit-learn==1.1.2
    scipy==1.9.0
    seaborn==0.11.2
    Send2Trash==1.8.0
    setuptools-git==1.2
    six==1.16.0
    soupsieve==2.3.2.post1
    SQLAlchemy==1.4.40
    stack-data==0.4.0
    statsforecast==0.6.0
    statsmodels==0.13.2
    tabulate==0.8.10
    tbats==1.1.0
    tenacity==8.0.1
    tensorboard==2.9.1
    tensorboard-data-server==0.6.1
    tensorboard-plugin-wit==1.8.1
    tensorflow==2.9.1
    tensorflow-estimator==2.9.0
    tensorflow-io-gcs-filesystem==0.26.0
    termcolor==1.1.0
    terminado==0.15.0
    threadpoolctl==3.1.0
    tinycss2==1.1.1
    toml==0.10.2
    torch==1.12.1
    torchmetrics==0.9.3
    tornado==6.2
    tqdm==4.64.0
    traitlets==5.3.0
    typing_extensions==4.3.0
    ujson==5.4.0
    urllib3==1.26.12
    watermark==2.3.1
    wcwidth==0.2.5
    webencodings==0.5.1
    Werkzeug==2.2.2
    widgetsnbextension==4.0.2
    wrapt==1.14.1
    xarray==2022.6.0
    xgboost==1.6.2
    yarl==1.8.1
    zipp==3.8.1
    
    
    question 
    opened by bhishanpdl 4
  • Generating future values does not show in plot

    Hello,

    I am using scalecast for a timeseries project consisting of predicting future asset prices. I am currently deploying my model in production but I cannot generate any future values beyond my testing dataset. At the beginning of my script, I have written "f.generate_future_dates(20)" but the f.plot() method does not return the predictions for the next 20 units.

    Would you please guide me for how to generate and plot these next 20 units?

    Thank you.

    Martin

    question 
    opened by MartinMashalov 4
  • Update export() methods to include CIs more easily + at level=True

    This may already be a feature, but I think some of the export methods could be retooled.

    Specifically, I would like to pull the CIs for test set predictions and forecasts at the level=True.

    According to the docs, export_forecasts_with_cis and export_test_set_preds_with_cis don't currently accept a level argument.

    Additionally, it would be nice to be able to call these dfs in the .export() method. This way, I can more easily call the functions for a list of models.

    enhancement 
    opened by jroy12345 4
  • VECM

    Hi Michael,

    VECM raises a frequency error when the data has gaps. Would it be possible to fix this for every type of frequency, so that VECM handles gaps in the data the way other scalecast models already do?

    Best regards,

    Michelle

    bug 
    opened by michellebaugraczyk 3
  • No option to save LSTM model

    I trained an LSTM model and couldn't figure out a way to save my model in any format. I've been through all of the docs provided but there's no way to get the job done. Need help.

    enhancement 
    opened by KnightLock 3
  • use_boxcox parameter in holt winters

    with this grid:

    hwes = {
        'trend':['add','mul'],
        'seasonal':['add','mul'],
        'damped_trend':[True,False],
        'initialization_method':[None,'estimated','heuristic'],
        'use_boxcox':[True,False],
        'seasonal_periods':[168],
    }

    I get this result which I do not understand as it is True or False:

    File ~/.local/share/virtualenvs/scalecast-0HQw5DtN/lib/python3.10/site-packages/statsmodels/tsa/holtwinters/model.py:337, in ExponentialSmoothing._boxcox(self)
        335 y = boxcox(self._y, self._use_boxcox)
        336 else:
    --> 337 raise TypeError("use_boxcox must be True, False or a float.")
        338 return y

    TypeError: use_boxcox must be True, False or a float.

    I think the issue may be here:

    File ~/.local/share/virtualenvs/scalecast-0HQw5DtN/lib/python3.10/site-packages/statsmodels/tsa/holtwinters/model.py:291, in ExponentialSmoothing.__init__(self, endog, trend, damped_trend, seasonal, seasonal_periods, initialization_method, initial_level, initial_trend, initial_seasonal, use_boxcox, bounds, dates, freq, missing)
        289 self._use_boxcox = use_boxcox
        290 self._lambda = np.nan
    --> 291 self._y = self._boxcox()
        292 self._initialize()
        293 self._fixed_parameters = {}

    bug 
    opened by callmegar 2
  • Issue when running auto_forecast() or tune_test_forecast() with rf

    Here is the error. Typically I can rerun the codeblock in my notebook and after 1-2 tries it will fix itself.

    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    ~\AppData\Local\Temp\ipykernel_3112\911625238.py in <module>
          3     j.ingest_grid(model)
          4     j.cross_validate(dynamic_tuning=26)
    ----> 5     j.auto_forecast()
          6     j.save_summary_stats()
          7     print(model)
    
    ~\AppData\Roaming\Python\Python37\site-packages\scalecast\Forecaster.py in auto_forecast(self, call_me, dynamic_testing, test_only)
    
    ~\AppData\Roaming\Python\Python37\site-packages\scalecast\Forecaster.py in manual_forecast(self, call_me, dynamic_testing, test_only, **kwargs)
    
    ~\AppData\Roaming\Python\Python37\site-packages\scalecast\Forecaster.py in _forecast_sklearn(self, fcster, dynamic_testing, tune, Xvars, normalizer, test_only, **kwargs)
    
    ~\AppData\Roaming\Python\Python37\site-packages\scalecast\Forecaster.py in evaluate_model(scaler, regr, X, y, Xvars, fcst_horizon, future_xreg, dynamic_testing, true_forecast)
    
    ~\Anaconda3\envs\time\lib\site-packages\sklearn\ensemble\_forest.py in fit(self, X, y, sample_weight)
        465                     n_samples_bootstrap=n_samples_bootstrap,
        466                 )
    --> 467                 for i, t in enumerate(trees)
        468             )
        469 
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\parallel.py in __call__(self, iterable)
       1041             # remaining jobs.
       1042             self._iterating = False
    -> 1043             if self.dispatch_one_batch(iterator):
       1044                 self._iterating = self._original_iterator is not None
       1045 
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\parallel.py in dispatch_one_batch(self, iterator)
        859                 return False
        860             else:
    --> 861                 self._dispatch(tasks)
        862                 return True
        863 
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\parallel.py in _dispatch(self, batch)
        777         with self._lock:
        778             job_idx = len(self._jobs)
    --> 779             job = self._backend.apply_async(batch, callback=cb)
        780             # A job can complete so quickly than its callback is
        781             # called before we get here, causing self._jobs to
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\_parallel_backends.py in apply_async(self, func, callback)
        206     def apply_async(self, func, callback=None):
        207         """Schedule a func to be run"""
    --> 208         result = ImmediateResult(func)
        209         if callback:
        210             callback(result)
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\_parallel_backends.py in __init__(self, batch)
        570         # Don't delay the application, to avoid keeping the input
        571         # arguments in memory
    --> 572         self.results = batch()
        573 
        574     def get(self):
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\parallel.py in __call__(self)
        261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
        262             return [func(*args, **kwargs)
    --> 263                     for func, args, kwargs in self.items]
        264 
        265     def __reduce__(self):
    
    ~\Anaconda3\envs\time\lib\site-packages\joblib\parallel.py in <listcomp>(.0)
        261         with parallel_backend(self._backend, n_jobs=self._n_jobs):
        262             return [func(*args, **kwargs)
    --> 263                     for func, args, kwargs in self.items]
        264 
        265     def __reduce__(self):
    
    ~\Anaconda3\envs\time\lib\site-packages\sklearn\utils\fixes.py in __call__(self, *args, **kwargs)
        214     def __call__(self, *args, **kwargs):
        215         with config_context(**self.config):
    --> 216             return self.function(*args, **kwargs)
        217 
        218 
    
    ~\Anaconda3\envs\time\lib\site-packages\sklearn\ensemble\_forest.py in _parallel_build_trees(tree, forest, X, y, sample_weight, tree_idx, n_trees, verbose, class_weight, n_samples_bootstrap)
        171 
        172         indices = _generate_sample_indices(
    --> 173             tree.random_state, n_samples, n_samples_bootstrap
        174         )
        175         sample_counts = np.bincount(indices, minlength=n_samples)
    
    ~\Anaconda3\envs\time\lib\site-packages\sklearn\ensemble\_forest.py in _generate_sample_indices(random_state, n_samples, n_samples_bootstrap)
        127 
        128     random_instance = check_random_state(random_state)
    --> 129     sample_indices = random_instance.randint(0, n_samples, n_samples_bootstrap)
        130 
        131     return sample_indices
    
    mtrand.pyx in numpy.random.mtrand.RandomState.randint()
    
    _bounded_integers.pyx in numpy.random._bounded_integers._rand_int32()
    
    TypeError: 'numpy.float64' object cannot be interpreted as an integer
    
    bug 
    opened by jroy12345 2
  • None doesn't work when passed to the 'Xvars' key in a grid and using cross validation

    When using the cross_validate() function, it throws an error when using None as an argument:

    arima_grid = {
        'order':[
            (1,0,1),
            (1,0,0),
            (0,0,1)
        ],
        'seasonal_order':[
            (1,0,1,7),
            (1,0,0,7),
            (0,0,1,7),
            (0,1,0,7)
        ],
        'Xvars':[
            None,
            [
                'monthsin',
                'monthcos',
                'quartersin',
                'quartercos',
                'dayofyearsin',
                'dayofyearcos',
                'weeksin',
                'weekcos',
            ]
        ],
    }
    
    f.ingest_grid(arima_grid)
    f.cross_validate(k=3) 
    

    bug good first issue 
    opened by uger7 2
  • fbprophet is now named prophet

    The prophet forecaster still uses the old 'fbprophet' dependency, which has since been renamed 'prophet'. Installing the old library causes issues if the new version/name is already installed.

    opened by callmegar 1
Releases(0.1.4)
Owner
Michael Keith
Data Scientist for Utah Department of Health