Predictive AI layer for existing databases.

Overview

MindsDB

MindsDB is an open-source AI layer for existing databases that allows you to effortlessly develop, train and deploy state-of-the-art machine learning models using SQL queries.

Try it out

Contributing

To contribute to MindsDB, please check out our Contribution guide.

Report Issues

Please help us by reporting any issues you may have while using MindsDB.

License

Comments
  • [Bug]: Exception "Lightgbm mixer not supported for type: tags" when following process plant tutorial

    Is there an existing issue for this?

    • [x] I have searched the existing issues

    Current Behaviour

    I'm following the Manufacturing process quality tutorial and during training I get the error Exception: Lightgbm mixer not supported for type: tags, raised at: /opt/conda/lib/python3.7/site-packages/mindsdb/interfaces/model/learn_process.py#177.

    Expected Behaviour

    I'd expect AutoML to train the model successfully

    Steps To Reproduce

    Follow the tutorial steps until training.
    

    Anything else?

    The tutorials seem to be of mixed quality, some resulting in errors, some in low model performance or "null" predictions (for the bus ride tutorial).

    bug documentation 
    opened by philfuchs 22
  • install issues on windows 10

    Describe the bug When installing mindsdb, the following error message is output with the following command.

    command: pip install --requirement reqs.txt

    error message:
    ERROR: Could not find a version that satisfies the requirement torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25)) (from versions: 0.1.2, 0.1.2.post1, 0.1.2.post2)
    ERROR: No matching distribution found for torch>=1.0.1.post2 (from lightwood==0.6.4->-r reqs.txt (line 25))

    Desktop (please complete the following information):

    • OS: windows 10

    Additional context I think PyTorch for Windows is currently not available through PyPI, so installing it with the commands from https://pytorch.org is a better option.

    bug 
    opened by YottaGin 22
  • Integration merlion issue2377

    1. Modify sql_query.py so that ml_handlers can define custom return columns and dtypes;
    2. [issue2377] Integrate Merlion. Forecasters: default, sarima, prophet, mses. Detectors: default, isolation forest, windstats, prophet;
    3. Add the corresponding test cases in /mindsdb/tests/unit/test_merlion_handler.
    opened by rfsiten 21
  • [Bug]:  metadata-generation-failed

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I keep on trying to pip install mindsdb. Screenshot 2022-05-10 11 13 31

    This is the message that it comes up with.

    Things tried:

    • Installing wheel
    • Upgrading pip
    • Installing sktime
    • googling/StackOverflow

    Expected Behavior

    I anticipated that I would be able to download the package.

    Steps To Reproduce

    pip install mindsdb
    

    Anything else?

    No response

    bug 
    opened by TyanBr 20
  • [Bug]: Not able to preview data due to internal server error HTTP 500

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    I was following the MindsDB getting-started tutorial when I encountered an HTTP 500 error. Is it caused by a missing URL authorization scope, or by something else? I tried googling why this happens but couldn't find a proper explanation. I can't proceed with the remaining steps because I keep getting the HTTP 500 error. Is this a common issue? I keep getting it on the website when running the tutorial query, and I can't preview my tables. It says that the query is not valid, although I ran the exact code from the tutorial. If the issue is on my side, please help me resolve it. I really like the MindsDB concept and I'm an AI/ML enthusiast :)

    Here is a screenshot of what I get when running the queries:

    Screenshot 2022-07-14 012410

    It reports a query error, but I am running the code from the tutorial with the syntax intact. Do any changes need to be made to the syntax?

    Expected Behavior

    I should be able to preview the home rentals data that I will train on in the later steps.

    Steps To Reproduce

    Go to the MindsDB website and start the home rentals demo tutorial. Run the demo code given in step 1. Try to preview the data.
    

    Anything else?

    nothing

    bug 
    opened by prathikshetty2002 19
  • Could not load module ModelInterface

    Hello, can someone help me solve this:

    • ERROR:mindsdb-logger-f7442ec0-574d-11ea-a55e-106530eaf271:c:\users\drpbengrir\appdata\local\programs\python\python37\lib\site-packages\mindsdb\libs\controllers\transaction.py:188 - Could not load module ModelInterface
    bug 
    opened by simofilahi 18
  • FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'

    Describe the bug A FileNotFoundError occurs when running the predict.py script described below.

    The full traceback is the following:

    Traceback (most recent call last):
      File "predict.py", line 12, in <module>
        result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/predictor.py", line 472, in predict
        transaction = Transaction(session=self, light_transaction_metadata=light_transaction_metadata, heavy_transaction_metadata=heavy_transaction_metadata, breakpoint=breakpoint)
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 53, in __init__
        self.run()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 259, in run
        self._execute_predict()
      File "/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb/libs/controllers/transaction.py", line 157, in _execute_predict
        with open(CONFIG.MINDSDB_STORAGE_PATH + '/' + self.lmd['name'] + '_light_model_metadata.pickle', 'rb') as fp:
    FileNotFoundError: [Errno 2] No such file or directory: '/home/milia/.venvs/mindsdb/lib/python3.6/site-packages/mindsdb_storage/1_0_5/suicide_rates_light_model_metadata.pickle'
    

    To Reproduce Steps to reproduce the behavior:

    1. Create a train.py script using the dataset: https://www.kaggle.com/russellyates88/suicide-rates-overview-1985-to-2016#master.csv. The train.py script is the one below:
    from mindsdb import Predictor
    
    Predictor(name='suicide_rates').learn(
        to_predict='suicides_no', # the column we want to learn to predict given all the data in the file
        from_data="master.csv" # the path to the file where we can learn from, (note: can be url)
    )
    
    2. Run the train.py script.
    3. Create and run the predict.py script:
    from mindsdb import Predictor
    
    # use the model to make predictions
    result = Predictor(name='suicide_rates').predict(when={'country':'Greece','year':1981,'sex':'male','age':'35-54','population':300000})
    
    # you can now print the results
    print(result)
    
    4. See error

    Expected behavior What was expected was to see the results.

    Desktop (please complete the following information):

    • OS: Ubuntu 18.04.2 LTS
    • mindsdb 1.0.5
    • python 3.6.7
    bug 
    opened by mlliarm 18
  • MySQL / Singlestore DB SSL support

    Problem I cannot connect my Singlestore DB (MySQL driver) because MindsDB doesn't support SSL options.

    Describe the solution you'd like Full support for MySQL SSL (key, cert, ca).

    Describe alternatives you've considered No alternative is possible at the moment, security first.

    enhancement question 
    opened by pierre-b 17
  • Caching historical data for streams

    I'll be using an example here and generalize whenever needed, let's say we have the following dataset we train a timeseries predictor on:

    time,gb,target,aux
    1,A,7,foo
    2,A,10,foo
    3,A,12,bar
    4,A,14,bar
    2,B,5,foo
    4,B,9,foo
    

    In this case target is what we are predicting, gb is the column we are grouping on and we are ordering by time. aux is an unrelated column that's not timeseries in nature and just used "normally".

    We train a predictor with a window of n

    Then let's say we have an input stream that looks something like this:

    time,gb,target,aux
    6,A,33,foo
    7,A,54,foo
    

    Caching

    First, for each value of the column gb, we will need to store n records.

    So, for example, if n == 1 we would save only the last row of the stream above, and if n == 2 we would save both rows. When new rows come in, we un-cache the older rows.
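
The per-group cache described above could be sketched roughly as follows. This is only an illustration, not MindsDB code: the class name `GroupWindowCache` and its methods are hypothetical, and rows are assumed to arrive as dictionaries already ordered by time within each group.

```python
from collections import defaultdict, deque


class GroupWindowCache:
    """Hypothetical helper: keep the last `window` rows per group-by value."""

    def __init__(self, window, group_col="gb"):
        self.group_col = group_col
        # deque(maxlen=window) silently drops the oldest row once it is full,
        # which matches the "un-cache the older rows" behaviour.
        self._rows = defaultdict(lambda: deque(maxlen=window))

    def add(self, row):
        """Cache a new row under its group value."""
        self._rows[row[self.group_col]].append(row)

    def history(self, group):
        """Return the cached rows for one group, oldest first."""
        return list(self._rows.get(group, ()))


cache = GroupWindowCache(window=2)
stream = [
    {"time": 6, "gb": "A", "target": 33, "aux": "foo"},
    {"time": 7, "gb": "A", "target": 54, "aux": "foo"},
]
for row in stream:
    cache.add(row)
```

With a window of 2, `cache.history("A")` holds both stream rows; adding a third row for group A would evict the row with time 6.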

    Inferring

    Second, when a new datapoint comes into the input stream we'll need to "infer" that the prediction we have to make is actually for the "next" datapoint. Which is to say that when: 7, A, 54, foo comes in we need to infer that we need to actually make predictions for:

    8, A, <this is what we are predicting>, foo

    The challenge here is how to infer that the next timestamp is 8. One simple way is to take the difference between the latest record and the previous one and add it to the latest timestamp, but that's an issue for the first observation (since we don't have a previous record to subtract from, unless we cache part of the training data). Alternatively, we could add a feature to native, to either:

    a) Provide a delta argument for each group by (representing by how much we increment the order column[s])
    b) Have an argument when doing timeseries prediction that tells it to predict for the "next" row and then do the inferences under the cover.
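
A minimal sketch of the inference step, covering both the "subtract from the previous record" approach and an explicit delta as in option a). The function name and parameters are hypothetical and do not correspond to any existing native API; the order column is assumed to be numeric.

```python
def infer_next_timestamp(history, order_col="time", delta=None):
    """Guess the order-column value of the next row to predict.

    history: cached rows for one group, oldest first.
    delta:   optional fixed increment (option a); if omitted, the gap
             between the last two cached rows is reused.
    """
    if not history:
        raise ValueError("no cached rows for this group")
    last = history[-1][order_col]
    if delta is not None:
        return last + delta
    if len(history) < 2:
        # First observation: there is no previous record to subtract from.
        raise ValueError("need a delta or at least two cached rows")
    return last + (last - history[-2][order_col])


rows = [{"time": 6}, {"time": 7}]
next_time = infer_next_timestamp(rows)
```

For the stream rows with times 6 and 7 this yields 8. With a single cached row the function only works when a delta is supplied, which is exactly the first-observation problem noted above.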

    @paxcema let me know which of these features would be easy to implement in native, since you're now the resident timeseries expert.

    enhancement 
    opened by George3d6 16
  • Data extraction with mindsdb v.2

    This is a followup to #334, which was about the same use case and dataset, but different version of MindsDB with different errors.

    Your Environment

    Google Colab.

    • Python version: 3.6
    • Pip version: 19.3.1
    • Mindsdb version you tried to install: 2.13.8

    Describe the bug Running .learn() fails.

    [nltk_data]   Package stopwords is already up-to-date!
    
    /usr/local/lib/python3.6/dist-packages/lightwood/mixers/helpers/ranger.py:86: UserWarning: This overload of addcmul_ is deprecated:
    	addcmul_(Number value, Tensor tensor1, Tensor tensor2)
    Consider using one of the following signatures instead:
    	addcmul_(Tensor tensor1, Tensor tensor2, *, Number value) (Triggered internally at  /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
      exp_avg_sq.mul_(beta2).addcmul_(1 - beta2, grad, grad)
    
    Downloading: 100%
    232k/232k [00:00<00:00, 1.31MB/s]
    
    
    Downloading: 100%
    442/442 [00:06<00:00, 68.0B/s]
    
    
    Downloading: 100%
    268M/268M [00:06<00:00, 43.1MB/s]
    
    
    Token indices sequence length is longer than the specified maximum sequence length for this model (606 > 512). Running this sequence through the model will result in indexing errors
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:173 - Could not load module ModelInterface
    
    ERROR:mindsdb-logger-ac470732-3303-11eb-bbe9-0242ac1c0002---eb4b7352-566f-4a1b-aef2-c286163e1a10:/usr/local/lib/python3.6/dist-packages/mindsdb_native/libs/controllers/transaction.py:239 - index out of range in self
    
    ---------------------------------------------------------------------------
    
    IndexError                                Traceback (most recent call last)
    
    <ipython-input-13-a5e3bd095e46> in <module>()
          7 mdb.learn(
          8     from_data=train,
    ----> 9     to_predict='birth_year' # the column we want to learn to predict given all the data in the file
         10 )
    
    20 frames
    
    /usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in embedding(input, weight, padding_idx, max_norm, norm_type, scale_grad_by_freq, sparse)
       1812         # remove once script supports set_grad_enabled
       1813         _no_grad_embedding_renorm_(weight, input, max_norm, norm_type)
    -> 1814     return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
       1815 
       1816 
    
    IndexError: index out of range in self
    

    https://colab.research.google.com/drive/1a6WRSoGK927m3eMkVdlwBhW6-3BtwaAO?usp=sharing

    To Reproduce

    1. Rerun this notebook on Google Colab https://github.com/opendataby/vybary2019/blob/e08c32ac51e181ddce166f8a4fbf968f81bd2339/canal03-parsing-with-mindsdb.ipynb
    bug 
    opened by abitrolly 15
  • Upload new Predictor

    Your Environment Scout The Scout predictor upload does not work when uploading a zip file.

    The error is "Just .zip files are allowed" even though a zip file was selected.

    Question on prediction: is there any way to upload a TensorFlow model, or do we need to convert the TensorFlow model into a MindsDB prediction model?

    bug question 
    opened by Winthan 15
  • [Bug]: TS dates not interpreted correctly

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Current Behavior

    Predictions from a timeseries model (using >LATEST) give dates starting Dec 1st, even though the training data contains later Dec dates (e.g. the 6th).

    Expected Behavior

    Predictions using ">LATEST" should start 1 day after the latest date in the training data.
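
To make the expectation concrete, the dates a forecast should carry can be computed from the latest training timestamp. The sketch below is an illustration only: the training dates are made up (the attached CSV is not reproduced here), the function name is hypothetical, and a daily step is assumed as stated above.

```python
from datetime import datetime, timedelta


def expected_forecast_dates(training_dates, horizon, step=timedelta(days=1)):
    """Return the timestamps a '> LATEST' forecast is expected to cover."""
    latest = max(training_dates)
    return [latest + step * i for i in range(1, horizon + 1)]


# Illustrative training dates only, ending on Dec 6th.
training = [datetime(2022, 12, 4), datetime(2022, 12, 5), datetime(2022, 12, 6)]
forecast_dates = expected_forecast_dates(training, horizon=4)
```

With a horizon of 4, as in the reproduction query, the expected forecast dates here would be Dec 7th through Dec 10th rather than dates starting at Dec 1st.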

    Steps To Reproduce

    CREATE MODEL mindsdb.callsdata_time
    FROM files
    (SELECT * from CallData)
    PREDICT CallsOfferedTo5
    ORDER BY DateTime
    GROUP BY PrecisionQueue
    WINDOW 8    
    HORIZON 4;
    
    
    SELECT m.DateTime as date,
      m.CallsOfferedTo5 as forecast
      FROM mindsdb.callsdata_time as m
      JOIN files.CallData as t
      WHERE t.DateTime > LATEST
      AND t.PrecisionQueue = 5000;
    
    
    
    Anything else?

    Training data: export (3) (1).csv (https://github.com/mindsdb/mindsdb/files/10335508/export.3.1.csv)

    Example of dates in Dec: image (https://user-images.githubusercontent.com/34073127/210332080-d2471ece-5c3a-4a3c-942c-ea1e41a2d4cd.png)

    Model output: image (https://user-images.githubusercontent.com/34073127/210332293-418f2a5a-c6d7-4bba-98a7-1effb8e1bca2.png)
    
    bug 
    opened by tomhuds 1
  • [ENH] Override dtypes - LW handler

    Description

    Enables support for overriding automatically inferred data types when using the Lightwood handler.

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [x] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 This change requires a documentation update

    What is the solution?

    If the user provides:

    CREATE ...
    USING 
    dtype_dict = {
       'col1': 'type',
        ...
    };
    

    The handler will extract this mapping and use it to build a modified problem definition, which will trigger the default encoder belonging to each new dtype (note: manually specifying encoders is out of scope for this PR and will be added at a later date).
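
The override step can be pictured as a dictionary merge in which user-supplied types win over inferred ones. The sketch below is not the handler's actual code: the function name, the shape of the `USING` arguments, and the dtype strings are assumptions made for illustration.

```python
def apply_dtype_overrides(inferred, using_args):
    """Merge user overrides from `dtype_dict` into the inferred dtypes.

    inferred:   column name -> automatically inferred dtype
    using_args: parsed USING clause, possibly containing 'dtype_dict'
    """
    overrides = using_args.get("dtype_dict", {})
    unknown = set(overrides) - set(inferred)
    if unknown:
        # Assumption: overriding a column that does not exist is an error.
        raise KeyError(f"unknown columns in dtype_dict: {sorted(unknown)}")
    # Later keys win, so overrides replace the inferred types.
    return {**inferred, **overrides}


inferred = {"col1": "integer", "col2": "categorical"}
final_dtypes = apply_dtype_overrides(inferred, {"dtype_dict": {"col1": "float"}})
```

Here `col1` ends up as `float` while `col2` keeps its inferred type; each resulting dtype would then select its default encoder, as described above.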

    Checklist:

    • [x] My code follows the style guidelines(PEP 8) of MindsDB.
    • [x] I have commented my code, particularly in hard-to-understand areas.
    • [ ] I have updated the documentation, or created issues to update them.
    • [x] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [x] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    opened by paxcema 0
  • updated all links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4263
    Fixes copied modified content of databases.mdx into its copy databases2.mdx
    Fixes removed https://docs.mindsdb.com from links

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 Documentation update

    What is the solution?

    (Describe at a high level how the feature was implemented) As in Description.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • updated links

    Description

    Please include a summary of the change and the issue it solves.

    Fixes #4261

    Type of change

    (Please delete options that are not relevant)

    • [ ] 🐛 Bug fix (non-breaking change which fixes an issue)
    • [ ] ⚡ New feature (non-breaking change which adds functionality)
    • [ ] 📢 Breaking change (fix or feature that would cause existing functionality to not work as expected)
    • [x] 📄 Documentation update

    What is the solution?

    Updated file names in links wherever necessary.

    Checklist:

    • [ ] My code follows the style guidelines(PEP 8) of MindsDB.
    • [ ] I have commented my code, particularly in hard-to-understand areas.
    • [x] I have updated the documentation.
    • [ ] I fixed|updated|added unit tests and integration tests for each feature (if applicable).
    • [ ] I have checked that my code additions will fail neither code linting checks nor unit test.
    • [ ] I have shared a short loom video or screenshots demonstrating any new functionality.
    documentation 
    opened by martyna-mindsdb 0
  • Predictions for ranges/sets of values

    Is there an existing issue for this?

    • [X] I have searched the existing issues

    Is your feature request related to a problem? Please describe.

    I'd like to retrieve a prediction filtering over a range of values rather than only individual exact values. Example:

    SELECT rental_price, rental_price_explain
    FROM mindsdb.home_rentals_model
    WHERE sqft BETWEEN 1000 AND 1200
    AND neighborhood IN ('berkeley_hills', 'westbrae', 'downtown');

    This currently generates an error: Only 'and' and '=' operations allowed in WHERE clause, found: BetweenOperation(op='between', args=( Identifier(parts=['sqft']), Constant(value=1000), Constant(value=1200) )

    Describe the solution you'd like.

    I'd like to have the ability to use filter criteria such as "in", "between", or "or" logic to retrieve a single prediction.
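
Until such operators are supported, one possible client-side workaround is to expand the range or set into individual equality conditions and query each combination separately. The sketch below is hypothetical and is not the requested feature: it samples the `between` range at a chosen step, returns several WHERE clauses instead of a single prediction, and uses naive quoting that is not safe for untrusted input.

```python
from itertools import product


def expand_filters(filters):
    """Turn column -> candidate values into WHERE clauses using only '=' and 'AND'."""
    columns = list(filters)
    clauses = []
    for combo in product(*(filters[col] for col in columns)):
        # repr() gives 1000 for ints and 'westbrae' (single-quoted) for simple strings.
        parts = [f"{col} = {value!r}" for col, value in zip(columns, combo)]
        clauses.append(" AND ".join(parts))
    return clauses


clauses = expand_filters({
    "sqft": range(1000, 1201, 100),  # assumption: sample the range every 100 sqft
    "neighborhood": ["berkeley_hills", "westbrae", "downtown"],
})
```

Each clause can be appended to the `SELECT ... FROM mindsdb.home_rentals_model WHERE` prefix; three sqft samples and three neighborhoods give nine queries.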

    Describe an alternate solution.

    No response

    Anything else? (Additional Context)

    https://mindsdbcommunity.slack.com/archives/C01S2T35H18/p1672320517307839

    opened by rbkrejci 0
Releases(v22.12.4.3)