PyPOTS - A Python Toolbox for Data Mining on Partially-Observed Time Series

Overview

Welcome to PyPOTS


⦿ Motivation: Missing values are common in real-world time series for many reasons, such as sensor failures, communication errors, and unexpected malfunctions. This makes partially-observed time series (POTS) a pervasive problem in open-world modeling and an obstacle to advanced data analysis. Although the problem is important, data mining on POTS still lacks a dedicated toolkit. PyPOTS was created to fill this gap.

⦿ Mission: PyPOTS aims to be a handy toolbox that makes data mining on POTS easy rather than tedious, so that engineers and researchers can focus on the core problems at hand rather than on how to deal with the missing parts of their data. PyPOTS will keep integrating both classical and the latest state-of-the-art data mining algorithms for partially-observed multivariate time series. Besides the algorithms themselves, PyPOTS will offer unified APIs, detailed documentation, and interactive examples that serve as tutorials across algorithms.

To make various open-source time-series datasets readily available to our users, PyPOTS is supported by the TSDB (Time-Series DataBase) project, a toolbox that makes loading time-series datasets easy.

Visit TSDB to learn more about this handy tool 🛠!

❖ Installation

Install the latest release from PyPI:

pip install pypots

Install with the latest code on GitHub:

pip install https://github.com/WenjieDu/PyPOTS/archive/main.zip

❖ Available Algorithms

| Task | Type | Algorithm | Year | Reference |
|---|---|---|---|---|
| Imputation | Neural Network | SAITS: Self-Attention-based Imputation for Time Series | 2022 | 1 |
| Imputation | Neural Network | Transformer | 2017 | 2, 1 |
| Imputation, Classification | Neural Network | BRITS (Bidirectional Recurrent Imputation for Time Series) | 2018 | 3 |
| Imputation | Naive | LOCF (Last Observation Carried Forward) | - | - |
| Classification | Neural Network | GRU-D | 2018 | 4 |
| Classification | Neural Network | Raindrop | 2022 | 5 |
| Clustering | Neural Network | CRLI (Clustering Representation Learning on Incomplete time-series data) | 2021 | 6 |
| Clustering | Neural Network | VaDER (Variational Deep Embedding with Recurrence) | 2019 | 7 |
| Forecasting | Probabilistic | BTTF (Bayesian Temporal Tensor Factorization) | 2021 | 8 |
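
To make the naive baseline in the table concrete, here is a minimal sketch of LOCF (Last Observation Carried Forward) in plain Python. It is not the PyPOTS implementation: the function name `locf_impute`, the NaN encoding of missing values, and the `fill_value` fallback for values missing before any observation are all assumptions of this sketch.

```python
import math


def locf_impute(series, fill_value=0.0):
    """Impute one sample shaped [n_steps][n_features] with LOCF.

    Missing entries are encoded as NaN. Entries that are missing before
    any value has been observed fall back to `fill_value` (an assumption
    of this sketch, not something the README specifies).
    """
    n_features = len(series[0]) if series else 0
    last_seen = [fill_value] * n_features  # most recent observation per feature
    imputed = []
    for step in series:
        row = []
        for j, value in enumerate(step):
            if not math.isnan(value):
                last_seen[j] = value  # a real observation updates the carry
            row.append(last_seen[j])
        imputed.append(row)
    return imputed


nan = math.nan
sample = [[1.0, nan], [nan, 2.0], [nan, nan], [4.0, nan]]
imputed = locf_impute(sample)
print(imputed)
```

Each feature is handled independently, so a gap in one feature never borrows values from another.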

‼️ PyPOTS is currently under development. If you like it and look forward to its growth, please give PyPOTS a star and watch the repository: this keeps you posted on its progress and lets me know that its development is meaningful. If you have any feedback, want to contribute ideas or suggestions, or want to share time-series related algorithms or papers, please join the PyPOTS community or drop me an email.

Thank you all for your attention! 😃

Footnotes

  1. Du, W., Cote, D., & Liu, Y. (2022). SAITS: Self-Attention-based Imputation for Time Series. ArXiv, abs/2202.08516.

  2. Vaswani, A., Shazeer, N.M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., & Polosukhin, I. (2017). Attention is All you Need. NeurIPS 2017.

  3. Cao, W., Wang, D., Li, J., Zhou, H., Li, L., & Li, Y. (2018). BRITS: Bidirectional Recurrent Imputation for Time Series. NeurIPS 2018.

  4. Che, Z., Purushotham, S., Cho, K., Sontag, D.A., & Liu, Y. (2018). Recurrent Neural Networks for Multivariate Time Series with Missing Values. Scientific Reports, 8.

  5. Zhang, X., Zeman, M., Tsiligkaridis, T., & Zitnik, M. (2022). Graph-Guided Network for Irregularly Sampled Multivariate Time Series. ICLR 2022.

  6. Ma, Q., Chen, C., Li, S., & Cottrell, G. W. (2021). Learning Representations for Incomplete Time Series Clustering. AAAI 2021.

  7. Jong, J.D., Emon, M.A., Wu, P., Karki, R., Sood, M., Godard, P., Ahmad, A., Vrooman, H.A., Hofmann-Apitius, M., & Fröhlich, H. (2019). Deep learning for clustering of multivariate clinical patient trajectories with missing values. GigaScience, 8.

  8. Sun, L., & Chen, X. (2021). Bayesian Temporal Factorization for Multidimensional Time Series Prediction. IEEE Transactions on Pattern Analysis and Machine Intelligence, PP.

Comments
  • [Feature request] Is it possible to "warm-up" the transformer?

    Thank you for creating this wonderful resource! This is an amazing and useful tool!

    Regarding SAITS, is it possible to pass a learning rate scheduler, rather than a fixed learning rate, for the transformer to pre-train?

    I ask this because I compared the outputs of training 100 epochs vs 1000 epochs. The loss continues to decrease, but the error on holdout timepoints does not change between 100 vs 1000 epochs. Strangely, the prediction (after 100 & 1000 epochs) is less accurate than linear interpolation...! I wondered if it is because the transformers have too many parameters, and it needs some help learning initially.

    opened by b2jia 9
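
For readers unfamiliar with the "warm-up" idea raised in this request, the sketch below shows the learning-rate schedule described in "Attention is All you Need" (reference 2 above): the rate grows linearly for `warmup_steps` steps, then decays with the inverse square root of the step. Whether PyPOTS accepts a scheduler is not stated here; the function name `noam_lr` and the default values are illustrative assumptions.

```python
def noam_lr(step, d_model=256, warmup_steps=4000):
    """Learning rate at a 1-based training step.

    lr = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5)
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return d_model ** -0.5 * min(step ** -0.5, step * warmup_steps ** -1.5)


# The rate rises during warm-up, peaks at `warmup_steps`, then decays.
for step in (1, 2000, 4000, 8000):
    print(step, noam_lr(step))
```

In PyTorch, a function like this could be passed to `torch.optim.lr_scheduler.LambdaLR` with the optimizer's base learning rate set to 1.0, since `LambdaLR` multiplies the base rate by the returned factor.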
  • can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


    PS C:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main> & C:/Users/Lyc/AppData/Local/Programs/Python/Python39/python.exe c:/Users/Lyc/Downloads/PyPOTS-main/PyPOTS-main/pypots/tests/test_imputation.py
    Running test cases for BRITS...
    Model initialized successfully. Number of the trainable parameters: 580976
    epoch 0: training loss 1.2366, validating loss 0.4201
    epoch 1: training loss 0.8974, validating loss 0.3540
    epoch 2: training loss 0.7426, validating loss 0.2919
    epoch 3: training loss 0.6147, validating loss 0.2414
    epoch 4: training loss 0.5411, validating loss 0.2157
    ERunning test cases for BRITS...
    Model initialized successfully. Number of the trainable parameters: 580976
    epoch 0: training loss 1.2054, validating loss 0.4022
    epoch 1: training loss 0.8631, validating loss 0.3399
    epoch 2: training loss 0.7204, validating loss 0.2863
    epoch 3: training loss 0.5995, validating loss 0.2399
    epoch 4: training loss 0.5325, validating loss 0.2123
    ERunning test cases for LOCF...
    LOCF test_MAE: 0.17510570872656786
    .Running test cases for LOCF...
    .Running test cases for SAITS...
    Model initialized successfully. Number of the trainable parameters: 1332704
    epoch 0: training loss 0.9181, validating loss 0.2936
    epoch 1: training loss 0.6287, validating loss 0.2303
    epoch 2: training loss 0.5345, validating loss 0.2086
    epoch 3: training loss 0.4735, validating loss 0.1895
    epoch 4: training loss 0.4224, validating loss 0.1744
    ERunning test cases for SAITS...
    Model initialized successfully. Number of the trainable parameters: 1332704
    epoch 0: training loss 0.7823, validating loss 0.2779
    epoch 1: training loss 0.5015, validating loss 0.2250
    epoch 2: training loss 0.4418, validating loss 0.2097
    epoch 3: training loss 0.4119, validating loss 0.1994
    epoch 4: training loss 0.3866, validating loss 0.1815
    ERunning test cases for Transformer...
    Model initialized successfully. Number of the trainable parameters: 666122
    epoch 0: training loss 0.7715, validating loss 0.2843
    epoch 1: training loss 0.4861, validating loss 0.2271
    epoch 2: training loss 0.4176, validating loss 0.2077
    epoch 3: training loss 0.3822, validating loss 0.2005
    epoch 4: training loss 0.3592, validating loss 0.1961
    ERunning test cases for Transformer...
    Model initialized successfully. Number of the trainable parameters: 666122
    epoch 0: training loss 0.8033, validating loss 0.2910
    epoch 1: training loss 0.4856, validating loss 0.2345
    epoch 2: training loss 0.4282, validating loss 0.2157
    epoch 3: training loss 0.3882, validating loss 0.2051
    epoch 4: training loss 0.3599, validating loss 0.1942
    E

    ERROR: test_impute (__main__.TestBRITS)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 125, in setUp
        self.brits.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\brits.py", line 504, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    ======================================================================
    ERROR: test_parameters (__main__.TestBRITS)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 125, in setUp
        self.brits.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\brits.py", line 504, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    ======================================================================
    ERROR: test_impute (__main__.TestSAITS)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 45, in setUp
        self.saits.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\saits.py", line 170, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    ======================================================================
    ERROR: test_parameters (__main__.TestSAITS)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 45, in setUp
        self.saits.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\saits.py", line 170, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    ======================================================================
    ERROR: test_impute (__main__.TestTransformer)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 89, in setUp
        self.transformer.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\transformer.py", line 256, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.

    ======================================================================
    ERROR: test_parameters (__main__.TestTransformer)

    Traceback (most recent call last):
      File "c:\Users\Lyc\Downloads\PyPOTS-main\PyPOTS-main\pypots\tests\test_imputation.py", line 89, in setUp
        self.transformer.fit(self.train_X, self.val_X)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\transformer.py", line 256, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\pypots\imputation\base.py", line 142, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "C:\Users\Lyc\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.


    Ran 8 tests in 176.311s

    FAILED (errors=6)

    opened by BasinChen 8
  • BRITS imputation test fails on cuda device mismatch


    Hi, the imputation tests fail when I run them at commit 6dcc8942459094e3a3fc5e11363f5d712ee8e742 on the dev branch.

    py3.9_cuda11.3_cudnn8.2.0_0

    $ python -m pytest tests/test_imputation.py
    
    ./tests/test_imputation.py::TestBRITS::test_parameters Failed with Error: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
      File ".../unittest/case.py", line 59, in testPartExecutor
        yield
      File ".../unittest/case.py", line 588, in run
        self._callSetUp()
      File ".../unittest/case.py", line 547, in _callSetUp
        self.setUp()
      File ".../PyPOTS/pypots/tests/test_imputation.py", line 98, in setUp
        self.brits.fit(self.train_X, self.val_X)
      File "/PyPOTS/pypots/imputation/brits.py", line 504, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "/PyPOTS/pypots/imputation/base.py", line 154, in _train_model
        if np.equal(self.best_loss, float("inf")):
      File ".../lib/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
    
    opened by MaciejSkrabski 4
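
The traceback above fails at `np.equal(self.best_loss, float("inf"))` because NumPy tries to convert a CUDA tensor into an array. One common way around this kind of error is to keep bookkeeping values such as the best loss as plain Python floats. The sketch below illustrates that pattern without requiring PyTorch; the helper `as_python_float` and the `StandInLoss` class are hypothetical names, and this is not necessarily how PyPOTS resolved the issue.

```python
import math


def as_python_float(value):
    """Return a plain float.

    Objects exposing `.item()` (as single-element torch tensors do, on
    CPU or GPU) are unwrapped through it; anything else goes to float().
    """
    item = getattr(value, "item", None)
    return float(item()) if callable(item) else float(value)


class StandInLoss:
    """Hypothetical stand-in for a single-element tensor, for demonstration only."""

    def __init__(self, value):
        self._value = value

    def item(self):
        return self._value


best_loss = as_python_float(float("inf"))
epoch_loss = as_python_float(StandInLoss(0.25))

# Comparisons now work on host-side floats, independent of any device.
if math.isinf(best_loss) or epoch_loss < best_loss:
    best_loss = epoch_loss
print(best_loss)
```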
  • GPU enabled model raises Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0


    Hello, great library, but running it on a GPU-enabled machine results in errors.

    pypots version = 0.0.6 (the one available in PyPI)

    code to replicate problem:

    import unittest
    from pypots.tests.test_imputation import TestBRITS, TestLOCF, TestSAITS, TestTransformer
    from pypots import __version__
    
    
    if __name__ == "__main__":
        print(__version__)
        unittest.main()
    

    results:

    0.0.6
    Running test cases for BRITS...
    Model initialized successfully. Number of the trainable parameters: 580976
    ERunning test cases for BRITS...
    Model initialized successfully. Number of the trainable parameters: 580976
    ERunning test cases for LOCF...
    LOCF test_MAE: 0.1712224306027283
    .Running test cases for LOCF...
    .Running test cases for SAITS...
    Model initialized successfully. Number of the trainable parameters: 1332704
    Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
    ERunning test cases for SAITS...
    Model initialized successfully. Number of the trainable parameters: 1332704
    Exception: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
    ERunning test cases for Transformer...
    Model initialized successfully. Number of the trainable parameters: 666122
    epoch 0: training loss 0.7681, validating loss 0.2941
    epoch 1: training loss 0.4731, validating loss 0.2395
    epoch 2: training loss 0.4235, validating loss 0.2069
    epoch 3: training loss 0.3781, validating loss 0.1914
    epoch 4: training loss 0.3530, validating loss 0.1837
    ERunning test cases for Transformer...
    Model initialized successfully. Number of the trainable parameters: 666122
    epoch 0: training loss 0.7826, validating loss 0.2820
    epoch 1: training loss 0.4687, validating loss 0.2352
    epoch 2: training loss 0.4188, validating loss 0.2132
    epoch 3: training loss 0.3857, validating loss 0.1977
    epoch 4: training loss 0.3604, validating loss 0.1945
    E
    ======================================================================
    ERROR: test_impute (pypots.tests.test_imputation.TestBRITS)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 99, in setUp
        self.brits.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/brits.py", line 494, in fit
        training_set = DatasetForBRITS(train_X)  # time_gaps is necessary for BRITS
      File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 62, in __init__
        forward_delta = parse_delta(forward_missing_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 36, in parse_delta
        delta.append(torch.ones(1, n_features) + (1 - m_mask[step]) * delta[-1])
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
    
    ======================================================================
    ERROR: test_parameters (pypots.tests.test_imputation.TestBRITS)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 99, in setUp
        self.brits.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/brits.py", line 494, in fit
        training_set = DatasetForBRITS(train_X)  # time_gaps is necessary for BRITS
      File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 62, in __init__
        forward_delta = parse_delta(forward_missing_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/data/dataset_for_brits.py", line 36, in parse_delta
        delta.append(torch.ones(1, n_features) + (1 - m_mask[step]) * delta[-1])
    RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
    
    ======================================================================
    ERROR: test_impute (pypots.tests.test_imputation.TestSAITS)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 83, in _train_model
        results = self.model.forward(inputs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 95, in forward
        imputed_data, [X_tilde_1, X_tilde_2, X_tilde_3] = self.impute(inputs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 62, in impute
        enc_output, _ = encoder_layer(enc_output)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 122, in forward
        enc_output, attn_weights = self.slf_attn(enc_input, enc_input, enc_input, attn_mask=mask_time)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 72, in forward
        v, attn_weights = self.attention(q, k, v, attn_mask)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 32, in forward
        attn = attn.masked_fill(attn_mask == 1, -1e9)
    RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 35, in setUp
        self.saits.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 171, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 123, in _train_model
        raise RuntimeError('Training got interrupted. Model was not get trained. Please try fit() again.')
    RuntimeError: Training got interrupted. Model was not get trained. Please try fit() again.
    
    ======================================================================
    ERROR: test_parameters (pypots.tests.test_imputation.TestSAITS)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 83, in _train_model
        results = self.model.forward(inputs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 95, in forward
        imputed_data, [X_tilde_1, X_tilde_2, X_tilde_3] = self.impute(inputs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 62, in impute
        enc_output, _ = encoder_layer(enc_output)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 122, in forward
        enc_output, attn_weights = self.slf_attn(enc_input, enc_input, enc_input, attn_mask=mask_time)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 72, in forward
        v, attn_weights = self.attention(q, k, v, attn_mask)
      File "mydirs(...)/python3.9/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
        return forward_call(*input, **kwargs)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 32, in forward
        attn = attn.masked_fill(attn_mask == 1, -1e9)
    RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 35, in setUp
        self.saits.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/saits.py", line 171, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 123, in _train_model
        raise RuntimeError('Training got interrupted. Model was not get trained. Please try fit() again.')
    RuntimeError: Training got interrupted. Model was not get trained. Please try fit() again.
    
    ======================================================================
    ERROR: test_impute (pypots.tests.test_imputation.TestTransformer)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 68, in setUp
        self.transformer.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 257, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 129, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "mydirs(...)/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
    
    ======================================================================
    ERROR: test_parameters (pypots.tests.test_imputation.TestTransformer)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "mydirs(...)/python3.9/site-packages/pypots/tests/test_imputation.py", line 68, in setUp
        self.transformer.fit(self.train_X, self.val_X)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/transformer.py", line 257, in fit
        self._train_model(training_loader, val_loader, val_X_intact, val_X_indicating_mask)
      File "mydirs(...)/python3.9/site-packages/pypots/imputation/base.py", line 129, in _train_model
        if np.equal(self.best_loss, float('inf')):
      File "mydirs(...)/python3.9/site-packages/torch/_tensor.py", line 732, in __array__
        return self.numpy()
    TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
    
    ----------------------------------------------------------------------
    Ran 8 tests in 20.239s
    
    FAILED (errors=6)
    

    I suspect that you call .to(device) too early on the data. You might also need to set the device parameter explicitly when creating new tensors (e.g. in torch.ones in parse_delta).

    Best regards!

    opened by MaciejSkrabski 4
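
The BRITS traceback in this report points at `torch.ones(1, n_features)` inside `parse_delta`, which creates a CPU tensor even when the mask lives on the GPU. A minimal sketch of the suggested remedy follows: create new tensors on the same device as the input. Only the recurrence line comes from the traceback; the function name `parse_delta_sketch`, the zero-initialised first step, and the 1 = observed mask convention are assumptions of this sketch.

```python
import torch


def parse_delta_sketch(missing_mask):
    """Compute time gaps for one sample.

    missing_mask: tensor of shape [n_steps, n_features],
                  assumed 1 = observed and 0 = missing.
    """
    n_steps, n_features = missing_mask.shape
    device = missing_mask.device  # reuse the device of the input tensor
    delta = [torch.zeros(1, n_features, device=device)]  # assumed start value
    for step in range(1, n_steps):
        delta.append(
            torch.ones(1, n_features, device=device)
            + (1 - missing_mask[step]) * delta[-1]
        )
    return torch.cat(delta, dim=0)


mask = torch.tensor([[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
delta = parse_delta_sketch(mask)
print(delta)
```

Because every new tensor takes its device from `missing_mask`, the same function runs unchanged on CPU and on CUDA inputs.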
  • Early stop


    Wenjie,

    I tried PyPOTS with the Beijing Air Quality dataset. For dataset preparation, I followed gene_UCI_BeijingAirQuality_dataset. The following is my PyPOTS setup.

    saits_base = SAITS(seq_len=seq_len, n_features=132, 
                       n_layers=2,  # num of group-inner layers
                       d_model=256, # model hidden dim
                       d_inner=128, # hidden size of feed forward layer
                       n_head=4, # head num of self-attention
                       d_k=64, d_v=64, # key dim, value dim
                       dropout=0, 
                       epochs=200,
                       patience=30,
                       batch_size=32,
                       weight_decay=1e-5,
                       ORT_weight=1,
                       MIT_weight=1,
                      )
    
    saits_base.fit(train_set_X)
    

    PyPOTS stops earlier than the epochs specified (stops around epoch 80), without triggering either print('Exceeded the training patience. Terminating the training procedure...') or print('Finished all training epochs.').

    epoch 0: training loss 0.9637 
    epoch 1: training loss 0.6161 
    epoch 2: training loss 0.5177 
    epoch 3: training loss 0.4783 
    epoch 4: training loss 0.4489 
    ...
    epoch 73: training loss 0.2462 
    epoch 74: training loss 0.2460 
    epoch 75: training loss 0.2480 
    epoch 76: training loss 0.2452 
    epoch 77: training loss 0.2452 
    epoch 78: training loss 0.2458 
    epoch 79: training loss 0.2449 
    epoch 80: training loss 0.2423 
    epoch 81: training loss 0.2425 
    epoch 82: training loss 0.2443 
    epoch 83: training loss 0.2403 
    epoch 84: training loss 0.2406
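
For context on the `patience` setting used above, here is a minimal sketch of how patience-based early stopping usually works: training ends once the monitored loss has failed to improve for `patience` consecutive epochs. This is a generic illustration over a pre-recorded list of losses, not the PyPOTS training loop; the function name and return format are assumptions.

```python
def run_with_patience(losses, patience):
    """Replay per-epoch losses and report where training would stop.

    Returns (last_epoch, best_loss, reason).
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0  # improvement resets the counter
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            return epoch, best_loss, "patience exceeded"
    return len(losses) - 1, best_loss, "finished all epochs"


history = [0.9, 0.6, 0.5, 0.52, 0.51, 0.53, 0.4]
print(run_with_patience(history, patience=3))
print(run_with_patience(history, patience=5))
```

Under this scheme a run can legitimately end before the configured number of epochs, but a stop that matches neither branch suggests the process was interrupted for another reason.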
    
    

    Then I evaluate the model performance (not knowing why the model stops early) on test_set as

    test_set_mae = cal_mae(test_set_imputation, test_set_X_intact, test_set_indicating_mask)
    0.21866121846582318
    

    I have a few questions:

    1. What could be the cause for the early stop?
    2. In addition, is there any object in saits_base that stores the loss history?
    3. Does the function cal_mae calculate the same MAE as in your paper? If so, for this Beijing air quality case, should I be able to tune the hyperparameters to bring test_set_mae down to around 0.146?

    Thank you, Haochen

    opened by Rdfing 2
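
Regarding the third question, imputation MAE is commonly computed only over the positions flagged by an indicating mask, that is, the values that were artificially held out. The sketch below shows that definition in plain Python. It is an assumption about what a function like `cal_mae` computes, not a copy of the PyPOTS implementation, and `masked_mae` is a hypothetical name.

```python
def masked_mae(predictions, targets, masks):
    """Mean absolute error over positions where the mask is 1.

    All arguments are nested lists shaped [n_steps][n_features].
    """
    abs_error_sum = 0.0
    n_evaluated = 0.0
    for p_row, t_row, m_row in zip(predictions, targets, masks):
        for p, t, m in zip(p_row, t_row, m_row):
            abs_error_sum += abs(p - t) * m  # masked-out positions add 0
            n_evaluated += m
    if n_evaluated == 0:
        raise ValueError("mask selects no positions")
    return abs_error_sum / n_evaluated


predictions = [[1.0, 2.0], [3.0, 4.0]]
targets = [[1.5, 2.0], [0.0, 5.0]]
masks = [[1, 0], [0, 1]]
mae = masked_mae(predictions, targets, masks)
print(mae)
```

Note that a result computed this way is only comparable to a published figure if the data normalisation and the held-out positions match as well.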
  • fix: brits on cuda

    Some tensors created on the fly (mainly in base.py and dataset_for_brits.py) used to ignore the device (CPU or GPU) of the model and data. This caused BRITS to throw errors whenever users ran it on a CUDA-enabled machine.

    opened by MaciejSkrabski 1
  • Update and fix the dependencies in the development env

    As mentioned in #7, I am trying to update and fix the dependencies of the development environment used for test cases. I expect this to help speed up setting up Conda when running tests.

    opened by WenjieDu 1
  • refactor: explicit channels in conda env ymls

    Hi! You may have noticed that creating a new conda environment from a *.yml file takes ages to solve package dependencies. I try to speed the process up by explicitly defining the channel in which to search for each package. I also set the minimum pandas version to 1.4.1, since things were weird before that. I also allow Python versions newer than 3.7.13, which I believe you'll find acceptable.

    Please let me know if this is in any way helpful.

    opened by MaciejSkrabski 9
Releases(v0.0.9)
  • v0.0.9(Dec 20, 2022)

    In this version, we speed up the installation process of PyPOTS. We noticed that torch_geometric and related dependencies take too much time to install. Therefore, they're removed from the list of requirements. They're necessary for the graph model RainDrop. Hence, users who need RainDrop have to install torch_geometric manually after they set up PyPOTS.
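
Since torch_geometric is no longer installed automatically, code that relies on RainDrop may want to check for it up front. The sketch below uses the standard library's `importlib.util.find_spec` for that; the helper name `is_installed` and the message text are illustrative assumptions.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


if not is_installed("torch_geometric"):
    print("torch_geometric not found: install it manually before using RainDrop.")
```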

    What's Changed

    • Merge updates by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/23
    • Update README and add the configurations of docs by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/24
    • Merge dev into main by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/26
    • Merge dev into main by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/28

    Full Changelog: https://github.com/WenjieDu/PyPOTS/compare/v0.0.8...v0.0.9

  • v0.0.8(Sep 13, 2022)

    Fixed bugs with running on CUDA devices.

    What's Changed

    • fix: brits imputation test device mismatch by @MaciejSkrabski in https://github.com/WenjieDu/PyPOTS/pull/11
    • Merge branch 'dev' into main by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/13
    • feat: add workflow Publish-to-PyPI by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/14
    • Merge branch 'dev' into main by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/15
    • Specify Conda channels for the dependencies by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/18
    • fix the bug of tensors on different devices by @WenjieDu in https://github.com/WenjieDu/PyPOTS/pull/22

    New Contributors

    • @MaciejSkrabski made their first contribution in https://github.com/WenjieDu/PyPOTS/pull/11
    • @WenjieDu made their first contribution in https://github.com/WenjieDu/PyPOTS/pull/13

    Full Changelog: https://github.com/WenjieDu/PyPOTS/compare/v0.0.7...v0.0.8

  • v0.0.7(Jul 12, 2022)

Owner
Wenjie Du
"Do one thing, and do it well."
Simple, light-weight config handling through python data classes with to/from JSON serialization/deserialization.

Simple but maybe too simple config management through python data classes. We use it for machine learning.

Eren Gölge 67 Nov 29, 2022
A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al.

pyUpSet A pure-python implementation of the UpSet suite of visualisation methods by Lex, Gehlenborg et al. Contents Purpose How to install How it work

288 Jan 04, 2023
Tools that help Data Scientists and ML engineers train and deploy ML models.

Domino Research This repo contains projects under active development by the Domino R&D team. We build tools that help Data Scientists and ML engineers

Domino Data Lab 73 Oct 17, 2022
Finding a method to objectively quantify skill versus chance in games, using reinforcement learning

Skill-vs-chance-games-analysis - Finding a method to objectively quantify skill versus chance in games, using reinforcement learning

Marcus Chiam 4 Nov 19, 2022
Greykite: A flexible, intuitive and fast forecasting library

The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.

LinkedIn 1.7k Jan 04, 2023
A Pythonic framework for threat modeling

pytm: A Pythonic framework for threat modeling Introduction Traditional threat modeling too often comes late to the party, or sometimes not at all. In

Izar Tarandach 644 Dec 20, 2022
A Python implementation of GRAIL, a generic framework to learn compact time series representations.

GRAIL A Python implementation of GRAIL, a generic framework to learn compact time series representations. Requirements Python 3.6+ numpy scipy tslearn

3 Nov 24, 2021
As we all know the BGMI Loot Crate comes with so many resources for the gamers, this ML Crate will be the hub of various ML projects which will be the resources for the ML enthusiasts! Open Source Program: SWOC 2021 and JWOC 2022.

Machine Learning Loot Crate 💻 🧰 🔴 Welcome contributors! As we all know the BGMI Loot Crate comes with so many resources for the gamers, this ML Cra

Abhishek Sharma 89 Dec 28, 2022
High performance, easy-to-use, and scalable machine learning (ML) package, including linear model (LR), factorization machines (FM), and field-aware factorization machines (FFM) for Python and CLI interface.

What is xLearn? xLearn is a high performance, easy-to-use, and scalable machine learning package that contains linear model (LR), factorization machin

Chao Ma 3k Jan 08, 2023
MBTR is a python package for multivariate boosted tree regressors trained in parameter space.

MBTR is a python package for multivariate boosted tree regressors trained in parameter space.

SUPSI-DACD-ISAAC 61 Dec 19, 2022
Predict the output which should give a fair idea about the chances of admission for a student for a particular university

Predict the output which should give a fair idea about the chances of admission for a student for a particular university.

ArvindSandhu 1 Jan 11, 2022
Code base of KU AIRS: SPARK Autonomous Vehicle Team

KU AIRS: SPARK Autonomous Vehicle Project Check this link for the blog post describing this project and the video of SPARK in simulation and on parkou

Mehmet Enes Erciyes 1 Nov 23, 2021
Mortality risk prediction for COVID-19 patients using XGBoost models

Mortality risk prediction for COVID-19 patients using XGBoost models Using demographic and lab test data received from the HM Hospitales in Spain, I b

1 Jan 19, 2022
#30DaysOfStreamlit is a 30-day social challenge for you to build and deploy Streamlit apps.

30 Days Of Streamlit 🎈 This is the official repo of #30DaysOfStreamlit — a 30-day social challenge for you to learn, build and deploy Streamlit apps.

Streamlit 53 Jan 02, 2023
Machine Learning Project Report - Azhar Rizki Zulma

Machine Learning Project Report - Azhar Rizki Zulma Project Overview The project domain chosen for this machine learning project concerns hibu

Azhar Rizki Zulma 6 Mar 12, 2022
NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

NCVX (NonConVeX): A User-Friendly and Scalable Package for Nonconvex Optimization in Machine Learning.

SUN Group @ UMN 28 Aug 03, 2022
GroundSeg Clustering Optimized Kdtree

Ground segmentation and clustering based on KITTI Velodyne data, plus an additional optimized kdtree for kNN and radius-NN search

2 Dec 02, 2021
Pydantic based mock data generation

This library offers powerful mock data generation capabilities for pydantic based models. It can also be used with other libraries that use pydantic as a foundation, for example SQLModel, Beanie and

Na'aman Hirschfeld 396 Dec 28, 2022
Massively parallel self-organizing maps: accelerate training on multicore CPUs, GPUs, and clusters

Somoclu Somoclu is a massively parallel implementation of self-organizing maps. It exploits multicore CPUs, it is able to rely on MPI for distributing

Peter Wittek 239 Nov 10, 2022
Distributed Deep learning with Keras & Spark

Elephas: Distributed Deep Learning with Keras & Spark Elephas is an extension of Keras, which allows you to run distributed deep learning models at sc

Max Pumperla 1.6k Dec 29, 2022