ThunderGBM: Fast GBDTs and Random Forests on GPUs

Last update: Dec 16, 2022

Overview

Documentations | Installation | Parameters | Python (scikit-learn) interface

What's new?

ThunderGBM won 2019 Best Paper Award from IEEE Transactions on Parallel and Distributed Systems by the IEEE Computer Society Publications Board (1 out of 987 submissions, for the work "Zeyi Wen^, Jiashuai Shi*, Bingsheng He, Jian Chen, Kotagiri Ramamohanarao, and Qinbin Li*, Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training , IEEE Transactions on Parallel and Distributed Systems, vol. 30, no. 12, 2019, pp. 2706-2717."). see more details: Best Paper Award Winners from IEEE, News from NUS School of Computing

Overview

The mission of ThunderGBM is to help users easily and efficiently apply GBDTs and Random Forests to solve problems. ThunderGBM exploits GPUs to achieve high efficiency. Key features of ThunderGBM are as follows.

Often by 10x times over other libraries.
Support Python (scikit-learn) interfaces.
Supported Operating System(s): Linux and Windows.
Support classification, regression and ranking.

Why accelerate GBDT and Random Forests: A survey conducted by Kaggle in 2017 shows that 50%, 46% and 24% of the data mining and machine learning practitioners are users of Decision Trees, Random Forests and GBMs, respectively.

GBDTs and Random Forests are often used for creating state-of-the-art data science solutions. We've listed three winning solutions using GBDTs below. Please check out the XGBoost website for more winning solutions and use cases. Here are some example successes of GDBTs and Random Forests:

Halla Yang, 2nd place, Recruit Coupon Purchase Prediction Challenge, Kaggle interview.
Owen Zhang, 1st place, Avito Context Ad Clicks competition, Kaggle interview.
Keiichi Kuroyanagi, 2nd place, Airbnb New User Bookings, Kaggle interview.

Getting Started

Prerequisites

cmake 2.8 or above
- gcc 4.8 or above for Linux | CUDA 9 or above
- Visual C++ for Windows | CUDA 10

Quick Install

For Linux with CUDA 9.0
- pip install thundergbm
For Windows (64bit)
- Download the Python wheel file (for Python3 or above)
  - CUDA 10.0 - Win64
- Install the Python wheel file
  - pip install thundergbm-0.3.4-py3-none-win_amd64.whl
Currently only support python3
After you have installed thundergbm, you can import and use the classifier (similarly for regressor) by:

from thundergbm import TGBMClassifier
clf = TGBMClassifier()
clf.fit(x, y)

Build from source

git clone https://github.com/zeyiwen/thundergbm.git
cd thundergbm
#under the directory of thundergbm
git submodule init cub && git submodule update

Build on Linux (build instructions for Windows)

#under the directory of thundergbm
mkdir build && cd build && cmake .. && make -j

Quick Start

./bin/thundergbm-train ../dataset/machine.conf
./bin/thundergbm-predict ../dataset/machine.conf

You will see RMSE = 0.489562 after successful running.

MacOS is not supported, as Apple has suspended support for some NVIDIA GPUs. We will consider supporting MacOS based on our user community feedbacks. Please stay tuned.

How to cite ThunderGBM

If you use ThunderGBM in your paper, please cite our work (TPDS and JMLR).

@ARTICLE{8727750,
  author={Z. {Wen} and J. {Shi} and B. {He} and J. {Chen} and K. {Ramamohanarao} and Q. {Li}},
  journal={IEEE Transactions on Parallel and Distributed Systems}, 
  title={Exploiting GPUs for Efficient Gradient Boosting Decision Tree Training}, 
  year={2019},
  volume={30},
  number={12},
  pages={2706-2717},
  }

@article{wenthundergbm19,
 author = {Wen, Zeyi and Shi, Jiashuai and He, Bingsheng and Li, Qinbin and Chen, Jian},
 title = {{ThunderGBM}: Fast {GBDTs} and Random Forests on {GPUs}},
 journal = {Journal of Machine Learning Research},
 volume={21},
 year = {2020}
}

Key members of ThunderGBM

Zeyi Wen, NUS (now at The University of Western Australia)
Hanfeng Liu, GDUFS (a visting student at NUS)
Jiashuai Shi, SCUT (a visiting student at NUS)
Qinbin Li, NUS
Advisor: Bingsheng He, NUS
Collaborators: Jian Chen (SCUT)

Other information

This work is supported by a MoE AcRF Tier 2 grant (MOE2017-T2-1-122) and an NUS startup grant in Singapore.

Related libraries

ThunderSVM, which is another Thunder serier software tool developed by our group.
XGBoost | LightGBM | CatBoost | cuML

Comments

build boost library error(win10)

python2.7.15

boost-install.generate-cmake-config- D:\boost\boost_1_70_0\lib\cmake\boost_coroutine-1.70.0\boost_coroutine-config.cmake boost-install.generate-cmake-config-version- D:\boost\boost_1_70_0\lib\cmake\boost_coroutine-1.70.0\boost_coroutine-config-version.cmake boost-install.generate-cmake-variant- D:\boost\boost_1_70_0\lib\cmake\boost_coroutine-1.70.0\libboost_coroutine-variant-vc140-mt-gd-x32-1_70-shared.cmake boost-install.generate-cmake-config- D:\boost\boost_1_70_0\lib\cmake\boost_thread-1.70.0\boost_thread-config.cmake boost-install.generate-cmake-config-version- D:\boost\boost_1_70_0\lib\cmake\boost_thread-1.70.0\boost_thread-config-version.cmake boost-install.generate-cmake-variant- D:\boost\boost_1_70_0\lib\cmake\boost_thread-1.70.0\libboost_thread-variant-vc140-mt-gd-x32-1_70-shared.cmake boost-install.generate-cmake-config- D:\boost\boost_1_70_0\lib\cmake\boost_date_time-1.70.0\boost_date_time-config.cmake boost-install.generate-cmake-config-version- D:\boost\boost_1_70_0\lib\cmake\boost_date_time-1.70.0\boost_date_time-config-version.cmake boost-install.generate-cmake-variant- D:\boost\boost_1_70_0\lib\cmake\boost_date_time-1.70.0\libboost_date_time-variant-vc140-mt-gd-x32-1_70-shared.cmake boost-install.generate-cmake-config- D:\boost\boost_1_70_0\lib\cmake\boost_exception-1.70.0\boost_exception-config.cmake boost-install.generate-cmake-config-version- D:\boost\boost_1_70_0\lib\cmake\boost_exception-1.70.0\boost_exception-config-version.cmake ...failed updating 2949 targets... ...skipped 16 targets... ...updated 12449 targets...

opened by lvpinrui 22
There was an error using thundergbm-predict file

My environment is: Windows 10 Visual Studio 2017 Community CMake3.14.0-rc3 CUDA10.1.105 Using test_dataset.txt in the library is successful,but Using my own .txt is failed. my thundergbm-predict：

When building thundergbm-predict ：

opened by lvpinrui 13

Make error

Hi, I am getting the following error when calling make -j

[  3%] Building CXX object src/thundergbm/CMakeFiles/thundergbm.dir/objective/ranking_obj.cpp.o
/tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp: In member function ‘virtual void LambdaRank::get_gradient(const SyncArray<float>&, const SyncArray<float>&, SyncArray<GHPair>&)’:
/tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp:50:14: error: ‘mt19937’ is not a member of ‘std’
         std::mt19937 gen(std::rand());
              ^~~~~~~
/tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp:50:14: note: suggested alternative:
In file included from /usr/include/c++/7/tr1/random:47:0,
                 from /usr/include/c++/7/parallel/random_number.h:36,
                 from /usr/include/c++/7/parallel/partition.h:38,
                 from /usr/include/c++/7/parallel/quicksort.h:36,
                 from /usr/include/c++/7/parallel/sort.h:48,
                 from /usr/include/c++/7/parallel/algo.h:45,
                 from /usr/include/c++/7/parallel/algorithm:37,
                 from /tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp:6:
/usr/include/c++/7/tr1/random.h:701:7: note:   ‘std::tr1::mt19937’
     > mt19937;
       ^~~~~~~
/tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp:61:33: error: ‘gen’ was not declared in this scope
                     int m = dis(gen);
                                 ^~~
/tmp/thundergbm/src/thundergbm/objective/ranking_obj.cpp:61:33: note: suggested alternative: ‘len’
                     int m = dis(gen);
                                 ^~~
                                 len
src/thundergbm/CMakeFiles/thundergbm.dir/build.make:11442: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/objective/ranking_obj.cpp.o' failed
make[2]: *** [src/thundergbm/CMakeFiles/thundergbm.dir/objective/ranking_obj.cpp.o] Error 1
CMakeFiles/Makefile2:131: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/all' failed
make[1]: *** [src/thundergbm/CMakeFiles/thundergbm.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

opened by psinger 13

build thundergbm

src/thundergbm/CMakeFiles/thundergbm.dir/build.make:147: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/thundergbm_generated_tree.cu.o' failed ...

I have download the latest version of ThunderGBM, but it still can't make it. can you tell me how to fix it? Thank you very much!

opened by mmobai 10
"File Modification Detected" and errors in Visual Studio

downloaded the branch with cuda 11 support, for win10 and followed the instructions. When building the thundergbm.sln file in Visual Studio, I get a message that says "File Modification Detected". When choosing "Reload All" or any of the other options, the building ends up with a bunch of errors and bugs. Please see the pictures below for more details.

These are the steps I took before getting to the building process: git clone -b support_cuda11 --single-branch https://github.com/zeyiwen/thundergbm.git cd thundergbm git submodule init cub && git submodule update cd thundergbm mkdir build cd build cmake .. -DCMAKE_WINDOWS_EXPORT_ALL_SYMBOLS=TRUE -DBUILD_SHARED_LIBS=TRUE -G "Visual Studio 16 2019" the last step is to open the file thundergbm.sln and click on "build solution" in the "Build" menu in Visual Studio.

Here is the message window.

Down below is the errors generated by the building process, in the files xutility, xmemory, atomic and LINK.

Any idea what might be causing this?

opened by AlanSpencer2 7
Bug: Memory Leak in sci-kit interface

Hello,

I'm running TGBM on a Windows 10 machine with Cuda 10. In my use case I create a lot of different GBDTs, which go out of scope after some time. The problem is, that the gpu memory stays allocated.

My expectation were, that when pythons garbage collector deletes the instance, the memory gets freed. However, even when I explicitly delete the instance, the memory stays allocated.

I think to resolve this Issue in "scikit_tgbm.cpp" the model_free function must be extended to free the allocated memory. Currently, I don't know how to do this. But maybe if I find time I will take a deeper look in memory management with Cuda.

I found a very very ugly workaround by changing some code in the "thundergbm.py". I extended the del function of TGBMModel to reload the whole dll. This isn't a good solution but okayish as a workaround if really required by somebody.

opened by Tripton 7
Building from source on ubuntu18.04 with CUDA11.0

Hi, I am trying to build thundergbm on ubuntu18.04 with CUDA 11.0 using the instructions here. While building the binary, I get a string of warnings about C++14 (from CUB and THRUST), but the build proceeds until it hits this error:

thundergbm/src/thundergbm/sparse_columns.cu(52): error: identifier "cusparseScsr2csc" is undefined 1 error detected in the compilation of "thundergbm/src/thundergbm/sparse_columns.cu". CMake Error at thundergbm_generated_sparse_columns.cu.o.Release.cmake:279 (message): Error generating file thundergbm/build/src/thundergbm/CMakeFiles/thundergbm.dir//./thundergbm_generated_sparse_columns.cu.o src/thundergbm/CMakeFiles/thundergbm.dir/build.make:9146: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/thundergbm_generated_sparse_columns.cu.o' failed make[2]: *** [src/thundergbm/CMakeFiles/thundergbm.dir/thundergbm_generated_sparse_columns.cu.o] Error 1 CMakeFiles/Makefile2:126: recipe for target 'src/thundergbm/CMakeFiles/thundergbm.dir/all' failed make[1]: *** [src/thundergbm/CMakeFiles/thundergbm.dir/all] Error 2 Makefile:83: recipe for target 'all' failed make: *** [all] Error 2

The line that is seemingly causing the problem (sparse_columns.cu(52)) is

usparseScsr2csc(handle, dataset.n_instances(), n_column, nnz, val.device_data(), row_ptr.device_data(), col_idx.device_data(), csc_val.device_data(), csc_row_idx.device_data(), csc_col_ptr.device_data(), CUSPARSE_ACTION_NUMERIC, CUSPARSE_INDEX_BASE_ZERO);

Any suggestions on how to get around this?

Thanks.
call for contribution

opened by sensharma 6
CUDA Error 29: driver shutting down

I get the following error at the end of my training for TGBMRegressor with default parameters, this doesn't allow the python process to release the GPU memory and it stays there.

2019-06-27 16:15:43,952 INFO [default] #instances = 1280, #features = 729 2019-06-27 16:15:44,031 INFO [default] copy csr matrix to GPU 2019-06-27 16:15:44,220 INFO [default] converting csr matrix to csc matrix 2019-06-27 16:15:44,483 INFO [default] Getting cut points... 2019-06-27 16:15:44,485 INFO [default] ################>>>: 1 2019-06-27 16:15:44,491 INFO [default] ----------------->>> Last LOOP: 0.0072718 2019-06-27 16:15:44,494 INFO [default] TOTAL CP:152949 2019-06-27 16:15:45,617 INFO [default] RMSE = 3.72949 2019-06-27 16:15:45,645 INFO [default] RMSE = 3.3029 2019-06-27 16:15:45,672 INFO [default] RMSE = 2.93213 2019-06-27 16:15:45,673 INFO [default] training time = 1.17665 2019-06-27 16:15:45,711 INFO [default] #instances = 549, #features = 729 CUDA error 29 [D:\thundergbm\src\thundergbm\syncmem.cpp, 576]: driver shutting down CUDA error 29 [D:\thundergbm\cub\cub/util_allocator.cuh, 657]: driver shutting down
enhancement

opened by talhachattha 6

Make error

I am getting this error after executing make -j:

/thundergbm/src/thundergbm/hist_cut.cu(142): error: a reference of type "SyncArray<float> &" (not const-qualified) cannot be initialized with a value of type "SyncArray<float_type>"

I am using the code of the support_cuda11 branch (as my CUDA version is 11).

opened by mahi045 5

FATAL [default] Check failed: [error == cudaSuccess] out of memory

My system is:

(_env) D:_env\project\series>nvidia-smi +-----------------------------------------------------------------------------+ | NVIDIA-SMI 442.19 Driver Version: 442.19 CUDA Version: 10.2 | |-------------------------------+----------------------+----------------------+ | GPU Name TCC/WDDM | Bus-Id Disp.A | Volatile Uncorr. ECC | | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. | |===============================+=================| | 0 GeForce RTX 2070 WDDM | 00000000:01:00.0 Off | N/A | | N/A 46C P8 7W / N/A | 219MiB / 8192MiB | 0% Default | +-------------------------------+----------------------+----------------------+ +-----------------------------------------------------------------------------+ | Processes: GPU Memory | | GPU PID Type Process name Usage | |==============================================| | 0 7852 C+G ...xperience\NVIDIA GeForce Experience.exe N/A | +-----------------------------------------------------------------------------+

I am on Windows 10

When trying to run the TGBMClassifier I get the following error:

2020-02-22 20:13:15,901 INFO [default] #instances = 20289, #features = 40 2020-02-22 20:13:16,503 INFO [default] convert csr to csc using gpu... 2020-02-22 20:13:16,878 INFO [default] Converting csr to csc using time: 0.363863 s 2020-02-22 20:13:16,878 INFO [default] Fast getting cut points... 2020-02-22 20:13:16,878 FATAL [default] Check failed: [error == cudaSuccess] out of memory 2020-02-22 20:13:16,894 WARNING [default] Aborting application. Reason: Fatal log at [D:_env\project\series\thundergbm\src\thundergbm\syncmem.cpp:107]

Any suggestion how to fix this?

opened by Kagaratsch 5
Request: Run quietly (surpress all prints) with sci-kit interface

Hi, I am implementing ThunderGBM in an AutoML framework with the goal to optimise for speed. It would be great if we had the option to run .fit() and .predict() methods with no printing to stdout.

I tried searching in the Python files but it became apparent that it is not there. I did see the a reply in another issue that this is currently not possible. If you tell me where/how to do this, I could give it a try.

Thanks
enhancement

opened by beevabeeva 5
How to visualize the ThunderGBM?

Is there a way to see what is under the hood of the random forest? Is there a way to get the parameters of a trained model? Or maybe you can get the individual decisions trees as a nested-if-statement?

Thanks alot!

opened by DarkoAlexander 3
AttributeError: 'TGBMClassifier' object has no attribute 'save'

Are there functions in thundergbm that can save a trained model to file and reload it from the file later?

There is a function "clf.predict_proba()" in sklearn that calculates probabilities of the different labels. Is there a corresponding function in thundergbm?

There is a function in sklearn that calculates the importance of the different features, called "clf.feature_importances_". Is there is a similar function in thundergbm?

opened by DonSeger 1

Debug Assertion Failed

I have installed thundergbm with cuda 11 support. But when I try to run the thundergbm classifier inside a nested loop in order to do a hyperparameter search, I get the following error:

Debug Asertion Failed

Python Version: 3.7.8 Microsoft Visual Studio 19: v.16.11.8 Cuda: 11.5 cmake: 3.22.1

Below is the code that I run in a Python Jupyter Notebook:

for n in range(100,700,100):           
    for d in range(3,16,1):             
        for c in [0.4,0.5,0.6,0.7,0.8]:   
                                                           
            clf = TGBMClassifier(depth=d, n_trees = 1, n_parallel_trees=n, bagging=1, column_sampling_rate=c, objective = 
                                             "binary:logistic")
            clf.fit(X_train, y_train)

What is the reason?

opened by DonSeger 0

the Random Forest classifies everything to be 1

I am new to thundergbm, and just trying to get a simple Random Forest classifier going. But the classifier classifies every single sample to be 1. Not one single case out of 188244 samples is classified as 0. No other classifier behaves like this. I also tried different number of trees, depth etc. But it still classies everything to 1. Is there something wrong with the following code?

from thundergbm import TGBMClassifier clf = TGBMClassifier(depth=6, n_trees = 1, n_parallel_trees=100, bagging=1) clf.fit(X_train, y_train) y_pred = clf.predict(X_test)

#y_pred classifies everything in the test set (X_test) to one.

opened by AlanSpencer2 2
make: *** No targets specified and no makefile found. Stop.
I followed the installation as described in "how to build the Python wheel file".

When I ran: "mkdir build && cd build && cmake .. && make -j", the following error came up: "make: *** No targets specified and no makefile found. Stop."

All the prior steps up to this point were completed.

git clone -b support_cuda11 --single-branch https://github.com/zeyiwen/thundergbm.git

cd thundergbm #under the directory of thundergbm

git submodule init cub && git submodule update

The installation guide is very cumbersome for laymen to comprehend. Is there an easier way to install thundergbm, like for example using pip? I have tried for two weeks now unsuccessfully.
opened by DarkoAlexander 2

Releases(0.3.2)

0.3.2(May 7, 2019)

Source code(tar.gz)
Source code(zip)
ipdps-version(Sep 25, 2018)

Source code(tar.gz)
Source code(zip)
before-reorganize(Dec 12, 2017)

Source code(tar.gz)
Source code(zip)

Owner

Xtra Computing Group

GitHub Repository

FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically

FLAML - Fast and Lightweight AutoML

2.2k Jan 09, 2023

Falken provides developers with a service that allows them to train AI that can play their games

Falken provides developers with a service that allows them to train AI that can play their games. Unlike traditional RL frameworks that learn through rewards or batches of offline training, Falken is

223 Jan 03, 2023

Cohort Intelligence used to solve various mathematical functions

Cohort-Intelligence-for-Mathematical-Functions About Cohort Intelligence : Cohort Intelligence ( CI ) is an optimization technique. It attempts to mod

2 Oct 25, 2021

The easy way to combine mlflow, hydra and optuna into one machine learning pipeline.

mlflow_hydra_optuna_the_easy_way The easy way to combine mlflow, hydra and optuna into one machine learning pipeline. Objective TODO Usage 1. build do

9 Sep 09, 2022

SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

SageMaker Python SDK SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker. With the S

1.8k Jan 01, 2023

Python-based implementations of algorithms for learning on imbalanced data.

ND DIAL: Imbalanced Algorithms Minimalist Python-based implementations of algorithms for imbalanced learning. Includes deep and representational learn

220 Dec 13, 2022

TIANCHI Purchase Redemption Forecast Challenge

4 Aug 26, 2022

This project has Classification and Clustering done Via kNN and K-Means respectfully

This project has Classification and Clustering done Via kNN and K-Means respectfully. It later tests its efficiency via F1/accuracy/recall/precision for kNN and Davies-Bouldin Index for Clustering. T

0 Jan 20, 2022

Both social media sentiment and stock market data are crucial for stock price prediction

Relating-Social-Media-to-Stock-Movement-Public - We explore the application of Machine Learning for predicting the return of the stock by using the information of stock returns. A trading strategy ba

15 Oct 29, 2022

Estudos e projetos feitos com PySpark.

PySpark (Spark com Python) PySpark é uma biblioteca Spark escrita em Python, e seu objetivo é permitir a análise interativa dos dados em um ambiente d

54 Nov 06, 2022

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications. With just a few lines of code, you can train and deploy high-accuracy m

55 Dec 27, 2022

Optimal Randomized Canonical Correlation Analysis

ORCCA Optimal Randomized Canonical Correlation Analysis This project is for the python version of ORCCA algorithm. It depends on Numpy for matrix calc

1 Nov 21, 2021

TensorFlow implementation of an arbitrary order Factorization Machine

This is a TensorFlow implementation of an arbitrary order (=2) Factorization Machine based on paper Factorization Machines with libFM. It supports: d

785 Dec 21, 2022

Implementation of the Object Relation Transformer for Image Captioning

Object Relation Transformer This is a PyTorch implementation of the Object Relation Transformer published in NeurIPS 2019. You can find the paper here

158 Dec 24, 2022

A Python Module That Uses ANN To Predict A Stocks Price And Also Provides Accurate Technical Analysis With Many High Potential Implementations!

Stox A Module to predict the "close price" for the next day and give "technical analysis". It uses a Neural Network and the LSTM algorithm to predict

31 Dec 16, 2022

The Fuzzy Labs guide to the universe of open source MLOps

Open Source MLOps This is the Fuzzy Labs guide to the universe of free and open source MLOps tools. Contents What is MLOps, anyway? Data version contr

352 Dec 29, 2022

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and the A* Search (using the Manhattan Distance Heuristic)

17 Aug 14, 2022

Automated Time Series Forecasting

AutoTS AutoTS is a time series package for Python designed for rapidly deploying high-accuracy forecasts at scale. There are dozens of forecasting mod

652 Jan 03, 2023

This is an auto-ML tool specialized in detecting of outliers

Auto-ML tool specialized in detecting of outliers Description This tool will allows you, with a Dash visualization, to compare 10 models of machine le

1 Nov 03, 2021

A repository of PyBullet utility functions for robotic motion planning, manipulation planning, and task and motion planning

pybullet-planning (previously ss-pybullet) A repository of PyBullet utility functions for robotic motion planning, manipulation planning, and task and

260 Dec 27, 2022

ThunderGBM: Fast GBDTs and Random Forests on GPUs

Related tags

Overview

What's new?

Overview

Getting Started

Prerequisites

Quick Install

Build from source

Build on Linux (build instructions for Windows)

Quick Start

How to cite ThunderGBM

Related papers

Key members of ThunderGBM

Other information

Related libraries

Comments

Releases(0.3.2)

0.3.2(May 7, 2019)

ipdps-version(Sep 25, 2018)

before-reorganize(Dec 12, 2017)

Owner

Xtra Computing Group

FLAML is a lightweight Python library that finds accurate machine learning models automatically, efficiently and economically

Falken provides developers with a service that allows them to train AI that can play their games

Cohort Intelligence used to solve various mathematical functions

The easy way to combine mlflow, hydra and optuna into one machine learning pipeline.

SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.

Python-based implementations of algorithms for learning on imbalanced data.

TIANCHI Purchase Redemption Forecast Challenge

This project has Classification and Clustering done Via kNN and K-Means respectfully

Both social media sentiment and stock market data are crucial for stock price prediction

Estudos e projetos feitos com PySpark.

AutoTabular automates machine learning tasks enabling you to easily achieve strong predictive performance in your applications.

Optimal Randomized Canonical Correlation Analysis

TensorFlow implementation of an arbitrary order Factorization Machine

Implementation of the Object Relation Transformer for Image Captioning

A Python Module That Uses ANN To Predict A Stocks Price And Also Provides Accurate Technical Analysis With Many High Potential Implementations!

The Fuzzy Labs guide to the universe of open source MLOps

A Python-based application demonstrating various search algorithms, namely Depth-First Search (DFS), Breadth-First Search (BFS), and A* Search (Manhattan Distance Heuristic)

Automated Time Series Forecasting

This is an auto-ML tool specialized in detecting of outliers

A repository of PyBullet utility functions for robotic motion planning, manipulation planning, and task and motion planning