LLVM-based compiler for LightGBM gradient-boosted trees. Speeds up prediction by ≥10x.

Overview

lleaves 🍃

An LLVM-based compiler for LightGBM decision trees.

lleaves converts trained LightGBM models to optimized machine code, speeding up prediction by ≥10x.

Example

import lightgbm
import lleaves

lgbm_model = lightgbm.Booster(model_file="NYC_taxi/model.txt")
%timeit lgbm_model.predict(df)
# 12.77s

llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")
llvm_model.compile()
%timeit llvm_model.predict(df)
# 0.90s 

Why lleaves?

  • Speed: Both low-latency single-row prediction and high-throughput batch-prediction (see the sketch after this list).
  • Drop-in replacement: The interface of lleaves.Model is a subset of LightGBM.Booster.
  • Dependencies: llvmlite and numpy. LLVM comes statically linked.
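
To illustrate both prediction modes, here is a minimal sketch; the feature count and the n_jobs keyword for the batch case are assumptions based on the issues further down this page, not values stated in the README above:

import numpy as np
import lleaves

llvm_model = lleaves.Model(model_file="NYC_taxi/model.txt")
llvm_model.compile()

# Low-latency single-row prediction: pass a 2D array with a single row.
single_row = np.zeros((1, 18), dtype=np.float64)  # 18 features is a placeholder
print(llvm_model.predict(single_row))

# High-throughput batch prediction: pass many rows at once.
# n_jobs (assumed keyword) controls the number of prediction threads.
batch = np.zeros((100_000, 18), dtype=np.float64)
print(llvm_model.predict(batch, n_jobs=4))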

Installation

conda install -c conda-forge lleaves or pip install lleaves (Linux and macOS only).

Benchmarks

Benchmarks were run on a dedicated Intel i7-4770 Haswell, 4 cores. The stated runtime is the minimum over 20,000 runs.

Dataset: NYC-taxi

Mostly numerical features.

batchsize       1          10         100
LightGBM        52.31μs    84.46μs    441.15μs
ONNX Runtime    11.00μs    36.74μs    190.87μs
Treelite        28.03μs    40.81μs    94.14μs
lleaves         9.61μs     14.06μs    31.88μs

Dataset: MTPL2

Mix of categorical and numerical features.

batchsize       10,000     100,000     678,000
LightGBM        95.14ms    992.47ms    7034.65ms
ONNX Runtime    38.83ms    381.40ms    2849.42ms
Treelite        38.15ms    414.15ms    2854.10ms
lleaves         5.90ms     56.96ms     388.88ms

Advanced usage

To avoid any Python overhead during prediction you can link directly against the generated binary. See benchmarks/c_bench/ for an example of how to do this. The function signature can change between major versions.
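
As a concrete illustration of the exported interface, here is a hedged sketch that calls the compiled model from Python via ctypes. It assumes the compiled artifact has been linked into a shared library (e.g. with cc -shared) and that the root function has the forest_root(double*, double*, int, int) signature quoted in the issues below, with the two ints read as start/end row indices; treat the names and argument meaning as assumptions, and note that linking from C as in benchmarks/c_bench/ is what actually avoids the Python overhead:

import ctypes
import numpy as np

# Hedged sketch: library path, symbol name, and argument meaning are assumptions.
lib = ctypes.CDLL("./compiled_model.so")
lib.forest_root.restype = None
lib.forest_root.argtypes = [
    ctypes.POINTER(ctypes.c_double),  # flattened row-major feature matrix
    ctypes.POINTER(ctypes.c_double),  # output buffer, one result per row
    ctypes.c_int,                     # assumed: index of the first row to predict
    ctypes.c_int,                     # assumed: index one past the last row
]

n_features = 108  # placeholder feature count
data = np.zeros((1, n_features), dtype=np.float64)
out = np.zeros(1, dtype=np.float64)
lib.forest_root(
    data.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    out.ctypes.data_as(ctypes.POINTER(ctypes.c_double)),
    0,
    1,
)
print(out[0])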

Development

conda env create
conda activate lleaves
pip install -e .
pre-commit install
./benchmarks/data/setup_data.sh
pytest

Comments
  • How can we reduce the size of the compiled file?

    Hello Simon,

    I tried to compile a LightGBM model using both LLeaves and TreeLite.

    I found that the compiled .so file from LLeaves is ~80% larger than the files compiled by TreeLite.

    Is it possible to reduce the size of the .so file compiled by LLeaves?

    opened by fuyw 7
  • Only one CPU is used in prediction

    Hi, we installed lleaves using pip install lleaves. We found that prediction only utilizes one CPU core, even though we set n_jobs=8 and have 8 CPU cores. This is inconsistent with the lleaves code here: https://github.com/siboehm/lleaves/blob/master/lleaves/lleaves.py#L140

    Why does this happen? Any suggestions are highly appreciated.

    opened by jiazou-bigdata 4
  • extract_pandas_traintime_categories: return empty list if pandas_categorical is null in model file

    In a LightGBM model file, pandas_categorical may be null. When data_processing._dataframe_to_ndarray is called, the check if len(cat_cols) != len(pd_traintime_categories) raises TypeError: object of type 'NoneType' has no len().

    opened by chenglin 4
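
    A minimal sketch of the fallback this issue proposes; this is a simplified stand-in, not the real lleaves function signature, and only the "null becomes empty list" behaviour comes from the issue title:

    def extract_pandas_traintime_categories(pandas_categorical):
        # If the model file stored "pandas_categorical:null", fall back to an
        # empty list so that later len() comparisons keep working.
        return pandas_categorical if pandas_categorical is not None else []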
  • saving models persistently

    First of all, thank you for your impressive work. I wanted to ask if there is a way to store the compiled models persistently. In my case the predictor is composed of ~100 LightGBM models, so compiling the full predictor is highly time-consuming. When I tried to pickle the compiled lleaves model, I got:

    ValueError: ctypes objects containing pointers cannot be pickled

    for which I guess there is no easy workaround. Do you know if it is possible to avoid re-compilation of the original LightGBM instances? Thank you

    opened by nepslor 4
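
    One mitigation that appears elsewhere on this page is the cache argument of compile(): on the first run the compiled binary is written to the given path, and on later runs it is reused instead of recompiling, so only the first run pays the compilation cost. A minimal sketch (the path is a placeholder):

    import lleaves

    llvm_model = lleaves.Model(model_file="model.txt")
    # First run: compiles and writes the binary to the cache path.
    # Later runs: the cached binary is loaded and compilation is skipped.
    llvm_model.compile(cache="model_compiled.bin")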
  • Benchmarking

    I tried benchmarking lleaves vs treelite and found that lleaves is slightly slower than treelite. Am I doing something wrong?

    I benchmark with Google Benchmark, batch size 1, and random features. I have ~600 trees with 450 leaves and max depth 13. Treelite is compiled with clang 10.0. I think we did see that treelite was a lot slower when using GCC.

    I noticed that the compile step for lleaves took several hours, so maybe the forest I'm using is somehow off?

    In any case I think your library looks very nice :)

    Xeon E2278G

    ------------------------------------------------------
    Benchmark            Time             CPU   Iterations
    ------------------------------------------------------
    BM_LLEAVES       32488 ns        32487 ns        21564
    BM_TREELITE      27251 ns        27250 ns        25635
    

    EPYC 7402P

    ------------------------------------------------------
    Benchmark            Time             CPU   Iterations
    ------------------------------------------------------
    BM_LLEAVES       38020 ns        38019 ns        18308
    BM_TREELITE      32155 ns        32154 ns        21579
    
    #include <benchmark/benchmark.h>
    #include <random>
    #include <vector>
    #include <iostream>
    #include "lleavesheader.h"
    #include "treeliteheader.h"
    
    
    constexpr int NUM_FEATURES = 108;
    
    static void BM_LLEAVES(benchmark::State& state)
    {
    
        std::random_device dev;
        std::mt19937 rng(dev());
        std::uniform_int_distribution<int> dist(-10, 10);  // integer features in [-10, 10]
    
        std::size_t N = 10000000;
        std::vector<double> f;
        f.reserve(N);
        for (std::size_t i = 0; i<N; ++i){
            f.push_back(dist(rng));
        }
    
        double out;
        std::size_t i = 0;
        for (auto _ : state) {
            forest_root(f.data()+NUM_FEATURES*i, &out, (int)0, (int)1);
            ++i;
        }
    }
    
    static void BM_TREELITE(benchmark::State& state)
    {
    
        std::random_device dev;
        std::mt19937 rng(dev());
        std::uniform_int_distribution<int> dist(-10, 10);  // integer features in [-10, 10]
    
        std::size_t N = 10000000;
        std::vector<Entry> f;
        f.reserve(N);
        for (std::size_t i = 0; i<N; ++i){
            auto e = DE::Entry();
            e.missing = -1;
            e.fvalue = dist(rng);
            e.qvalue = 0;
            f.push_back(e);
        }
    
        std::size_t i = 0;
        union DE::Entry *pFeatures = nullptr; 
        for (auto _ : state) {
            pFeatures = f.data()+NUM_FEATURES*i;
            predict(pFeatures, 1);   // call treelite predict function
    
            ++i;
            
        }
    }
    BENCHMARK(BM_LLEAVES);
    BENCHMARK(BM_TREELITE);
    BENCHMARK_MAIN();
    
    opened by skaae 4
  • Additional performance benchmarks

    Hi, currently evaluating this as a potential performance enhancement on our MLOps / Inference stack.

    Thought I'd give some numbers here (based on a MacBook Pro 2019).

    Test set up as follows:
    a) Generate artificial data: X = 1E6 x 200 float64; Y = X.sum() for regression, Y = X.sum() > 100 for the binary classifier.
    b) For n_feat in [...]: fit a model on 1000 samples and n_feat features; compile the model.
    c) For batchsize in [...]: predict a randomly sampled batch of data items 10 times, using (1) LGBM.predict(), (2) lleaves.predict(), (3) lleaves.predict(n_jobs=1); measure the TOTAL time taken.

    For regression results are:

    [image: regression benchmark results]

    Independent of the number of features, the break-even between parallel lleaves and a single job (n_jobs=1) seems to be around 1k samples per call. Using this logic, we would get better performance than LGBM at any number of samples.

    For classification:

    [image: classification benchmark results]

    Also, here, the break-even is around 1k samples.

    For classification with HIGHLY IMBALANCED data (1/50 positive), the break-even is only at 10k samples. Any ideas on why this is the case?

    [image: imbalanced classification benchmark results]

    opened by Zahlii 4
  • [Question] how does model cache play with distributed workers with different CPUs?

    Hello, thank you for this great library. I have a question about the model cache file. I am using Ray to manage a small cluster of PCs with both Intel and AMD CPUs and different OSes (Ubuntu/ClearLinux). My program has been using numba to speed things up, and the JIT mode (instead of AOT mode) works fine. Ray can send the numba functions to different PCs in the cluster and they compile locally.

    So for lleaves: if I compile the models on one node and distribute the generated cache file to all nodes in the cluster, will it work? Or do I have to stick to the "JIT" mode, where models are always compiled locally each time? I am using ensemble methods with many lgbm models (>1000 in total, each small, about 100 trees, max_depth 10). Or maybe I should have all models compiled locally on each PC? Thank you.

    opened by crayonfu 3
  • How to use multiple models via the C_API?

    Hi Simon, many thanks for the nice work. I have a question about using the C_API:

    If I have 2 LightGBM models in my application and I want to predict using the C_API, I might need the following two functions:

    void forest_root_model1(double *, double *, int, int);
    
    void forest_root_model2(double *, double *, int, int);
    

    Do I need to modify the llvm_model.compile() function to change the function names?

    opened by fuyw 3
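
    For reference, release 0.2.6 further down this page notes that a root-function-name argument was added to the compiler for exactly this use case. A hedged sketch of compiling two models with distinct root symbols; the keyword shown here follows the wording of that release note, so the exact parameter name should be checked against the documentation, and the cache paths are placeholders:

    import lleaves

    model1 = lleaves.Model(model_file="model1.txt")
    model1.compile(cache="model1.bin", function_name="forest_root_model1")

    model2 = lleaves.Model(model_file="model2.txt")
    model2.compile(cache="model2.bin", function_name="forest_root_model2")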
  • Does this cause core dump ?

    Recently, I found that one of my models causes a core dump when I use lleaves for prediction.

    I am confused about the two functions below.

    In codegen.py, a function parameter's type can be int* if the feature is categorical:

    def make_tree(tree):
        # declare the function for this tree
        func_dtypes = (INT_CAT if f.is_categorical else DOUBLE for f in tree.features)
        scalar_func_t = ir.FunctionType(DOUBLE, func_dtypes)
        tree_func = ir.Function(module, scalar_func_t, name=str(tree))
        tree_func.linkage = "private"
        # populate function with IR
        gen_tree(tree, tree_func)
        return LTree(llvm_function=tree_func, class_id=tree.class_id)
    

    But in data_processing.py, which predict uses, all feature parameters are converted to double*:

    def ndarray_to_ptr(data: np.ndarray):
        """
        Takes a 2D numpy array, converts to float64 if necessary and returns a pointer
    
        :param data: 2D numpy array. Copying is avoided if possible.
        :return: pointer to 1D array of dtype float64.
        """
        # ravel makes sure we get a contiguous array in memory and not some strided View
        data = data.astype(np.float64, copy=False, casting="same_kind").ravel()
        ptr = data.ctypes.data_as(POINTER(c_double))
        return ptr
    

    Is this just like

    int* predict(int* a, double* b);
    double a = 1.1;
    double b = 2.2;
    predict(&a, &b);
    

    Could this happen in lleaves?

    opened by chenglin 3
  • compile with multiple threads

    I find that compile can only use one CPU core. For my model, it may take a long time to compile.

    Could compile be made multi-threaded, just like make -j?

    opened by chenglin 3
  • Accept boosters as model inputs?

    Model currently requires the path to a model file. I was wondering if it'd make sense to also accept a booster. We could call to_string and save it as a temporary file or just work with the string representation directly. It'd make users' life (a little) easier.

    opened by lbittarello 3
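
    Until such an overload exists, a minimal sketch of the workaround described above, round-tripping the Booster through a temporary model file (all names here are illustrative):

    import tempfile
    import lightgbm
    import lleaves

    def compile_booster(booster: lightgbm.Booster) -> lleaves.Model:
        # lleaves.Model takes a file path, so save the Booster to a temp file first.
        with tempfile.NamedTemporaryFile(suffix=".txt") as tmp:
            booster.save_model(tmp.name)
            llvm_model = lleaves.Model(model_file=tmp.name)
            llvm_model.compile()
        return llvm_model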
  • Bump pypa/gh-action-pypi-publish from 1.5.2 to 1.6.4

    Bumps pypa/gh-action-pypi-publish from 1.5.2 to 1.6.4.

    Release notes

    Sourced from pypa/gh-action-pypi-publish's releases.

    v1.6.4

    oh, boi! again?

    This is the last one tonight, promise! It fixes this embarrassing bug that was actually caught by the CI but got overlooked due to the lack of sleep. TL;DR GH passed $HOME from the external env into the container and that tricked the Python's site module to think that the home directory is elsewhere, adding non-existent paths to the env vars. See #115.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.3...v1.6.4

    v1.6.3

    Another Release!? Why?

    In pypa/gh-action-pypi-publish#112, it was discovered that passing a $PATH variable even breaks the shebang. So this version adds more safeguards to make sure it keeps working with a fully broken $PATH.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.2...v1.6.3

    v1.6.2

    What's Fixed

    • Made the $PATH and $PYTHONPATH environment variables resilient to broken values passed from the host runner environment, which previously allowed the users to accidentally break the container's internal runtime as reported in pypa/gh-action-pypi-publish#112

    Internal Maintenance Improvements

    New Contributors

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.1...v1.6.2

    v1.6.1

    What's happened?!

    There was a sneaky bug in v1.6.0 which caused Twine to be outside the import path in the Python runtime. It is fixed in v1.6.1 by updating $PYTHONPATH to point to a correct location of the user-global site-packages/ directory.

    Full Diff: https://github.com/pypa/gh-action-pypi-publish/compare/v1.6.0...v1.6.1

    v1.6.0

    Anything's changed?

    The only update is that the Python runtime has been upgraded from 3.9 to 3.11. There are no functional changes in this release.

    Full Changelog: https://github.com/pypa/gh-action-pypi-publish/compare/v1.5.2...v1.6.0

    Commits
    • c7f29f7 🐛 Override $HOME in the container with /root
    • 644926c 🧪 Always run smoke testing in debug mode
    • e71a4a4 Add support for verbose bash execusion w/ $DEBUG
    • e56e821 🐛 Make id always available in twine-upload
    • c879b84 🐛 Use full path to bash in shebang
    • 57e7d53 🐛Ensure the default $PATH value is pre-loaded
    • ce291dc 🎨🐛Fix the branch @ pre-commit.ci badge links
    • 102d8ab 🐛 Rehardcode devpi port for GHA srv container
    • 3a9eaef 🐛Use different ports in/out of GHA containers
    • a01fa74 🐛 Use localhost @ GHA outside the containers
    • Additional commits viewable in compare view

    dependencies 
    opened by dependabot[bot] 0
  • lleaves costs too much memory

    Thanks for your great work. I want to try it out, but the memory consumption is staggering: I can't use it on a machine with 32G of memory without running out of memory. Is there something I haven't set right?

    import lleaves
    import pdb
    
    MODEL_TXT_PATH = '/home/kk/models/lgb080401_amp.txt'
    llvm_model = lleaves.Model(model_file=MODEL_TXT_PATH)
    # num = llvm_model.num_trees()
    # pdb.set_trace()
    llvm_model.compile(cache='./lleaves.bin', fblocksize=34)
    
    opened by 111qqz 5
  • Platform interoperability

    Is there a way to effectively check if compiled models are able to run on a machine?

    I am running predictions on various platforms, when loading the compiled model, I load the one which was compiled on the same platform (using: PLATFORM = sys.platform + '-' + sysconfig.get_platform().split('-')[-1].lower(), resulting in either darwin-arm64 or linux-x86_64). However sometimes models which are compiled in a linux-x86_64 environment, are not interoperable with other linux-x86_64 machines (I use AWS Fargate, which runs the container on whatever hardware is available). This results in exit code 132 (Illegal Instruction) in the model.predict() loop.

    The underlying reason is probably that the machines are not of the same architecture (ARM-based?). For example, when I compile a model within a Docker container (with DOCKER_DEFAULT_PLATFORM=linux/amd64) on my M1 Mac, it registers the platform as linux-x86_64, but the model cannot be used on an AWS Linux machine using Docker.

    What would be a solid way to go about this issue? Is there some LLVM version which I need to look at in order for models to be interoperable?

    Thanks a lot.

    opened by TomScheffers 3
  • Improve Python API for model serialization

    I am loading a model with thousands of trees, which takes approx. 10 minutes. Therefore I want to compile the model once and then serialize it to a file. Pickle or dill give the following error: "ValueError: ctypes objects containing pointers cannot be pickled". Is there a way to save/load the compiled model to/from disk? Thanks :)

    enhancement 
    opened by TomScheffers 3
  • Windows support

    Forgive me, I’m ignorant of compilation nuances on different operating systems. Is Windows support on PyPI possible? I see that Windows support was removed in a PR about a year ago, but there are no notes.

    opened by AnotherSamWilson 1
Releases (0.2.7)
  • 0.2.7(Aug 10, 2022)

    What's Changed

    • Avoid undefined behaviour / poison by checking for NaNs before llvm::fptosi, by @siboehm in https://github.com/siboehm/lleaves/pull/23. Previously, this undefined behaviour broke categorical predictions when NaNs occurred, but only on the ARM arch.

    Full Changelog: https://github.com/siboehm/lleaves/compare/0.2.6...0.2.7

  • 0.2.6(Jul 10, 2022)

    Minor new feature: Allow specification of the root function's name in the compiled binary. This enables linking against multiple lleaves-compiled trees. Thanks @fuyw!

    What's Changed

    • Chore: Bump pre-commit and Github actions + py3.10 on CI by @siboehm in https://github.com/siboehm/lleaves/pull/22
    • add function_name to compiler by @fuyw in https://github.com/siboehm/lleaves/pull/21

    New Contributors

    • @fuyw made their first contribution in https://github.com/siboehm/lleaves/pull/21

    Full Changelog: https://github.com/siboehm/lleaves/compare/0.2.5...0.2.6

  • 0.2.5(Mar 23, 2022)

  • 0.2.4(Nov 22, 2021)

  • 0.2.3(Nov 21, 2021)

  • 0.2.2(Sep 26, 2021)

    • Compiler flags to tune performance & compilation speed: fblocksize, finline, fcodemodel.
    • Compile parameter raw_score, equivalent to the raw_score parameter of LightGBM's Booster.predict().
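
    A hedged sketch of passing these options to compile(); the values are illustrative (fblocksize=34 mirrors a value used in an issue above) and the accepted values should be checked against the documentation:

    import lleaves

    llvm_model = lleaves.Model(model_file="model.txt")
    llvm_model.compile(
        raw_score=False,     # equivalent to raw_score in LightGBM's Booster.predict()
        fblocksize=34,       # instruction-cache blocking granularity (illustrative)
        finline=True,        # allow aggressive function inlining
        fcodemodel="large",  # code model handed to LLVM (assumed value)
    )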
  • 0.2.1(Sep 2, 2021)

  • 0.2.0(Jul 28, 2021)

    Focus on performance improvements.

    • Instruction cache blocking
    • Aggressive function inlining
    • Proper native arch targeting
    • Objective functions lowered into IR

    Small models now run ~30% faster, large models ~300% faster.

    Plus: code refactor for readability

  • 0.1.1(Jun 27, 2021)

  • 0.1.0(Jun 26, 2021)

Owner
Simon Boehm
Data Engineering @QuantCo | Master's thesis @theislab | CS student @ ETH Zurich.