Predicting diabetes over a five-year period using logistic regression and the Pima First Nations dataset

Last update: Mar 28, 2022

Diabetes

This script uses the Pima First Nations dataset to build a model that predicts whether or not an individual will develop type 2 diabetes mellitus within a five-year time span.
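The script itself is not reproduced here. As a rough illustration of the technique, this is a minimal logistic regression trained with batch gradient descent, using only the Python standard library. The synthetic two-feature data, the function names (`fit_logistic`, `predict_proba`), and the hyperparameters are all assumptions made for this sketch; they are not taken from the project, which may well rely on a library implementation instead.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, weights, bias):
    """Probability of the positive class for one feature vector."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def fit_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(xi, weights, bias) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


# Synthetic stand-in data (NOT the Pima dataset): two standardized
# features and a noisy linear rule that decides the binary label.
random.seed(0)
X, y = [], []
for _ in range(200):
    x = [random.gauss(0, 1), random.gauss(0, 1)]
    label = 1 if x[0] + x[1] + random.gauss(0, 0.3) > 0 else 0
    X.append(x)
    y.append(label)

# Hold out the last 50 samples to estimate accuracy on unseen data.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

weights, bias = fit_logistic(X_train, y_train)
predictions = [int(predict_proba(x, weights, bias) >= 0.5) for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

With the real dataset, the feature columns would take the place of the synthetic `X` and the diabetes outcome column would take the place of `y`. Scaling the features first generally helps gradient descent converge.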

This is a quick project combining regression analysis and diabetes. I built it to deepen my understanding of the material currently being covered in my anatomy and physiology course, and to practice working with simple regression models and common libraries.

So far, the model predicts with roughly 75% accuracy, which is not bad given the small dataset and the simplicity of the model, but not great. There are several ways to improve it. Two that come to mind are gathering more data to train on and cleaning the data differently (i.e., not replacing 0 values with the mean value of that column).
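For reference, the cleaning step mentioned above (replacing 0 values with the column mean) might look roughly like the sketch below. The column names, the sample rows, and the helper `impute_zeros_with_mean` are hypothetical. The sketch also assumes the mean is taken over the nonzero entries only, which the description above does not specify.

```python
from statistics import mean


def impute_zeros_with_mean(rows, columns):
    """Return a copy of rows with 0 entries in `columns` replaced by
    the mean of that column's nonzero entries (an assumption here)."""
    cleaned = [dict(row) for row in rows]
    for col in columns:
        observed = [row[col] for row in rows if row[col] != 0]
        if not observed:
            continue  # nothing to average; leave the column untouched
        col_mean = mean(observed)
        for row in cleaned:
            if row[col] == 0:
                row[col] = col_mean
    return cleaned


# Hypothetical rows; a 0 here stands for a missing measurement.
rows = [
    {"Glucose": 148, "BloodPressure": 72, "Outcome": 1},
    {"Glucose": 0, "BloodPressure": 66, "Outcome": 0},
    {"Glucose": 100, "BloodPressure": 0, "Outcome": 0},
]
cleaned = impute_zeros_with_mean(rows, ["Glucose", "BloodPressure"])
```

Only columns where 0 is physiologically implausible should be passed in. A label column such as the hypothetical `Outcome` legitimately contains zeros and is left alone. Swapping `mean` for `statistics.median` is one simple alternative to try, since the median is less sensitive to outliers.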

Dataset found on Kaggle: https://www.kaggle.com/kumargh/pimaindiansdiabetescsv
