Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc)

Overview

Heart-Failure-Prediction

Built various Machine Learning algorithms (Logistic Regression, Random Forest, KNN, Gradient Boosting and XGBoost. etc). Structured a custom ensemble model and a neural network. Found a outperformed model for heart failure prediction accuracy of 88 percent.

Introduction

Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Four out of 5CVD deaths are due to heart attacks and strokes. Heart failure is a common event caused by CVDs and this dataset contains 11 features that can be used to predict a possible heart disease.

People with cardiovascular disease or who are at high cardiovascular risk need early detection and management wherein a machine learning model can be of great help.

Table of Contents

  • [Data]

    • [What We need to do]
  • [Exploratory Data Analysis]

    • [Target Variable]
    • [Features]
  • [Model Selection]

    • [Model Creation and Comparison]
    • [Bulid a custom ensemble (superlearner) with best three of models]
    • [Neural Networks]
    • [Feature Importance]
  • [Conclusion]

DATA

1 Age: Age of the patient [years]

2 Sex: Sex of the patient [M: Male, F: Female]

3 ChestPainType: [TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic]

4 RestingBP: Resting blood pressure [mm Hg]

5 Cholesterol: Serum cholesterol [mm/dl]

6 FastingBS: Fasting blood sugar [1: if FastingBS > 120 mg/dl, 0: otherwise]

7 RestingECG: Resting electrocardiogram results [Normal: Normal, ST: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: showing probable or definite left ventricular hypertrophy by Estes' criteria]

8 MaxHR: Maximum heart rate achieved [Numeric value between 60 and 202]

9 ExerciseAngina: Exercise-induced angina [Y: Yes, N: No]

10 Oldpeak: ST [Numeric value measured in depression] (

11 ST_Slope: The slope of the peak exercise ST segment [Up: upsloping, Flat: flat, Down: downsloping]

12 HeartDisease: Output class [1: heart disease, 0: Normal]

Reference: https://www.kaggle.com/fedesoriano/heart-failure-prediction

Owner
Chris Yuan
Motivated, open-minded, and detail-oriented student currently working towards a degree in Data Science.
Chris Yuan
Python Research Framework

Python Research Framework

EleutherAI 106 Dec 13, 2022
Firebase + Cloudrun + Machine learning

A simple end to end consumer lending decision engine powered by Google Cloud Platform (firebase hosting and cloudrun)

Emmanuel Ogunwede 8 Aug 16, 2022
BioPy is a collection (in-progress) of biologically-inspired algorithms written in Python

BioPy is a collection (in-progress) of biologically-inspired algorithms written in Python. Some of the algorithms included are mor

Jared M. Smith 40 Aug 26, 2022
This repository contains the code to predict house price using Linear Regression Method

House-Price-Prediction-Using-Linear-Regression The dataset I used for this personal project is from Kaggle uploaded by aariyan panchal. Link of Datase

0 Jan 28, 2022
A library of sklearn compatible categorical variable encoders

Categorical Encoding Methods A set of scikit-learn-style transformers for encoding categorical variables into numeric by means of different techniques

2.1k Jan 07, 2023
JMP is a Mixed Precision library for JAX.

Mixed precision training [0] is a technique that mixes the use of full and half precision floating point numbers during training to reduce the memory bandwidth requirements and improve the computatio

DeepMind 108 Dec 31, 2022
A Python library for detecting patterns and anomalies in massive datasets using the Matrix Profile

matrixprofile-ts matrixprofile-ts is a Python 2 and 3 library for evaluating time series data using the Matrix Profile algorithms developed by the Keo

Target 696 Dec 26, 2022
Greykite: A flexible, intuitive and fast forecasting library

The Greykite library provides flexible, intuitive and fast forecasts through its flagship algorithm, Silverkite.

LinkedIn 1.7k Jan 04, 2023
Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill

Python implementation of Weng-Lin Bayesian ranking, a better, license-free alternative to TrueSkill This is a port of the amazing openskill.js package

Open Debates Project 156 Dec 14, 2022
Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.

Price Prediction model is used to develop an LSTM model to predict the future market price of Bitcoin and Ethereum.

2 Jun 14, 2022
Iterative stochastic gradient descent (SGD) linear regressor with regularization

SGD-Linear-Regressor Iterative stochastic gradient descent (SGD) linear regressor with regularization Dataset: Kaggle “Graduate Admission 2” https://w

Zechen Ma 1 Oct 29, 2021
CobraML: Completely Customizable A python ML library designed to give the end user full control

CobraML: Completely Customizable What is it? CobraML is a python library built on both numpy and numba. Unlike other ML libraries CobraML gives the us

Sriram Govindan 14 Dec 19, 2021
Retrieve annotated intron sequences and classify them as minor (U12-type) or major (U2-type)

(intron I nterrogator and C lassifier) intronIC is a program that can be used to classify intron sequences as minor (U12-type) or major (U2-type), usi

Graham Larue 4 Jul 26, 2022
Evaluate on three different ML model for feature selection using Breast cancer data.

Anomaly-detection-Feature-Selection Evaluate on three different ML model for feature selection using Breast cancer data. ML models: SVM, KNN and MLP.

Tarek idrees 1 Mar 17, 2022
BudouX is the successor to Budou, the machine learning powered line break organizer tool.

BudouX Standalone. Small. Language-neutral. BudouX is the successor to Budou, the machine learning powered line break organizer tool. It is standalone

Google 868 Jan 05, 2023
A library of extension and helper modules for Python's data analysis and machine learning libraries.

Mlxtend (machine learning extensions) is a Python library of useful tools for the day-to-day data science tasks. Sebastian Raschka 2014-2021 Links Doc

Sebastian Raschka 4.2k Dec 29, 2022
Module for statistical learning, with a particular emphasis on time-dependent modelling

Operating system Build Status Linux/Mac Windows tick tick is a Python 3 module for statistical learning, with a particular emphasis on time-dependent

X - Data Science Initiative 410 Dec 14, 2022
TIANCHI Purchase Redemption Forecast Challenge

TIANCHI Purchase Redemption Forecast Challenge

Haorui HE 4 Aug 26, 2022
🚪✊Knock Knock: Get notified when your training ends with only two additional lines of code

Knock Knock A small library to get a notification when your training is complete or when it crashes during the process with two additional lines of co

Hugging Face 2.5k Jan 07, 2023
Tribuo - A Java machine learning library

Tribuo - A Java prediction library (v4.1) Tribuo is a machine learning library in Java that provides multi-class classification, regression, clusterin

Oracle 1.1k Dec 28, 2022