The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.

Overview

Feature-Engineering

The functions we created are included in a script. The necessary parts for pre-processing were taken. Analysis complete.

image

Business Problem

/Required for a machine learning pipeline data preprocessing and variable engineering script needs to be prepared.

/When the dataset is passed through this script, the modeling starts. Expected to be ready.

Dataset Story

*The data set is the data set of the people who were in the Titanic shipwreck. *It consists of 768 observations and 12 variables.

The target variable is specified as "Survived";

1: one's survival, 0: indicates the person's inability to survive.

Variables

  • Survived

  • 0 Died, 1 Survived

  • Pclass – Ticket Class

  • 1 = Grade 1, 2 = Grade 2, 3 = Grade 3

  • Age – Age

  • Sibsp – Number of siblings / spouses on the Titanic

  • Sex – Gender

  • Parch – Number of parents/children on Titanic

  • Embarked: – Passenger embarkation port

  • (C = Cherbourg, Q = Queenstown, S = Southampton

  • Fare – Ticket fare

  • Cabin: Cabin number

Project Tasks

1- Open a directory called helpers in the working directory and enter it. Add a script named data_prep.py. In the Feature Engineering section, all of our own collect functions into this script. Functions that should be here: outlier_thresholds replace_with_thresholds check_outlier grab_outliers remove_outlier missing_values_table missing_vs_target label_encoder one_hot_encoder rare_analyser rare_encoder

2- Write a function called titanic_data_prep. Data preprocessing or EDA functions required for this function, Get it from the eda.py and data_prep.py files in the helpers.

3- Save the data set you preprocessed to the disk with pickle.

Owner
Ayşe Nur Türkaslan
I continue my studies in the field of Data Science and Artificial Intelligence. I want to turn my efforts into a contribution using Github
Ayşe Nur Türkaslan
Object-data mapper and advanced query manager for non relational databases

Object data mapper and advanced query manager for non relational databases. The data is owned by different, configurable back-end databases and it is

Luca Sbardella 121 Aug 11, 2022
Script de monitoramento das teclas do teclado, salvando todos os dados digitados em um arquivo de log juntamente com os dados de rede.

listenerPython Script de monitoramento das teclas do teclado, salvando todos os dados digitados em um arquivo de log juntamente com os dados de rede.

Vinícius Azevedo 4 Nov 27, 2022
Bootcamp de Introducción a la Programación. Módulo 6: Matemáticas Discretas

Módulo 6: Matemáticas Discretas Última actualización: 12 de marzo Irónicamente, las matemáticas discretas son las matemáticas que lo cuentan todo. Si

Cynthia Castillo 34 Sep 29, 2022
Datargsing is a data management and manipulation Python library

Datargsing What is It? Datargsing is a data management and manipulation Python library which is currently in deving Why this library is good? This Pyt

CHOSSY Lucas 10 Oct 24, 2022
Streamlit — The fastest way to build data apps in Python

Welcome to Streamlit 👋 The fastest way to build and share data apps. Streamlit lets you turn data scripts into sharable web apps in minutes, not week

Streamlit 22k Jan 06, 2023
Advanced Variable Manager {AVM} [0.8.0]

Advanced Variable Manager {AVM} [0.8.0] By Grosse pastèque#6705 WARNING : This modules need some typing modifications ! If you try to run it without t

Big watermelon 1 Dec 11, 2021
Basic Hspice runner with Python

HSpicePy Bilgisayarınıza PATH değişkenlerine eklediğiniz HSPICE programını python ile çalıştırmanızı sağlayan basit bir araç. A simple tool that allow

1 Nov 16, 2021
sawa (ꦱꦮ) is an open source programming language, an interpreter to be precise, where you can write python code using javanese character.

ꦱꦮ sawa (ꦱꦮ) is an open source programming language, an interpreter to be precise, where you can write python code using javanese character. sawa iku

Rony Lantip 307 Jan 07, 2023
Open source tools to allow working with ESP devices in the browser

ESP Web Tools Allow flashing ESPHome or other ESP-based firmwares via the browser. Will automatically detect the board type and select a supported fir

ESPHome 195 Dec 31, 2022
Py-Parser est un parser de code python en python encore en plien dévlopement.

PY - PARSER Py-Parser est un parser de code python en python encore en plien dévlopement. Une fois achevé, il servira a de nombreux projets comme glad

pf4 3 Feb 21, 2022
This is a practice on Airflow, which is building virtual env, installing Airflow and constructing data pipeline (DAGs)

airflow-test This is a practice on Airflow, which is Builing virtualbox env and setting Airflow on that env Installing Airflow using python virtual en

Jaeyoung 1 Nov 01, 2021
Todo-backend - Todo backend with python

Todo-backend - Todo backend with python

Julio C. Diaz 1 Jan 07, 2022
Python library for datamining glitch information from Gen 1 Pokémon GameBoy ROMs

g1utils This is a Python library for datamining information about various glitches (glitch Pokémon, glitch maps, etc.) from Gen 1 Pokémon ROMs. TODO A

1 Jan 13, 2022
An open letter in support of Richard Matthew Stallman being reinstated by the Free Software Foundation

An open letter in support of RMS. To sign, click here and name the file username.yaml (replace username with your name) with the following content

2.4k Jan 07, 2023
Konomi: Kind and Optimized Next brOadcast watching systeM Infrastructure

Konomi 備考・注意事項 現在 α 版で、まだ実験的なプロダクトです。通常利用には耐えないでしょうし、サポートもできません。 安定しているとは到底言いがたい品質ですが、それでも構わない方のみ導入してください。 使い方などの説明も用意できていないため、自力でトラブルに対処できるエンジニアの方以外に

tsukumi 243 Dec 30, 2022
A python script for osu!lazer rulesets auto update.

osu-lazer-rulesets-autoupdater A python script for osu!lazer rulesets auto update. How to use: 如何使用: You can refer to the python script. The begining

3 Jul 26, 2022
A turtlebot auto controller allows robot to autonomously explore environment.

A turtlebot auto controller allows robot to autonomously explore environment.

Yuliang Zhong 1 Nov 10, 2021
A topology optimization framework written in Taichi programming language, which is embedded in Python.

Taichi TopOpt (Under Active Development) Intro A topology optimization framework written in Taichi programming language, which is embedded in Python.

Li Zhehao 41 Nov 17, 2022
A replacement of qsreplace, accepts URLs as standard input, replaces all query string values with user-supplied values and stdout.

Bhedak A replacement of qsreplace, accepts URLs as standard input, replaces all query string values with user-supplied values and stdout. Works on eve

Eshan Singh 84 Dec 31, 2022
This repository collects nice scripts ("plugins") for the SimpleBot bot for DeltaChat.

Having fun with DeltaChat This repository collects nice scripts ("plugins") for the SimpleBot bot for DeltaChat. DeltaChat is a nice e-mail based mess

Valentin Brandner 3 Dec 25, 2021