Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Overview

Modern Data Lake Storage Layers

This repository contains supporting assets for my research in modern Data Lake storage layers like Apache Hudi, Apache Iceberg, and Delta Lake.

Specifically, there's a CloudFormation template to create an EMR cluster and EMR Studio with the necessary requirements and Jupyter notebooks with the example walkthroughs.

You can view the corresponding blog post and video

Pre-requisites

You'll need an AWS Account in which you have administrator privileges and the ability to deploy a CloudFormation template. The template will create an EMR Cluster and S3 bucket that will incur charges - be sure to either shut down the cluster when done or delete the CloudFormation stack. In order to delete the CloudFormation stack, you'll need to:

  • Manually delete any EMR Studio Workspaces you created
  • Manually empty the S3 bucket created by CloudFormation
  • Manually delete the VPC created by CloudFormation due to auto-created rules

Overview

The included CloudFormation template creates a new VPC and EMR Cluster for you to be able to run the notebooks. An EMR Studio is also created and you can find the Studio URL in the Outputs tab of your CloudFormation Stack.

Once the stack is done creating, you'll need to navigate to EMR Studio and create a new workspace attached to the "data-lakes" cluster.

Inside the workspace you either upload each notebook individually from the notebooks/ folder or simply connect to this repository by using the "Git" icon on the left-hand side.

πŸ“’ Video Chat Stream Telegram Bot. Can ⏳ Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Video Chat Of Channels & Groups !

Telegram Video Chat Bot (Beta) πŸ“’ Video Chat Stream Telegram Bot πŸ€– Can Stream Live Videos, Radios, YouTube Videos & Telegram Video Files On Your Vide

brut✘⁢⁹ // ユスフ 15 Dec 24, 2022
DonLee Robot

πŸ€– πƒπŽπ 𝐋𝐄𝐄 π‘πŽππŽπ“ π•πŸ πŸ€– πŸ‘‹ Hey Muhammed, Iam DonLee RoBoT Make me an admin for your group and channel then connect me.... πŸŽ‰ πŸ™‚ To build a

Muhammed 27 Dec 01, 2022
Updater for PGCG (Paradox Game Converters Group) converters written in Python.

Updater Updater for PGCG (Paradox Game Converters Group) converters written in Python. Needs to be put inside an "Updater" directory in the root conve

Paradox Game Converters 2 Jan 10, 2022
A Python script to create customised Spotify playlists using the JSON, Spotipy Library and Spotify Web API, based on seed tracks in your history.

A Python script to create customised Spotify playlists using the JSON, Spotipy Library and Spotify Web API, based on seed tracks in your history.

Youngseo Park 1 Feb 01, 2022
A liblary whre you can find helpful functions for your discord bot

DBotUtils A liblary whre you can find helpful functions for your discord bot Easy setup Setup is easily and flexible. Change anytime. After setup just

Kondek286 1 Nov 02, 2021
Weather Tracker, made with Python using Open Weather API

Weather Tracker Weather Tracker, made with Python using Open Weather API

Sahil Kumar 1 Feb 07, 2022
A GETTR API client written in Python.

GUTTR A GETTR client library written in Python. I rushed to get this out so it's a bit janky. Open an issue if something is broken or missing. Getting

Roger Johnston 13 Nov 23, 2022
OpenSea Bulk Uploader And Trader 100000 NFTs (MAC WINDOWS ANDROID LINUX) Automatically and massively upload and sell your non-fungible tokens on OpenSea using Python Selenium

OpenSea Bulk Uploader And Trader 100000 NFTs (MAC WINDOWS ANDROID LINUX) Automatically and massively upload and sell your non-fungible tokens on OpenS

ERC-7211 3 Mar 24, 2022
Python app to notify via slack channel the status_code change from an URL

Python app to notify, via slack channel you choose to be notified, for the status_code change from the URL list you setup to be checked every yy seconds

Pedro Nunes 1 Oct 25, 2021
A client that allows a user, specifiy their discord token, to send images remotely to discord

ImageBot_for_Discord A client that allows a user, specifiy their discord token, to send images remotely to discord. Can select images using a file dia

0 Aug 24, 2022
Python based Spotify account generator.

Spotify Account Generator Python based Spotify account generator. Installation Download the latest release, open command prompt in the folder, run pip

polo 5 Dec 27, 2022
Telegram Bot to save Posts or Files that can be Accessed via Special Links

OKAERI-FILE Bot Telegram untuk menyimpan Posting atau File yang dapat Diakses melalui Link Khusus. Jika Anda memerlukan tambahan module lagi dalam rep

Wahyusaputra 5 Aug 04, 2022
Gorrabot is a bot made to automate checks and processes in the development process.

Gorrabot is a Gitlab bot made to automate checks and processes in the Faraday development. Features Check that the CHANGELOG is modified By default, m

Faraday 7 Dec 14, 2022
Easy Discord Webhook Token Grabber!

Easy Discord Webhook Token Grabber!

†† 27 Jun 01, 2022
Simple screen recorder

Kooha Simple screen recorder Description Kooha is a simple screen recorder built with GTK. It allows you to record your screen and also audio from you

Dave Patrick 1.2k Jan 03, 2023
Unirest in Python: Simplified, lightweight HTTP client library.

Unirest for Python Unirest is a set of lightweight HTTP libraries available in multiple languages, built and maintained by Mashape, who also maintain

Kong 432 Dec 21, 2022
A multipurpose Telegram Bot written in Python for mirroring files on the Internet to Google Drive

Mirror Leech Bot Mirror Leech Bot is a multipurpose Telegram Bot written in Python for mirroring files on the Internet to our beloved Google Drive. Ba

1 Jan 01, 2022
An advanced telegram language translator bot

Made with Python3 (C) @FayasNoushad Copyright permission under MIT License License - https://github.com/FayasNoushad/Translator-Bot-V3/blob/main/LICE

Fayas Noushad 19 Dec 24, 2022
Python Wrapper for aztro - The Astrology API | Get Daily Horoscope πŸ’«

PyAztro PyAztro is a client library for aztro written in Python. aztro provides horoscope info for sun signs such as Lucky Number, Lucky Color, Mood,

Sameer Kumar 30 Jan 08, 2023
A Telegram bot that searches for the original source of anime, manga, and art

A Telegram bot that searches for the original source of anime, manga, and art How to use the bot Just send a screenshot of the anime, manga or art or

Kira Kormak 9 Dec 28, 2022