Jupyter notebooks and AWS CloudFormation template to show how Hudi, Iceberg, and Delta Lake work

Overview

Modern Data Lake Storage Layers

This repository contains supporting assets for my research in modern Data Lake storage layers like Apache Hudi, Apache Iceberg, and Delta Lake.

Specifically, there's a CloudFormation template to create an EMR cluster and EMR Studio with the necessary requirements and Jupyter notebooks with the example walkthroughs.

You can view the corresponding blog post and video

Pre-requisites

You'll need an AWS Account in which you have administrator privileges and the ability to deploy a CloudFormation template. The template will create an EMR Cluster and S3 bucket that will incur charges - be sure to either shut down the cluster when done or delete the CloudFormation stack. In order to delete the CloudFormation stack, you'll need to:

  • Manually delete any EMR Studio Workspaces you created
  • Manually empty the S3 bucket created by CloudFormation
  • Manually delete the VPC created by CloudFormation due to auto-created rules

Overview

The included CloudFormation template creates a new VPC and EMR Cluster for you to be able to run the notebooks. An EMR Studio is also created and you can find the Studio URL in the Outputs tab of your CloudFormation Stack.

Once the stack is done creating, you'll need to navigate to EMR Studio and create a new workspace attached to the "data-lakes" cluster.

Inside the workspace you either upload each notebook individually from the notebooks/ folder or simply connect to this repository by using the "Git" icon on the left-hand side.

Converts between Spotify's new lyrics (and their proprietary format) to an LRC file for local playback.

spotify-lyrics-to-lrc Converts between Spotify's new lyrics (and their proprietary format) to an LRC file for local playback. How to use: Open Spotify

~noah~ 6 Nov 19, 2022
WikiChecker - Repositorio oficial del complemento WikiChecker para NVDA.

WikiChecker Buscador rápido de artículos en Wikipedia. Introducción. El complemento WikiChecker para NVDA permite a los usuarios consultar de forma rá

2 Jan 10, 2022
Vladilena Mirize Music - Bot Music Telegram By @zenfrans

Vladilena Mirize Music - Bot Music Telegram By @zenfrans

Wahyusaputra 3 Feb 12, 2022
Fully undetected auto skillcheck hack for dead by daylight that works decently well

Auto-skillcheck was made by Love ❌ code ✅ ❔ ・How to use Start off by installing python ofc Open cmd in the same directory and type pip install -r requ

Rdimo 10 Aug 13, 2022
Py hec token mgr - Create HEC tokens in Cribl Stream through the API

Add HEC tokens via API calls This script is intended as an example of how to aut

Jon Rust 3 Mar 04, 2022
A Discord Token Grabber/Stealer But It's in One Line of Coding

Discord-Token-Grabber-But-In-One-Line That's a Discord Token Grabber/Stealer But It's in One Line of Coding! The Name Says All 3

YoSoyAngi 2 Jan 11, 2022
Repositorio dedicado a contener los archivos fuentes del bot de discord "Lector de Ejercicios".

Lector de Ejercicios Este bot de discord está pensado para usarse únicamente en el discord de la materia Algoritmos y Programación I, de la Facultad d

Franco Lighterman Reismann 3 Sep 17, 2022
Braje: a python based credit hacker tool. Hack unlimited RAJE LIKER app Credit

#ReCoded Evan Al Mahmud Irfan ✨ ථ BRAJE 1.0 AUTO LIKER, AUTO COMMENT AND AUTO FOLLOWER APP CREDIT HACKER TOOL About Braje: Braje is a python based cre

Evan Al Mahmud Irfan ථ 2 Dec 23, 2021
Calendars for various securities exchanges.

IMPORTANT NOTE This package is currently unmaintained as the sponsor, quantopian, is going through corporate changes. As such there is a fork of this

Quantopian, Inc. 545 Jan 07, 2023
Available slots checker for Spanish Passport

Bot that checks for available slots to make an appointment to issue the Spanish passport at the Uruguayan consulate page

1 Nov 30, 2021
Starlink Order Status Notification

Starlink Order Status Notification This script logs into Starlink order portal, pulls your estimated delivery date and emails it to a designated email

Aaron R. 1 Jul 08, 2022
修改自SharpNoPSExec的基于python的横移工具 A Lateral Movement Tool Learned From SharpNoPSExec -- Twitter: @juliourena

PyNoPSExec A Lateral Movement Tool Learned From SharpNoPSExec -- Twitter: @juliourena 根据@juliourena大神的SharpNOPsExec项目改写的横向移动工具 Platform(平台): Windows 1

<a href=[email protected]"> 23 Nov 09, 2022
Repository for the IPvSeeYou talk at Black Hat 2021

IPvSeeYou Geolocation Lookup Tool Overview IPvSeeYou.py is a tool to assist with geolocating EUI-64 IPv6 hosts. It takes as input an EUI-64-derived MA

57 Nov 08, 2022
Python wrapper for Xeno-canto API 2.0. Enables downloading bird data with one command line

Python wrapper for Xeno-canto API 2.0. Enables downloading bird data with one command line. Supports multithreading

_zza 9 Dec 10, 2022
Automatically download any NFT collection from OpenSea.

OpenSea NFT Stealer The sole purpose of this script is to download any NFT collection from OpenSea. How does it work? Basically, the OpenSea website a

Dan 111 Dec 29, 2022
[Fullversion]Web3 Pancakeswap Sniper bot written in python3.

🚀 Pancakeswap BSC Sniper Bot 🚀 Web3 Pancakeswap Sniper && Take Profit/StopLose bot written in python3, Please note the license conditions! The secon

21 Dec 11, 2022
🔎 Hunt down social media accounts by username across social networks

Hunt down social media accounts by username across social networks Installation | Usage | Docker Notes | Contributing Installation # clone the repo $

Sherlock 38.2k Jan 01, 2023
Benachrichtigungs-Bot für das niedersächische Impfportal / Notification bot for the lower saxony vaccination portal

Ein kleines Wochenend-Projekt von mir. Der Bot überwacht die REST-API des niedersächsischen Impfportals auf freie Impfslots und sendet eine Benachrichtigung mit deinem bevorzugtem Service. Ab da gilt

sibalzer 37 May 11, 2022
An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. This is Also The Source Code of The Bot Which is Being Used In @SafoTheBot Group! ❤️

Telegram Video Player Bot (Beta) An Telegram Bot By @AsmSafone To Stream Videos in Telegram Voice Chat. Special Features Supports Live Streaming From

SAF ONE 206 Jan 03, 2023
Get charts, top artists and top songs WITHOUT LastFM API

LastFM Get charts, top artists and top songs WITHOUT LastFM API Usage Get stats (charts) We provide many filters and options to customize. Geo filter

4 Feb 11, 2022