Pyspark sam - Analyze Big Sequence Alignments with PySpark in AWS EMR

Overview

pyspark_sam

This repo hosts my code for the article "Analyze Big Sequence Alignments with PySpark in AWS EMR".

Prerequisite

  1. Spark

  2. AWS CLI

  3. AWS Account

Run

Follow the instruction in the article. Once you have uploaded the files into your S3 bucket, run

aws emr create-cluster --name "Spark_step_pip" \
    --release-label emr-6.5.0 \
    --applications Name=Spark \
    --log-uri s3://[your_S3_bucket]/logs/ \
    --instance-type m5.xlarge \
    --instance-count 3 \
    --bootstrap-actions Path=s3://[your_S3_bucket]/emr_bootstrap.sh \
    --use-default-roles --auto-terminate \
    --steps "Type=Spark,Name=SparkProgram,ActionOnFailure=CONTINUE,Args=[--deploy-mode,cluster,--master,yarn,--py-files,s3://[your_S3_bucket]/helper_function.py,s3://[your_S3_bucket]/spark_3mer.py,s3://[your_S3_bucket]/test.sam,[your_S3_bucket],sankey.json]" 

When the job finishes, download the sankey.json. And run this command to visualize:

python sankey.py sankey.json

Authors

  • Sixing Huang - Concept and Coding

License

This project is licensed under the MIT License - see the LICENSE file for details

Owner
Sixing Huang
A triple Neo4j certified data scientist. I am currently working at BGI in Shenzhen.
Sixing Huang
Scrapes an instagram user's photos and videos

Instagram Scraper instagram-scraper is a command-line application written in Python that scrapes and downloads an instagram user's photos and videos.

7.3k Nov 18, 2022
A nuker for Roblox accounts.

Roblox-Nuker A nuker for Roblox accounts. Made by Ice Bear#0167 Usage I would recommend running in replit (https://replit.com) as it is deprecated in

7 May 10, 2022
This Wrapper is a Discum Copy With Addons, original one is made by Merubokkusu

Remaded Discum Its not Official Discum Wrapper ! This Wrapper is a Discum Copy With Addons, original one is made by Merubokkusu Authors @merubokkusu (

discum-remaded 8 Aug 09, 2022
Telegram bot to check availability of vaccination slots in India.

cowincheckbot Telegram bot to check availability of vaccination slots in India. Setup Install requirements using pip3 install -r requirements.txt Crea

Muhammed Shameem 10 Jun 11, 2022
A chatbot on Telegram using technologies of cloud computing.

Chatbot This project is about a chatbot on Telegram to study the cloud computing. You can refer to the project of chatbot-deploy which is conveinent f

Jeffery 4 Apr 24, 2022
Telegram chats reader with python

Telegram chat reader Программа полностью сливает чаты телеграм в базу данных PostgreSQL. Для использования нужен развернутый сервер постгрес и телегра

Anton 4 Dec 24, 2021
Telegram group manager moderen and simple.

Upin Robot A Advanced Powerful, Smart And Intelligent Group Management Bot With New And Powerful Features ... Written with Pyrogram and Telethon... If

Muhammad Nawawi 3 Dec 23, 2021
Louis Manager Bot With Python

✨ Natsuki ✨ Are You Okay Baby I'm Natsuki Unmaintained. The new repo of @TheNatsukiBot is public. ⚡ (It is no longer based on this source code. The co

Team MasterXBots 1 Nov 07, 2021
Best Buy purchase bot

B3 Best-Buy-Bot. Written in Python NOTICE: Don't be a disgrace to society. Don't use this for any mass buying/reselling purposes. About B3 is a bot th

Dogey11 8 Aug 15, 2022
We propose the adversarial blur attack (ABA) against visual object tracking.

ABA We propose the adversarial blur attack (ABA) against visual object tracking. The ICCV link: https://arxiv.org/abs/2107.12085 and, https://openacce

Qing Guo 13 Dec 01, 2022
𝐀 𝐦𝐨𝐝𝐮𝐥𝐚𝐫 𝐓𝐞𝐥𝐞𝐠𝐫𝐚𝐦 𝐆𝐫𝐨𝐮𝐩 𝐦𝐚𝐧𝐚𝐠𝐞𝐦𝐞𝐧𝐭 𝐛𝐨𝐭 𝐰𝐢𝐭𝐡 𝐮𝐥𝐭𝐢𝐦𝐚𝐭𝐞 𝐟𝐞𝐚𝐭𝐮𝐫𝐞𝐬 !!

𝐇𝐨𝐰 𝐓𝐨 𝐃𝐞𝐩𝐥𝐨𝐲 For easiest way to deploy this Bot click on the below button 𝐌𝐚𝐝𝐞 𝐁𝐲 𝐒𝐮𝐩𝐩𝐨𝐫𝐭 𝐆𝐫𝐨𝐮𝐩 𝐒𝐨𝐮𝐫𝐜𝐞𝐬 𝐆𝐞𝐧𝐞?

Mukesh Solanki 4 Oct 18, 2021
Terraform Cloud CLI for Managing Workspace Terraform Versions

Terraform Cloud Version Manager This tiny script makes it easy to update the Terraform Version on all of the Workspaces inside Terraform Cloud. It wil

Robert Hafner 1 Jan 07, 2022
A simple Python library to integrate with the Heron Data API

Heron Python This library provides easy access to the Heron Data API from applications written in Python. Documentation No language-specific docs are

Heron Data 11 Nov 11, 2022
A unified API wrapper for YouTube and Twitch chat bots.

Chatto A unified API wrapper for YouTube and Twitch chat bots. Contributing Chatto is open to contributions. To find out where to get started, have a

Ethan Henderson 5 Aug 01, 2022
GitHub action to deploy serverless functions to YandexCloud

YandexCloud serverless function deploy action Deploy new serverless function version (including function creation if it does not exist). Inputs yc_acc

Много Лосося 4 Apr 10, 2022
Python wrapper for Interactive Brokers Client Portal Web API

EasyIB: Unofficial Wrapper for Interactive Brokers API EasyIB is an unofficial python wrapper for Interactive Brokers Client Portal Web API. Features

39 Dec 13, 2022
This is the repository for HalpyBOT, the Hull Seals IRC Chatbot Assistant.

HalpyBOT 1.4.2 This is the repository for HalpyBOT, the Hull Seals IRC Chatbot Assistant. Description This repository houses all of the files required

The Hull Seals 3 Nov 03, 2022
Python linting made easy. Also a casual yet honorific way to address individuals who have entered an organization prior to you.

pysen What is pysen? pysen aims to provide a unified platform to configure and run day-to-day development tools. We envision the following scenarios i

Preferred Networks, Inc. 452 Jan 05, 2023
A Bot To Get Info Of Telegram messages , Media , Channel id Group ID etc.

Info-Bot A Bot To Get Info Of Telegram messages , Media , Channel id Group ID etc. Get Info Of Your And Messages , Channels , Groups ETC... How to mak

Vɪᴠᴇᴋ 23 Nov 12, 2022