Snakemake worflow to process and filter long read data from Oxford Nanopore Technologies.

Overview

Nanopore-Workflow

Snakemake workflow to process and filter long read data from Oxford Nanopore Technologies. It is designed to compare whole human genome tumor/normal pairs, but can also run individual samples. Reports and plots are generated for de novo genome assembly, differentially methylated regions, copy number variants, and structural variants. Filtering heuristics typically reduce the reported translocations to the break points. It is suggested to have at least 15x - 20x of coverage, and a median read length of at least 5kbp - 6kbp.

nanopore_workflow

Installation instructions

Download the latest code from GitHub:

git clone https://github.com/mike-molnar/nanopore-workflow.git

Before running the workflow, you will need to download the reference genome. I have not included the download as part of the workflow because it is designed to run a cluster that may not have internet access. You can use a local copy of GRCh38 if you have one, but the chromosomes must be named chr1, chr2, ... , and the reference can only contain the autosomes and sex chromosomes. To download the reference genome and index it, change to the reference directory of the workflow and run the script:

cd /path/to/nanopore-workflow/reference
chmod u+x download_reference.sh
./download_reference.sh

To run the workflow copy the Snakefile and config.yaml files to the directory that you want to run the workflow:

cp /path/to/nanopore-workflow/Snakefile /path/to/nanopore-workflow/config.yaml /path/to/samples

Modify the config.yaml file to represent the information for the necessary files and directories of your sample(s). The workflow is currently designed to have a single FASTQ, and a single sequencing summary file in a folder named fastq that is in a folder named after the sample. The config.yaml file provides an example of how to format the initial files and directories before running the workflow.

To run on a grid engine

There are a few different grid engines, so the exact format may be different for your particular grid engine. To run everything except the de novo assembly on a Univa grid engine:

snakemake --jobs 500 --rerun-incomplete --keep-going --latency-wait 30 --cluster "qsub -cwd -V -o snakemake.output.log -e snakemake.error.log -q queue_name -P project_name -pe smp {threads} -l h_vmem={params.memory_per_thread} -l h_rt={params.run_time} -b y" all_but_assembly

You will have to replace queue_name and project_name with the necessary values to run on your grid.

Dependencies

There are many dependencies, so it is best to create a new Conda environment using the YAML files in the env directory. There is a YAML file for the workflow, and another for Medaka. You will need to install a separate environment for QUAST if you are going to run the de novo assembly portion of the workflow. Change to the env directory and create the environments with Conda:

cd /path/to/nanopore-worflow/env
conda env create -n nanopore-workflow -f nanopore-workflow_env.yml
conda env create -n medaka -f medaka_env.yml
conda env create -n quast -f quast_env.yml
conda env create -n R_env -f R_env.yml
conda activate nanopore-workflow

Before running the workflow you will need to export the paths of the four environments to your PATH variable:

export PATH="/path/to/conda/envs/nanopore-workflow/bin:$PATH"
export PATH="/path/to/conda/envs/medaka/bin:$PATH"
export PATH="/path/to/conda/envs/quast/bin:$PATH"
export PATH="/path/to/conda/envs/R_env/bin:$PATH"

nanopore-workflow dependencies:

  • bcftools
  • bedtools
  • cutesv
  • flye
  • longshot
  • nanofilt v2.8.0
  • nanoplot v1.20.0
  • nanopolish
  • seaborn v0.10.0
  • snakemake
  • sniffles
  • survivor
  • svim
  • whatshap
  • winnowmap

R_env dependencies:

  • bioconductor-karyoploter
  • bioconductor-txdb.hsapiens.ucsc.hg38.knowngene
  • bioconductor-org.hs.eg.db
  • bioconductor-dss
  • r-tidyverse
You might also like...
 A simple way to read and write LAPS passwords from linux.
A simple way to read and write LAPS passwords from linux.

A simple way to read and write LAPS passwords from linux. This script is a python setter/getter for property ms-Mcs-AdmPwd used by LAPS inspired by @s

 ⚙️ Compile, Read and update your .conf file in python
⚙️ Compile, Read and update your .conf file in python

⚙️ Compile, Read and update your .conf file in python

Discovering local read-level DNA methylation patterns and DNA methylation heterogeneity in intermediately methylated regions

Discovering local read-level DNA methylation patterns and DNA methylation heterogeneity in intermediately methylated regions

Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences

Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences! Posts are organized by country and destination within that country.

To lazy to read your homework ? Get it done with LOL

LOL To lazy to read your homework ? Get it done with LOL Needs python 3.x L:::::::::L OO:::::::::OO L:::::::::L L:::::::

Pequenos programas variados que estou praticando e implementando, leia o Read.me!

my-small-programs Pequenos programas variados que estou praticando e implementando! Arquivo: automacao Automacao de processos de rotina com código Pyt

Show my read on kindle this year

Show my kindle status on GitHub

Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques
Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques

PURPOSE OF PROJECT That this project will be created by the SOC/Incident Response Community Develop a Catalog of Incident Response Playbook for every

These are After Effects and Python files that were made in the process of creating the video for the contest.

spirograph These are After Effects and Python files that were made in the process of creating the video for the contest. In the python file you can qu

Releases(v0.1.0)
WhyNotWin11 - Detection Script to help identify why your PC isn't Windows 11 Release Ready

WhyNotWin11 - Detection Script to help identify why your PC isn't Windows 11 Release Ready

Robert C. Maehl 5.9k Dec 31, 2022
Blender Addon for Snapping a UV to a specific part of a Tilemap

UVGridSnapper A simple Blender Addon for easier texturing. A menu in the UV editor allows a square UV to be snapped to an Atlas texture, or Tilemap. P

2 Jul 17, 2022
Python most simple|stupid programming language (MSPL)

Most Simple|Stupid Programming language. (MSPL) Stack - Based programming language "written in Python" Features: Interpretate code (Run). Generate gra

Kirill Zhosul 14 Nov 03, 2022
Just a little benchmark for scrapper PC's

PopMark Just a little benchmark for scrapper PC's This benchmark is for old computer that dont support other benchmark because of support. Like lack o

Garry 1 Nov 24, 2021
Python project setup, updater, and launcher

pyLaunch Python project setup, updater, and launcher Purpose: Increase project productivity and provide features easily. Once installed as a git submo

DAAV, LLC 1 Jan 07, 2022
This is the course repository for the Spring 2022 iteration of MACS 30123 "Large-Scale Computing for the Social Sciences" at the University of Chicago.

Large-Scale Computing for the Social Sciences Spring 2022 - MACS 30123/MAPS 30123/PLSC 30123 Instructor Information TA Information TA Information Cour

6 May 06, 2022
emoji-math computes the given python expression and returns either the value or the nearest 5 emojis as measured by cosine similarity.

emoji-math computes the given python expression and returns either the value or the nearest 5 emojis as measured by cosine similarity.

Andrew White 13 Dec 11, 2022
Repo with data from local elections in Portugal from 2009 to 2013

autarquicas - local elections in Portugal Repo with data from local elections in Portugal from 2009 to 2013 Objective To provide, to all, raw data fro

Jorge Gomes 6 Apr 06, 2022
A clipboard where a user can add and retrieve multiple items to and from (resp) from the clipboard cache.

A clipboard where a user can add and retrieve multiple items to and from (resp) from the clipboard cache.

Gaurav Bhattacharjee 2 Feb 07, 2022
A site that went kinda viral that lets you put Bernie Sanders in places

Bernie In Places An app that accidentally went viral! Read the story in WIRED here Install First, create a python virtual environment, and install all

310 Aug 22, 2022
Rotazioni: a linear programming workout split optimizer

Rotazioni: a linear programming workout split optimizer Dependencies Dependencies for the frontend and backend are respectively listed in client/packa

Marco 3 Oct 13, 2022
This tool don't used illegal ativity

ETHICALTOOL This tool for only educational purposes don't used illegal ativity @onlinehacking this tool for pkg update && pkg upgrade && pkg install g

Mrkarthick 4 Dec 23, 2021
Manipulation OpenAI Gym environments to simulate robots at the STARS lab

liegroups Python implementation of SO2, SE2, SO3, and SE3 matrix Lie groups using numpy or PyTorch. [Documentation] Installation To install, cd into t

STARS Laboratory 259 Dec 11, 2022
Pylexa - Artificial Assistant made with Python

Pylexa - Artificial Assistant made with Python Alexa is a famous artificial assistant used massively across the world. It is a substitute of Alexa whi

\_PROTIK_/ 4 Nov 03, 2021
A desktop app to check the unlocked courses bases on previously done courses.

Course Picker A desktop app to check the unlocked courses bases on previously done courses. Table of contents About the Project Built with What it doe

Ahmed Symum Swapno 3 Feb 07, 2022
Adds a Bake node to Blender's shader node system

Bake to Target This Blender Addon adds a new shader node type capable of reducing the texture-bake step to a single button press. Please note that thi

Thomas 8 Oct 04, 2022
Blender addon for executing the operator in response to the received OSC message.

I/F Joiner 受信したOSCメッセージに応じてオペレータ(bpy.ops)を実行するアドオンです. OSC通信に対応したコントローラやアプリをインストールしたスマートフォンを使用してBlenderを操作することが可能になります. 同時開発しているAndroidコントローラ化アプリMocopa

simasimataiyo 6 Oct 02, 2022
HiQ - A Modern Observability System

🦉 A Modern Observability System HiQ is a declarative, non-intrusive, dynamic and transparent tracking system for both monolithic application and dist

Oracle Sample Code 40 Aug 21, 2022
propuestas electorales de los candidatos a constituyentes, Chile 2021

textos-constituyentes propuestas electorales de los candidatos a constituyentes, Chile 2021 Programas descargados desde https://elecciones2021.servel.

Sergio Lucero 6 Nov 19, 2021
Free components that wrap up Python into Delphi and Lazarus (FPC)

Python for Delphi (P4D) is a set of free components that wrap up the Python DLL into Delphi and Lazarus (FPC). They let you easily execute Python scri

747 Jan 02, 2023