Snakemake worflow to process and filter long read data from Oxford Nanopore Technologies.

Overview

Nanopore-Workflow

Snakemake workflow to process and filter long read data from Oxford Nanopore Technologies. It is designed to compare whole human genome tumor/normal pairs, but can also run individual samples. Reports and plots are generated for de novo genome assembly, differentially methylated regions, copy number variants, and structural variants. Filtering heuristics typically reduce the reported translocations to the break points. It is suggested to have at least 15x - 20x of coverage, and a median read length of at least 5kbp - 6kbp.

nanopore_workflow

Installation instructions

Download the latest code from GitHub:

git clone https://github.com/mike-molnar/nanopore-workflow.git

Before running the workflow, you will need to download the reference genome. I have not included the download as part of the workflow because it is designed to run a cluster that may not have internet access. You can use a local copy of GRCh38 if you have one, but the chromosomes must be named chr1, chr2, ... , and the reference can only contain the autosomes and sex chromosomes. To download the reference genome and index it, change to the reference directory of the workflow and run the script:

cd /path/to/nanopore-workflow/reference
chmod u+x download_reference.sh
./download_reference.sh

To run the workflow copy the Snakefile and config.yaml files to the directory that you want to run the workflow:

cp /path/to/nanopore-workflow/Snakefile /path/to/nanopore-workflow/config.yaml /path/to/samples

Modify the config.yaml file to represent the information for the necessary files and directories of your sample(s). The workflow is currently designed to have a single FASTQ, and a single sequencing summary file in a folder named fastq that is in a folder named after the sample. The config.yaml file provides an example of how to format the initial files and directories before running the workflow.

To run on a grid engine

There are a few different grid engines, so the exact format may be different for your particular grid engine. To run everything except the de novo assembly on a Univa grid engine:

snakemake --jobs 500 --rerun-incomplete --keep-going --latency-wait 30 --cluster "qsub -cwd -V -o snakemake.output.log -e snakemake.error.log -q queue_name -P project_name -pe smp {threads} -l h_vmem={params.memory_per_thread} -l h_rt={params.run_time} -b y" all_but_assembly

You will have to replace queue_name and project_name with the necessary values to run on your grid.

Dependencies

There are many dependencies, so it is best to create a new Conda environment using the YAML files in the env directory. There is a YAML file for the workflow, and another for Medaka. You will need to install a separate environment for QUAST if you are going to run the de novo assembly portion of the workflow. Change to the env directory and create the environments with Conda:

cd /path/to/nanopore-worflow/env
conda env create -n nanopore-workflow -f nanopore-workflow_env.yml
conda env create -n medaka -f medaka_env.yml
conda env create -n quast -f quast_env.yml
conda env create -n R_env -f R_env.yml
conda activate nanopore-workflow

Before running the workflow you will need to export the paths of the four environments to your PATH variable:

export PATH="/path/to/conda/envs/nanopore-workflow/bin:$PATH"
export PATH="/path/to/conda/envs/medaka/bin:$PATH"
export PATH="/path/to/conda/envs/quast/bin:$PATH"
export PATH="/path/to/conda/envs/R_env/bin:$PATH"

nanopore-workflow dependencies:

  • bcftools
  • bedtools
  • cutesv
  • flye
  • longshot
  • nanofilt v2.8.0
  • nanoplot v1.20.0
  • nanopolish
  • seaborn v0.10.0
  • snakemake
  • sniffles
  • survivor
  • svim
  • whatshap
  • winnowmap

R_env dependencies:

  • bioconductor-karyoploter
  • bioconductor-txdb.hsapiens.ucsc.hg38.knowngene
  • bioconductor-org.hs.eg.db
  • bioconductor-dss
  • r-tidyverse
You might also like...
 A simple way to read and write LAPS passwords from linux.
A simple way to read and write LAPS passwords from linux.

A simple way to read and write LAPS passwords from linux. This script is a python setter/getter for property ms-Mcs-AdmPwd used by LAPS inspired by @s

 ⚙️ Compile, Read and update your .conf file in python
⚙️ Compile, Read and update your .conf file in python

⚙️ Compile, Read and update your .conf file in python

Discovering local read-level DNA methylation patterns and DNA methylation heterogeneity in intermediately methylated regions

Discovering local read-level DNA methylation patterns and DNA methylation heterogeneity in intermediately methylated regions

Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences

Users can read others' travel journeys in addition to being able to upload and delete posts detailing their own experiences! Posts are organized by country and destination within that country.

To lazy to read your homework ? Get it done with LOL

LOL To lazy to read your homework ? Get it done with LOL Needs python 3.x L:::::::::L OO:::::::::OO L:::::::::L L:::::::

Pequenos programas variados que estou praticando e implementando, leia o Read.me!

my-small-programs Pequenos programas variados que estou praticando e implementando! Arquivo: automacao Automacao de processos de rotina com código Pyt

Show my read on kindle this year

Show my kindle status on GitHub

Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques
Incident Response Process and Playbooks | Goal: Playbooks to be Mapped to MITRE Attack Techniques

PURPOSE OF PROJECT That this project will be created by the SOC/Incident Response Community Develop a Catalog of Incident Response Playbook for every

These are After Effects and Python files that were made in the process of creating the video for the contest.

spirograph These are After Effects and Python files that were made in the process of creating the video for the contest. In the python file you can qu

Releases(v0.1.0)
NewsBlur is a personal news reader bringing people together to talk about the world.

NewsBlur NewsBlur is a personal news reader bringing people together to talk about the world.

Samuel Clay 6.2k Dec 29, 2022
Lectures for Udemy - Complete Python Bootcamp Course

Complete-Python-Bootcamp Welcome to the Repository for the Complete Python Bootcamp! This is the Repository for the Udemy course - "Complete Python Bo

Marci 2k Dec 28, 2022
pvaPy provides Python bindings for EPICS pvAccess

PvaPy - PvAccess for Python The PvaPy package is a Python API for EPICS7. It supports both PVA and CA providers, all standard EPICS7 types (structures

EPICS Base 25 Dec 05, 2022
This is the key combo trainer for League of Legends and Dota 2 players.

This is the key combo trainer for League of Legends and Dota 2 players. Place the mouse cursor on the blue point and press the key combo from the upper-left side of the screen.

Ilya Shpigor 1 Jan 31, 2022
Statistics Calculator module for all types of Stats calculations.

Statistics-Calculator This Calculator user the formulas and methods to find the statistical values listed. Statistics Calculator module for all types

2 May 29, 2022
A calculator developed in Python.

Calculadora Uma simples calculadora... ( + − × ÷ ) 💻 Situação do projeto: Projeto finalizado ✔️ 🛠 Tecnologias: Python Tkinter (GUI) ⚙️ Pré-requisito

Arthur V.B.S. 1 Jan 27, 2022
A tool for removing PUPs using signatures

Unwanted program removal tool A tool for removing PUPs using signatures What is the unwanted program removal tool? The unwanted program removal tool i

4 Sep 20, 2022
solsim is the Solana complex systems simulator. It simulates behavior of dynamical systems—DeFi protocols, DAO governance, cryptocurrencies, and more—built on the Solana blockchain

solsim is the Solana complex systems simulator. It simulates behavior of dynamical systems—DeFi protocols, DAO governance, cryptocurrencies, and more—built on the Solana blockchain

William Wolf 12 Jul 13, 2022
Semester Project on Signal Processing @CS UCU 2021

Blur Detection with Haar Wavelet Transform Requirements Python3 opencv-python PyWavelets Install these using the following command: $ pip install -r r

ButynetsD 2 Oct 15, 2022
Annotates sequences with Eggnog-mapper and hhblits against PDB70

Annotating "hypothetical" proteins with the PDB See config/ for configuration information. This workflow takes as input a set of protein sequences. It

1 Apr 05, 2022
TimeWizard - A script that generates every single Time Wizard EDOPRO lflist possible

EDOPRO F&L list generator This project is just a script that generates every sin

Diamond Dude 2 Sep 28, 2022
GUI tool to manage the contents of chests in Botw

Botw chest manager is a small gui tool allowing to easily manage chests. Sometimes Ice Spear can be very time consuming when adding a simple chest. The purpose of this light tool is to add a new ches

3 Aug 25, 2022
A parallel branch-and-bound engine for Python.

pybnb A parallel branch-and-bound engine for Python. This software is copyright (c) by Gabriel A. Hackebeil (gabe.hacke

Gabriel Hackebeil 52 Nov 12, 2022
Rotating cube with hand

I am still working on this project :)) To-Do(Present): = It needs an algorithm to fine tune my hand's coordinates for rotation of our cube (initial o

Jay Desale 2 Dec 26, 2021
Project5 Data processing system

Project5-Data-processing-system User just needed to copy both these file to a folder and open Project5.py using cmd or using any python ide. It is to

1 Nov 23, 2021
A few of my adventures with Devito.

Devito-playbox A few of my adventures with Devito. This repository contains a few notebooks and scripts that will lead me in the road of learning this

Átila Saraiva Quintela Soares 1 Feb 08, 2022
This repo is a collection of programs and websites templates too

📢 Register here for Hacktoberfest and make four pull requests (PRs) between October 1st-31st to grab free SWAGS 🔥 . IMPORTANT While making pull requ

Binayak Jha - 2 7 Oct 03, 2022
An application for automation of the mining function in the game Alienworlds.IO

alienautomation A Python script made to automate the tidious job of mining on AlienWorlds This script: Automatically opens the browser Automatically l

anonieXdev 42 Dec 03, 2022
Nateve transpiler developed with python.

Adam Adam is a Nateve Programming Language transpiler developed using Python. Nateve Nateve is a new general domain programming language open source i

Nateve 7 Jan 15, 2022
Blender Light Manipulation - A script that makes it easier to work with light

Blender Light Manipulation A script that makes it easier to work with light 1. Wstęp W poniższej dokumentacji przedstawiony zostanie skrypt, który swo

Tomasz 1 Oct 19, 2021