MHtyper is an end-to-end pipeline for recognizing forensic microhaplotypes in Nanopore sequencing data.

Last update: Jun 27, 2022

Overview

MHtyper is an end-to-end pipeline for recognizing forensic microhaplotypes in Nanopore sequencing data. It is implemented in Python.

MHtyper workflow

Step 1:

Sequencing data are filtered with NanoFilt and aligned with minimap2.

Step 2:

Files in both BED and pileup format are generated.

Step 3:

Phasing with Margin; correction with isONcorrect; haplotype analysis by MHtyper; integration of the analysis results to obtain the final microhaplotype results.
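
As a rough illustration of Steps 1 and 2, the sketch below only builds the shell command strings for filtering, alignment, and BED/pileup generation; it does not run them and is not taken from the MHtyper source. The function name, the output file names, and the NanoFilt thresholds (`min_quality`, `min_length`) are assumptions chosen for the example.

```python
import shlex


def build_preprocessing_commands(fastq, reference, prefix,
                                 min_quality=7, min_length=100):
    """Return shell command strings for Steps 1 and 2 (hypothetical helper).

    The quality/length thresholds and output names are illustrative
    assumptions, not values used by MHtyper itself.
    """
    fq = shlex.quote(fastq)
    ref = shlex.quote(reference)
    bam = shlex.quote(f"{prefix}.sorted.bam")
    bed = shlex.quote(f"{prefix}.bed")
    pileup = shlex.quote(f"{prefix}.pileup")
    return [
        # Step 1: filter reads with NanoFilt, align with minimap2, sort the BAM.
        f"gunzip -c {fq} | NanoFilt -q {min_quality} -l {min_length} "
        f"| minimap2 -ax map-ont {ref} - | samtools sort -o {bam} -",
        f"samtools index {bam}",
        # Step 2: derive BED and pileup files from the sorted alignment.
        f"bedtools bamtobed -i {bam} > {bed}",
        f"samtools mpileup -f {ref} {bam} > {pileup}",
    ]


if __name__ == "__main__":
    for command in build_preprocessing_commands("test.fq.gz", "hg19.fa", "Test"):
        print(command)
```

Printing the commands before running them makes it easy to check paths and thresholds against your own data.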

Python environment construction and required software installation

   conda create -n MHtyper
   conda activate MHtyper
   conda config --add channels bioconda
   conda config --add channels conda-forge
   conda install -y NanoFilt minimap2 samtools bedtools
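
After installation, it can help to confirm that the four tools are visible on `PATH` inside the activated environment. The following is a minimal sketch; `missing_tools` and `REQUIRED_TOOLS` are hypothetical names introduced here and are not part of MHtyper.

```python
import shutil

# Tools installed by the conda command above.
REQUIRED_TOOLS = ("NanoFilt", "minimap2", "samtools", "bedtools")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```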

isONcorrect & Margin installation

isONcorrect: https://github.com/ksahlin/isONcorrect [1]

Margin: https://github.com/UCSC-nanopore-cgl/margin

# isONcorrect installation

   git clone https://github.com/ksahlin/isONcorrect.git
   cd isONcorrect 
   ./isONcorrect

# Margin installation
# step1:
   sudo apt-get install git make gcc g++ autoconf zlib1g-dev libcurl4-openssl-dev libbz2-dev libhdf5-dev
   wget https://github.com/Kitware/CMake/releases/download/v3.14.4/cmake-3.14.4-Linux-x86_64.sh && sudo mkdir /opt/cmake && sudo sh cmake-3.14.4-Linux-x86_64.sh --prefix=/opt/cmake --skip-license && sudo ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
   cmake --version

# step2: Check out the repository and submodules:

   git clone https://github.com/UCSC-nanopore-cgl/margin.git
   cd margin
   git submodule update --init

# step3: Make build directory:

   mkdir build
   cd build

# step4: Generate Makefile and run:

   cmake .. 
   make
   ./margin

MHtyper installation

   git clone https://github.com/willow2333/MHtyper.git
   cd MHtyper 
   python run.py --h
   
   usage: run.py [-h] [--fastqfiles FASTQFILES] [--reference REFERENCE] [--prefix PREFIX] [--truthvcf TRUTHVCF] [--marginpath MARGINPATH]

    optional arguments:
      -h, --help            show this help message and exit
      --fastqfiles FASTQFILES
                            The input *.fq.gz files.
      --reference REFERENCE
                            The path of your ref.
      --prefix PREFIX       The name of your Sample, default is "Test".
      --truthvcf TRUTHVCF   The truth variant files in your research.
      --marginpath MARGINPATH
                            The setup path of "Margin".
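
For readers who want to drive the pipeline from their own scripts, the help text above can be mirrored with `argparse`. This is a reconstruction from the printed usage only, not the actual contents of `run.py`; `build_parser` is a hypothetical name, and the only default shown is the `"Test"` prefix stated in the help text.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the options listed in the help output above."""
    parser = argparse.ArgumentParser(prog="run.py")
    parser.add_argument("--fastqfiles", help="The input *.fq.gz files.")
    parser.add_argument("--reference", help="The path of your ref.")
    parser.add_argument("--prefix", default="Test",
                        help='The name of your Sample, default is "Test".')
    parser.add_argument("--truthvcf",
                        help="The truth variant files in your research.")
    parser.add_argument("--marginpath", help='The setup path of "Margin".')
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["--fastqfiles", "test.fq.gz", "--reference", "hg19.fa"]
    )
    print(args.prefix)
```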

Illustration

1. Test

   cd ./Test
   python ../run.py --fastqfiles test.fq.gz --reference path/hg19.fa --prefix Test --truthvcf truthvcf.txt  --marginpath path/margin
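
The same test run can be launched from Python, for example when looping over several samples. The sketch below assumes it is called with the `Test` directory as working directory; `build_run_command` and `run_sample` are hypothetical helpers, and `path/hg19.fa` and `path/margin` remain placeholders exactly as in the command above.

```python
import subprocess
import sys


def build_run_command(fastq, reference, truthvcf, marginpath,
                      prefix="Test", script="../run.py"):
    """Return the argument list for one run.py invocation."""
    return [
        sys.executable, script,
        "--fastqfiles", fastq,
        "--reference", reference,
        "--prefix", prefix,
        "--truthvcf", truthvcf,
        "--marginpath", marginpath,
    ]


def run_sample(workdir, **kwargs):
    """Run the pipeline in `workdir`; raises CalledProcessError on failure."""
    return subprocess.run(build_run_command(**kwargs), cwd=workdir, check=True)


if __name__ == "__main__":
    # Only print the command here; call run_sample(...) to execute it.
    print(build_run_command("test.fq.gz", "path/hg19.fa",
                            "truthvcf.txt", "path/margin"))
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.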

2. Required sites file

A snp-sites.txt file containing the site information for the samples is required.

3. Output

The microhaplotype analysis results are written to finalphase.txt.
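
The column layout of finalphase.txt is not described here, so the sketch below makes no assumption about it and simply returns the first few lines of the file for a quick look. `preview` is a hypothetical helper, and the file is assumed to be plain UTF-8 text.

```python
from itertools import islice
from pathlib import Path


def preview(path, n=5):
    """Return the first n lines of a text file, without trailing newlines."""
    with Path(path).open(encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in islice(handle, n)]


if __name__ == "__main__":
    result_file = Path("finalphase.txt")
    if result_file.exists():
        for line in preview(result_file):
            print(line)
    else:
        print("finalphase.txt not found; run the pipeline first.")
```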

Citation

1. Sahlin, K., Medvedev, P. Error correction enables use of Oxford Nanopore technology for reference-free transcriptome analysis. Nat Commun 12, 2 (2021). https://doi.org/10.1038/s41467-020-20340-8

Email: Yiping Hou ([email protected]), Zheng Wang ([email protected]), Liu Qin ([email protected])


Owner

willow
