A spatial genome aligner for analyzing multiplexed DNA-FISH imaging data.

jie

jie is a spatial genome aligner. This package parses true chromatin imaging signal from noise by aligning signals to a reference DNA polymer model.

The codename is a tribute to the Chinese homophones:

  • 结 (jié) : a knot, a nod to the mysterious and often entangled structures of DNA
  • 解 (jiě) : to solve, to untie, our bid to uncover these structures amid noise and uncertainty
  • 姐 (jiě) : sister, our ability to resolve tightly paired replicated chromatids

Installation

Step 1 - Clone this repository:

git clone https://github.com/b2jia/jie.git
cd jie

Step 2 - Create a new conda environment and install dependencies:

conda env create --name jie -f environment.yml
conda activate jie

Step 3 - Install jie:

pip install -e .

To test, run:

python -W ignore test/test_jie.py

Usage

jie is an exposition of chromatin tracing using polymer physics. The main function of this package is to illustrate the utility and power of spatial genome alignment.

jie is NOT an all-purpose spatial genome aligner. Chromatin imaging is a nascent field and data collection is still being standardized. This aligner may not be compatible with different imaging protocols and data formats, among other variables.

We provide a vignette under jie/jupyter/, with emphasis on inspectability. This walks through the intuition of our spatial genome alignment and polymer fiber karyotyping routines:

00-spatial-genome-alignment-walk-thru.ipynb

We also provide a series of Jupyter notebooks (jie/jupyter/), with emphasis on reproducibility. These reproduce figures from our accompanying manuscript:

01-seqFISH-plus-mouse-ESC-spatial-genome-alignment.ipynb
02-seqFISH-plus-mouse-ESC-polymer-fiber-karyotyping.ipynb
03-seqFISH-plus-mouse-brain-spatial-genome-alignment.ipynb
04-seqFISH-plus-mouse-brain-polymer-fiber-karyotyping.ipynb
05-bench-mark-spatial-genome-agignment-against-chromatin-tracing-algorithm.ipynb

A command-line tool is forthcoming.

Motivation

Multiplexed DNA-FISH is a powerful imaging technology that enables us to peer directly at the spatial location of genes inside the nucleus. Each gene appears as a tiny dot under imaging.

Pivotally, figuring out which dots are physically linked would trace out the structure of chromosomes. Unfortunately, imaging is noisy, and single-cell biology is extremely variable. The two confound each other, making chromatin tracing prohibitively difficult!

For instance, in a diploid cell line with two copies of a gene we expect to see two spots. But what happens when we see:

  • Extra signals:
    • Is it noise?
      • Off-target labeling: The FISH probes might inadvertently label an off-target gene
    • Or is it biological variation?
      • Aneuploidy: A cell (e.g. a cancerous cell) may have more than one copy of a gene
      • Cell cycle: When a cell gets ready to divide, it duplicates its genes
  • Missing signals:
    • Is it noise?
      • Poor probe labeling: The FISH probes never labeled the intended target gene
    • Or is it biological variation?
      • Copy Number Variation: A cell may have a gene deletion

If true signal and noise are indistinguishable, how do we know we are selecting true signals during chromatin tracing? It is not obvious which spots should be connected as part of a chromatin fiber. This dilemma was first aptly characterized by Ross et al. (https://journals.aps.org/pre/abstract/10.1103/PhysRevE.86.011918), whose analysis was nothing short of prescient!

jie is, conceptually, a spatial genome aligner that disambiguates spot selection by checking each imaged signal against a reference polymer physics model of chromatin. It relies on the key insight that the spatial separation between two genes should be congruent with their genomic separation.
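
To make this insight concrete, here is a minimal sketch. It is not jie's actual implementation: it assumes an ideal Gaussian chain as the reference polymer, so each axis of the displacement between two loci is normally distributed with a variance that grows linearly with genomic separation. The scale NM2_PER_KB and the function log_bond_likelihood are hypothetical, chosen only for illustration.

```python
import math

# Hypothetical scale: per-axis variance (nm^2) added by each kb of genomic
# separation. An illustrative value, not one taken from jie.
NM2_PER_KB = 100.0


def log_bond_likelihood(xyz_a, xyz_b, genomic_kb):
    """Log-density of the 3D displacement between two spots, assuming an
    ideal Gaussian chain: each axis ~ Normal(0, NM2_PER_KB * genomic_kb)."""
    if genomic_kb <= 0:
        raise ValueError("genomic_kb must be positive")
    var = NM2_PER_KB * genomic_kb
    sq_dist = sum((a - b) ** 2 for a, b in zip(xyz_a, xyz_b))
    return -1.5 * math.log(2 * math.pi * var) - sq_dist / (2 * var)


# Two candidate partners (coordinates in nm) for a locus 100 kb downstream
# of a spot at the origin.
near = log_bond_likelihood((0, 0, 0), (100, 50, 0), genomic_kb=100)
far = log_bond_likelihood((0, 0, 0), (2000, 1500, 0), genomic_kb=100)
print(near > far)  # the nearby spot is the more plausible neighbor
```

Under this assumed model, a spot that sits micrometers away from a locus only 100 kb upstream scores far lower than a spot a few hundred nanometers away, which is the sense in which spatial and genomic separation should agree.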

It makes no assumptions about the expected copy number of a gene. Instead, it traces chromatin by evaluating the physical likelihood of each candidate chromatin fiber. In doing so, we can uncover copy number variations and even sister chromatids from multiplexed DNA-FISH imaging data.
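
As a rough illustration of likelihood-based tracing, the toy sketch below picks one spot per locus so that the summed log-likelihood of consecutive bonds is maximal, using dynamic programming. It is a deliberate simplification and not jie's algorithm: it assumes a Gaussian chain polymer, exactly one fiber, and at least one candidate spot at every locus, so it cannot skip missing loci or recover extra copies. All names and the NM2_PER_KB value are hypothetical.

```python
import math

NM2_PER_KB = 100.0  # hypothetical per-axis variance (nm^2) per kb


def log_bond(a, b, genomic_kb):
    """Gaussian-chain log-density of the displacement between spots a and b."""
    var = NM2_PER_KB * genomic_kb
    sq = sum((p - q) ** 2 for p, q in zip(a, b))
    return -1.5 * math.log(2 * math.pi * var) - sq / (2 * var)


def trace_fiber(positions_kb, candidates):
    """Choose one candidate spot per locus, maximizing total bond likelihood.

    positions_kb: genomic coordinate (kb) of each locus, strictly ascending.
    candidates:   for each locus, a non-empty list of (x, y, z) spots in nm.
    Returns (best_score, chosen candidate index per locus).
    """
    best = [0.0] * len(candidates[0])  # best path score ending at each spot
    back = []                          # back-pointers, one list per step
    for i in range(1, len(candidates)):
        gap = positions_kb[i] - positions_kb[i - 1]
        new_best, pointers = [], []
        for spot in candidates[i]:
            scores = [best[k] + log_bond(prev, spot, gap)
                      for k, prev in enumerate(candidates[i - 1])]
            k_best = max(range(len(scores)), key=scores.__getitem__)
            new_best.append(scores[k_best])
            pointers.append(k_best)
        best = new_best
        back.append(pointers)

    # Backtrack from the best final spot to recover the full path.
    j = max(range(len(best)), key=best.__getitem__)
    score, path = best[j], [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return score, path


positions_kb = [0, 100, 200]
candidates = [
    [(0, 0, 0)],
    [(3000, 0, 0), (80, 40, 0)],       # first spot is a distant outlier
    [(150, 90, 20), (5000, 5000, 0)],  # second spot is a distant outlier
]
score, path = trace_fiber(positions_kb, candidates)
print(path)  # → [0, 1, 0]
```

The distant outliers are rejected because joining them to the fiber would require bonds that are physically implausible for a 100 kb genomic gap.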

Citation

Contact

Author: Bojing (Blair) Jia
Email: b2jia at eng dot ucsd dot edu
Position: MD-PhD Student, Ren Lab

For other work related to single-cell biology, 3D genome, and chromatin imaging, please visit Prof. Bing Ren's website: http://renlab.sdsc.edu/
