Benchmarking Pipeline for Prediction of Protein-Protein Interactions

Last update: Jun 27, 2022

B4PPI

Benchmarking Pipeline for the Prediction of Protein-Protein Interactions

Our preprint describes how this benchmarking pipeline was built and how to use it (please cite it if you find this work useful!).

A minimal example is available here, and the list of requirements there.

How to use the gold standard

All data files are in the data directory. Most are available both as CSV (sep='|') and as pickled pandas DataFrames; the CSV file may sometimes be missing because of file size constraints on GitHub.

The gold standard, without pre-processed features, can be loaded using:

import os

import pandas as pd

# The CSV files use '|' as the column separator
goldStandard = pd.read_csv(
    os.path.join('data', 'benchmarkingGS_v1-0.csv'),
    sep='|'
)

Or with the pre-processed features:

goldStandard_with_featuresSeq = pd.read_pickle(
    os.path.join('data', 'benchmarkingGS_v1-0_similarityMeasure_sequence_v3-1.pkl')
)
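
Because a CSV file may be missing for the larger tables, a small helper can fall back to the pickled version. The sketch below is illustrative only: load_table is a hypothetical helper, not part of the repository, and it assumes the CSV and pickle files share the same file stem.

```python
import os

import pandas as pd


def load_table(stem, data_dir="data"):
    """Load <data_dir>/<stem>.csv if it exists, otherwise <data_dir>/<stem>.pkl.

    Hypothetical helper: assumes both formats share the same file stem.
    """
    csv_path = os.path.join(data_dir, stem + ".csv")
    pkl_path = os.path.join(data_dir, stem + ".pkl")
    if os.path.exists(csv_path):
        # The CSV files use '|' as the column separator
        return pd.read_csv(csv_path, sep="|")
    return pd.read_pickle(pkl_path)
```

For example, load_table('benchmarkingGS_v1-0') would read the gold standard from whichever format is present in the data directory.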

UniProtIDs are used for both proteins A and B.
isInteraction is the ground truth from the IntAct database (1 = interacting proteins, 0 = non-interacting proteins).
trainTest is the split between training set (train), first testing set T1 (test1) and second testing set T2 (test2).
Pre-processed features are explained in the manuscript.
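
The trainTest column can be used to separate the three subsets. The sketch below runs on a toy DataFrame rather than the real file: the protein column names (uniprotID_A, uniprotID_B) and the identifiers are placeholders chosen for illustration, and only isInteraction and trainTest, with the values train, test1 and test2, come from the description above.

```python
import pandas as pd

# Toy stand-in for the gold standard. The protein column names and IDs
# are placeholders; check the real column names after loading the file.
toy = pd.DataFrame({
    "uniprotID_A": ["A1", "A2", "A3", "A4", "A5", "A6"],
    "uniprotID_B": ["B1", "B2", "B3", "B4", "B5", "B6"],
    "isInteraction": [1, 0, 1, 0, 1, 0],
    "trainTest": ["train", "train", "train", "test1", "test1", "test2"],
})


def split_gold_standard(df):
    """Return one sub-DataFrame per split label (train, test1, test2)."""
    return {
        name: df[df["trainTest"] == name]
        for name in ("train", "test1", "test2")
    }


splits = split_gold_standard(toy)
for name, part in splits.items():
    # Number of pairs and fraction of interacting pairs in each split
    print(name, len(part), part["isInteraction"].mean())
```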

Training and evaluation can then proceed as for any binary classification task, using isInteraction as the label and trainTest to select the subsets. The code from the preprint is in the Training section.
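
As a rough illustration of that workflow, the sketch below trains a scikit-learn logistic regression on the train subset and reports ROC AUC on test1 and test2. It is not the model or the metric setup from the preprint: the data are synthetic, and feature_1 and feature_2 are made-up feature names standing in for the pre-processed features.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic stand-in for the gold standard with pre-processed features.
rng = np.random.default_rng(0)
n = 300
labels = rng.integers(0, 2, size=n)
toy = pd.DataFrame({
    "feature_1": 2 * labels + rng.normal(0, 1, size=n),  # informative
    "feature_2": rng.normal(0, 1, size=n),               # pure noise
    "isInteraction": labels,
    "trainTest": rng.choice(
        ["train", "test1", "test2"], size=n, p=[0.6, 0.2, 0.2]
    ),
})
feature_cols = ["feature_1", "feature_2"]  # hypothetical feature names

# Fit on the training subset only
train = toy[toy["trainTest"] == "train"]
model = LogisticRegression().fit(train[feature_cols], train["isInteraction"])

# Evaluate separately on each testing set
scores = {}
for name in ("test1", "test2"):
    part = toy[toy["trainTest"] == name]
    proba = model.predict_proba(part[feature_cols])[:, 1]
    scores[name] = roc_auc_score(part["isInteraction"], proba)
print(scores)
```

Reporting test1 and test2 separately keeps the two testing sets distinct, as the gold standard's split intends.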

How to cite this work

Lannelongue L., Inouye M., Construction of in silico protein-protein interaction networks across different topologies using machine learning, 2022, bioRxiv

Licence

This work is licensed under a Creative Commons Attribution 4.0 International License.

Credits

The code was written in Python 3.7.
Many libraries were used, in particular pandas, NumPy, scikit-learn and PyTorch Lightning (the full list is in the code and in the requirements file).
Plots were drawn using Matplotlib, Seaborn and the MetBrewer colour palettes.
Logs were saved using Weights & Biases.

Owner

Loïc Lannelongue