EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction

Last update: Jan 03, 2023

Overview

Paper on arXiv

EquiBind is an SE(3)-equivariant geometric deep learning model that performs direct-shot prediction of both (i) the receptor binding location (blind docking) and (ii) the ligand’s bound pose and orientation. EquiBind achieves significant speed-ups and better quality compared to traditional and recent baselines. If you have questions, don't hesitate to open an issue, contact me via [email protected] or social media, or contact Octavian Ganea via [email protected]. We are happy to hear from you!

Dataset

Our preprocessed data (see the dataset section in the paper's appendix) is available from Zenodo.
The files in data contain the names for the time-based data split.

If you want to train one of our models with this data:

Download it from Zenodo.
Unzip the directory and place it into data so that the path data/PDBBind exists.
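
The unpacking step above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the function name unpack_dataset is hypothetical, and it assumes the downloaded archive is a zip file whose top-level folder is named PDBBind.

```python
import zipfile
from pathlib import Path


def unpack_dataset(archive, data_dir="data"):
    """Extract `archive` into `data_dir` and return the PDBBind directory.

    Assumption (not confirmed by the source): the archive's top-level
    folder is named PDBBind, so extraction yields data_dir/PDBBind.
    """
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(data_dir)
    target = data_dir / "PDBBind"
    if not target.is_dir():
        # Fail early instead of letting training fail later on a missing path.
        raise FileNotFoundError(f"expected {target} after extraction")
    return target
```

Calling unpack_dataset with the path of the downloaded archive from the project directory should leave you with data/PDBBind, provided the assumption about the archive layout holds.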

Use provided model weights to predict binding structure of your own protein-ligand pairs:

Step 1: What you need as input

Ligand files in .mol2, .sdf, .pdbqt, or .pdb format.
Receptor files in .pdb format.
For each complex you want to predict, you need a directory containing the ligand file and the receptor file, like this:

my_data_folder
└───name1
    │   name1_protein.pdb
    │   name1_ligand.sdf
└───name2
    │   name2_protein.pdb
    │   name2_ligand.sdf
...
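
Before running inference it can help to check that every complex directory follows this layout. The sketch below is one way to do that. It is not part of the repository: check_input_folder is a hypothetical helper, and it assumes that the file roles are marked by "protein" and "ligand" in the file names, as in the example above.

```python
from pathlib import Path

# Ligand formats listed in Step 1.
LIGAND_EXTENSIONS = {".mol2", ".sdf", ".pdbqt", ".pdb"}


def check_input_folder(root):
    """Return {complex_name: problem} for subdirectories that look incomplete."""
    problems = {}
    for complex_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        files = [f for f in complex_dir.iterdir() if f.is_file()]
        # Assumption: "protein" / "ligand" in the file name marks the role,
        # following the naming used in the example layout.
        receptors = [f for f in files
                     if "protein" in f.name and f.suffix == ".pdb"]
        ligands = [f for f in files
                   if "ligand" in f.name and f.suffix in LIGAND_EXTENSIONS]
        if not receptors:
            problems[complex_dir.name] = "no receptor .pdb file"
        elif not ligands:
            problems[complex_dir.name] = "no ligand file in a supported format"
    return problems
```

An empty dictionary means every subdirectory contains at least one receptor file and one ligand file under these naming assumptions.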

Step 2: Setup Environment

We will set up the environment using Anaconda. Clone the current repo:

git clone https://github.com/HannesStark/EquiBind

Create a new environment with all required packages using environment.yml (this can take a while). While in the project directory run:

conda env create

Activate the environment:

conda activate equibind

Here are the requirements themselves if you want to install them manually instead of using the environment.yml:

python=3.7
pytorch 1.10
torchvision
cudatoolkit=10.2
torchaudio
dgl-cuda10.2
rdkit
openbabel
biopython
biopandas
pot
dgllife
joblib
pyaml
icecream
matplotlib
tensorboard
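
If you install the packages manually, a quick check for what is still missing might look like the sketch below. It is an illustration, not part of the repository. The import names in IMPORT_NAMES are assumptions about how the listed packages are imported (for example, biopython as Bio, pot as ot, and PyYAML, which pyaml depends on, as yaml); adjust them if your installation differs.

```python
from importlib.util import find_spec

# Assumed top-level import names for the packages listed above.
IMPORT_NAMES = [
    "torch", "torchvision", "torchaudio", "dgl", "rdkit", "openbabel",
    "Bio", "biopandas", "ot", "dgllife", "joblib", "yaml", "icecream",
    "matplotlib", "tensorboard",
]


def missing_modules(names=IMPORT_NAMES):
    """Return the names that cannot be found in the active environment."""
    # find_spec locates a top-level module without importing it.
    return [name for name in names if find_spec(name) is None]
```

Running print(missing_modules()) inside the activated environment lists the modules that still need to be installed. It does not check versions or CUDA support.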

Step 3: Predict Binding Structures!

In the config file configs_clean/inference.yml, set inference_path to your input data folder (inference_path: path_to/my_data_folder).
Then run:

python inference.py --config=configs_clean/inference.yml
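
If you run predictions for many folders, the config edit can also be scripted. The sketch below is a hypothetical helper, not part of the repository. It assumes inference_path appears exactly once as a top-level key on its own line, and it edits the text directly rather than round-tripping through a YAML parser so that the rest of the file stays untouched.

```python
import re


def set_yaml_scalar(text, key, value):
    """Replace the value of a top-level `key: value` line, keeping the rest intact.

    Note: anything after the key on that line, including a trailing
    comment, is replaced as well.
    """
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    # A function is used as the replacement so that backslashes in `value`
    # (for example in Windows paths) are not treated as regex escapes.
    new_text, count = pattern.subn(lambda m: f"{key}: {value}", text)
    if count != 1:
        raise KeyError(f"expected exactly one top-level '{key}:' line, found {count}")
    return new_text
```

To use it, read configs_clean/inference.yml as text, pass it through set_yaml_scalar(text, "inference_path", "path_to/my_data_folder"), and write the result back before calling inference.py.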

Done! 🎉
Your results are saved as .sdf files in the directory specified in the config file under output_directory: 'data/results/output' and as tensors at runs/flexible_self_docking/predictions_RDKitFalse.pt!
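
To take a quick look at a predicted pose without extra dependencies, the atom coordinates can be read straight from an output .sdf file. The sketch below is not part of the repository: read_sdf_atoms and centroid are hypothetical helpers, and the parser assumes the file holds V2000 molfile records (fixed-width atom lines) and reads only the first record.

```python
def read_sdf_atoms(text):
    """Return [(symbol, x, y, z), ...] for the first record of a V2000 SDF string."""
    lines = text.splitlines()
    counts = lines[3]  # three header lines, then the counts line
    if "V2000" not in counts:
        raise ValueError("only V2000 records are handled in this sketch")
    n_atoms = int(counts[0:3])  # atom count is in the first three columns
    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z are fixed-width fields of 10 characters each.
        x, y, z = (float(line[i:i + 10]) for i in (0, 10, 20))
        atoms.append((line[31:34].strip(), x, y, z))
    return atoms


def centroid(atoms):
    """Return the unweighted mean position of the atoms as (x, y, z)."""
    n = len(atoms)
    return tuple(sum(a[k] for a in atoms) / n for k in (1, 2, 3))
```

The centroid gives a rough idea of where on the receptor the ligand was placed. With the environment from Step 2 active, RDKit's Chem.SDMolSupplier is the more robust way to load these files.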

Reproducing paper numbers

Download the data and place it as described in the "Dataset" section above.

Using the provided model weights

To predict binding structures using the provided model weights run:

python inference.py --config=configs_clean/inference_file_for_reproduce.yml

This will give you the results of EquiBind-U and then those of EquiBind after running the fast ligand point cloud fitting corrections.
The numbers are a bit better than what is reported in the paper. We will put the improved numbers into the next update of the paper.

Training a model yourself and using those weights

To train the model yourself, run:

python train.py --config=configs_clean/RDKitCoords_flexible_self_docking.yml

The model weights are saved in the runs directory.
You can also start a TensorBoard server with tensorboard --logdir=runs and watch the model train.
To evaluate the model on the test set, change the run_dirs: entry of the config file inference_file_for_reproduce.yml to point to the directory produced in runs. Then you can run python inference.py --config=configs_clean/inference_file_for_reproduce.yml as above!

Reference

📃 Paper on arXiv

@misc{stark2022equibind,
      title={EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction}, 
      author={Hannes Stärk and Octavian-Eugen Ganea and Lagnajit Pattanaik and Regina Barzilay and Tommi Jaakkola},
      year={2022}
}

Owner

Hannes Stärk
