SynNet - synthetic tree generation using neural networks

Last update: Dec 29, 2022

Overview

SynNet

This repo contains the code and analysis scripts for our amortized approach to synthetic tree generation using neural networks. Our model can serve as both a synthesis planning tool and as a tool for synthesizable molecular design.

The method is described in detail in the publication "Amortized tree generation for bottom-up synthesis planning and synthesizable molecular design" [TODO add link to arXiv after publication] and summarized below.

Summary

Overview

We model synthetic pathways as tree structures called synthetic trees. A valid synthetic tree has one root node (the final product molecule) linked to purchasable building blocks (encoded as SMILES strings) via feasible reactions according to a list of discrete reaction templates (examples of templates encoded as SMARTS strings in data/rxn_set_hb.txt). At a high level, each synthetic tree is constructed one reaction step at a time in a bottom-up manner, starting from purchasable building blocks.
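As a rough illustration of this structure, the sketch below records a synthetic tree as a list of reaction steps over a set of building blocks. The class and field names (`ReactionNode`, `SyntheticTree`, `is_valid`) are hypothetical and are not taken from the SynNet code base; the SMILES strings and the template index are placeholders, and the validity check only verifies that every reactant is available when it is used.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class ReactionNode:
    """One reaction step: a template applied to one or two reactants."""
    template_id: int        # index into the list of reaction templates
    reactants: List[str]    # SMILES of the reactants
    product: str            # SMILES of the product


@dataclass
class SyntheticTree:
    """Bottom-up record of a synthetic pathway (illustrative only)."""
    building_blocks: Set[str]                      # purchasable SMILES
    reactions: List[ReactionNode] = field(default_factory=list)

    def add_reaction(self, rxn: ReactionNode) -> None:
        self.reactions.append(rxn)

    @property
    def root(self) -> Optional[str]:
        # In this sketch the product of the last step is treated as the root.
        return self.reactions[-1].product if self.reactions else None

    def is_valid(self) -> bool:
        # Each reactant must be a building block or an earlier product.
        available = set(self.building_blocks)
        for rxn in self.reactions:
            if not all(r in available for r in rxn.reactants):
                return False
            available.add(rxn.product)
        return bool(self.reactions)


# Placeholder example: acetic acid + methylamine -> N-methylacetamide.
tree = SyntheticTree(building_blocks={"CC(=O)O", "CN"})
tree.add_reaction(ReactionNode(template_id=0,
                               reactants=["CC(=O)O", "CN"],
                               product="CC(=O)NC"))
print(tree.root, tree.is_valid())
```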

The model consists of four modules, each containing a multi-layer perceptron (MLP):

An Action Type selection function that classifies action types among the four possible actions (“Add”, “Expand”, “Merge”, and “End”) in building the synthetic tree.
A First Reactant selection function that predicts an embedding for the first reactant. A candidate molecule is identified for the first reactant through a k-nearest neighbors (k-NN) search from the list of potential building blocks.
A Reaction selection function whose output is a probability distribution over available reaction templates, from which inapplicable reactions are masked (based on reactant 1) and a suitable template is then sampled using a greedy search.
A Second Reactant selection function that identifies the second reactant if the sampled template is bi-molecular. The model predicts an embedding for the second reactant, and a candidate is then sampled via a k-NN search from the masked set of building blocks.

These four modules predict the probability distributions of actions to be taken within a single reaction step, and determine the nodes to be added to the synthetic tree under construction. All of these networks are conditioned on the target molecule embedding.
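The masking, greedy template choice, and nearest-neighbor lookup described above can be sketched in plain Python. This is a simplified illustration rather than the SynNet implementation: the function names are hypothetical, the search uses k = 1, and cosine distance is an assumed metric.

```python
import math
from typing import Optional, Sequence


def masked_greedy_choice(probs: Sequence[float], mask: Sequence[bool]) -> int:
    """Return the index of the most probable entry whose mask is True."""
    allowed = [i for i, ok in enumerate(mask) if ok]
    if not allowed:
        raise ValueError("no applicable template for this reactant")
    return max(allowed, key=lambda i: probs[i])


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def nearest_building_block(query: Sequence[float],
                           embeddings: Sequence[Sequence[float]],
                           allowed: Optional[Sequence[int]] = None) -> int:
    """1-nearest-neighbor search, optionally restricted to allowed indices."""
    indices = range(len(embeddings)) if allowed is None else allowed
    return min(indices, key=lambda i: cosine_distance(query, embeddings[i]))


# Toy data: three templates, three building-block embeddings.
template_probs = [0.6, 0.3, 0.1]
template_mask = [False, True, True]   # template 0 is inapplicable
chosen_template = masked_greedy_choice(template_probs, template_mask)

block_embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
predicted = [0.9, 0.1]
first_reactant = nearest_building_block(predicted, block_embeddings)
# For the second reactant, search only the blocks the template accepts.
second_reactant = nearest_building_block(predicted, block_embeddings,
                                         allowed=[1, 2])
```

Restricting the search through `allowed` mirrors the masked set of building blocks used for the second reactant.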

Synthesis planning

This task is to infer the synthetic pathway to a given target molecule. We formulate this problem as generating a synthetic tree such that the product molecule it produces (i.e., the molecule at the root node) matches the desired target molecule.

For this task, we take a molecular embedding of the desired product and use it as input to our model to produce a synthetic tree. If the desired product is successfully recovered, the final root molecule will match the molecule used to create the input embedding. If it is not successfully recovered, the final root molecule may still be similar to the desired molecule, and thus our tool can also be used for synthesizable analog recommendation.
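One way to tell a recovered product from an analog is to compare the root molecule with the target. The sketch below assumes canonical SMILES strings for the exact-match test and fingerprints stored as sets of on-bit indices for the similarity score. Tanimoto similarity is an assumed choice of metric here, and `assess_result` is a hypothetical helper, not part of SynNet.

```python
from typing import Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def assess_result(target_smiles: str, root_smiles: str,
                  target_bits: Set[int], root_bits: Set[int]) -> Tuple[bool, float]:
    """Return (recovered, similarity) for a decoded synthetic tree.

    Assumes both SMILES strings were canonicalized in the same way, so
    that string equality means the molecules are identical.
    """
    recovered = target_smiles == root_smiles
    return recovered, tanimoto(target_bits, root_bits)


# Toy fingerprints: 2 shared bits out of 4 distinct bits in total.
result = assess_result("CCO", "CCN", {1, 2, 3}, {2, 3, 4})
print(result)
```

A tree whose root is not an exact match but scores a high similarity would be a candidate synthesizable analog.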

Synthesizable molecular design

This task is to optimize a molecular structure with respect to an oracle function (e.g. bioactivity), while ensuring the synthetic accessibility of the molecules. We formulate this problem as optimizing the structure of a synthetic tree with respect to the desired properties of the product molecule it produces.

To do this, we optimize the molecular embedding of the molecule using a genetic algorithm and the desired oracle function. The optimized molecule embedding can then be used as input to our model to produce a synthetic tree, where the final root molecule corresponds to the optimized molecule.
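The following is a minimal sketch of a genetic algorithm over bit-vector embeddings. It is not the logic of optimize_ga.py: uniform crossover, bit-flip mutation, elitist selection, and the toy oracle (which simply counts set bits) are all assumptions made for illustration. In the workflow described above, the score would instead come from the oracle function evaluated on the decoded product molecule.

```python
import random
from typing import Callable, List

BitVector = List[int]


def crossover(a: BitVector, b: BitVector, rng: random.Random) -> BitVector:
    """Uniform crossover: each bit is copied from one of the two parents."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def mutate(v: BitVector, rate: float, rng: random.Random) -> BitVector:
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in v]


def evolve(population: List[BitVector],
           oracle: Callable[[BitVector], float],
           num_offspring: int,
           num_gen: int,
           mutation_rate: float = 0.01,
           seed: int = 0) -> List[BitVector]:
    rng = random.Random(seed)
    size = len(population)
    for _generation in range(num_gen):
        offspring = []
        for _child in range(num_offspring):
            parent_a, parent_b = rng.sample(population, 2)
            child = crossover(parent_a, parent_b, rng)
            offspring.append(mutate(child, mutation_rate, rng))
        # Elitist selection: keep the best `size` of parents plus offspring.
        population = sorted(population + offspring, key=oracle, reverse=True)[:size]
    return population


def toy_oracle(v: BitVector) -> float:
    return float(sum(v))  # placeholder score, not a real property oracle


init_rng = random.Random(1)
initial = [[init_rng.randint(0, 1) for _ in range(16)] for _ in range(8)]
final = evolve(initial, toy_oracle, num_offspring=16, num_gen=10)
```

Because parents compete with their offspring for survival, the best score in the population cannot decrease from one generation to the next.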

Setup instructions

Setting up the environment

You can use conda to create an environment containing the necessary packages and dependencies for running synth_net by using the provided YAML file:

conda env create -f env/synthenv.yml

If you update the environment and would like to save the updated environment as a new YAML file using conda, use:

conda env export > path/to/env.yml

Unit tests

To check that everything has been set up correctly, you can run the unit tests from within the tests/ directory. If starting in the main SynNet/ directory, you can run the unit tests as follows:

export PYTHONPATH=`pwd`:$PYTHONPATH
cd tests/
python -m unittest

You should get no errors if everything ran correctly.

Code Structure

The code is structured as follows:

synth_net/
├── data
│   └── rxn_set_hb.txt
├── environment.yml
├── LICENSE
├── README.md
├── scripts
│   ├── compute_embedding_mp.py
│   ├── compute_embedding.py
│   ├── generation_fp.py
│   ├── generation.py
│   ├── gin_supervised_contextpred_pre_trained.pth
│   ├── _mp_decode.py
│   ├── _mp_predict_beam.py
│   ├── _mp_predict_multireactant.py
│   ├── _mp_predict.py
│   ├── _mp_search_similar.py
│   ├── _mp_sum.py
│   ├── mrr.py
│   ├── optimize_ga.py
│   ├── predict-beam-fullTree.py
│   ├── predict_beam_mp.py
│   ├── predict-beam-reactantOnly.py
│   ├── predict_mp.py
│   ├── predict_multireactant_mp.py
│   ├── predict.py
│   ├── read_st_data.py
│   ├── sample_from_original.py
│   ├── search_similar.py
│   ├── sketch-synthetic-trees.py
│   ├── st2steps.py
│   ├── st_split.py
│   └── temp.py
├── setup.py
├── synth_net
│   ├── data_generation
│   │   ├── check_all_template.py
│   │   ├── filter_unmatch.py
│   │   ├── __init__.py
│   │   ├── make_dataset_mp.py
│   │   ├── make_dataset.py
│   │   ├── _mp_make.py
│   │   ├── _mp_process.py
│   │   └── process_rxn_mp.py
│   ├── __init__.py
│   ├── models
│   │   ├── act.py
│   │   ├── mlp.py
│   │   ├── prepare_data.py
│   │   ├── rt1.py
│   │   ├── rt2.py
│   │   └── rxn.py
│   └── utils
│       ├── data_utils.py
│       ├── ga_utils.py
│       └── __init__.py
└── tests
    ├── create-unittest-data.py
    └── test_DataPreparation.py

The model implementations can be found in synth_net/models/, with processing and analysis scripts located in scripts/.

Instructions

Before running anything, you need to add the root directory to the Python path. One option for doing this is to run the following command in the root SynNet directory:

export PYTHONPATH=`pwd`:$PYTHONPATH

Using pre-trained models

We have made available a set of pre-trained models at the following link. The pretrained models correspond to the Action, Reactant 1, Reaction, and Reactant 2 networks, trained on the Hartenfeller-Button dataset using radius 2, length 4096 Morgan fingerprints for the molecular node embeddings, and length 256 fingerprints for the k-NN search. For further details, please see the publication.

The models can be uncompressed with:

tar -zxvf hb_fp_2_4096_256.tar.gz

Synthesis Planning

To perform synthesis planning as described in the main text, run the following. [TODO add checkpoints to prediction scripts // save trees periodically; otherwise the script only saves at the end, which is problematic if the job times out]

python predict_multireactant_mp.py -n -1 --ncpu 36 --data test

This script feeds a list of molecules from the test data to the model and saves the decoded results (predicted synthetic trees) to synth_net/results/. Use --help to see a description of each argument. Note: this script reads model parameters from a directory, so specify the path to the parameters before running it.

Synthesizable Molecular Design

To perform synthesizable molecular design, under synth_net/scripts/, run:

python optimize_ga.py -i path/to/zinc.csv --radius 2 --nbits 4096 --num_population 128 --num_offspring 512 --num_gen 200 --ncpu 32 --objective gsk

This script uses a genetic algorithm to optimize molecular embeddings and returns the predicted synthetic trees for the optimized molecular embeddings. Use --help to see a description of each argument. To restart from a checkpoint of a previous run, run:

python optimize_ga.py -i path/to/population.npy --radius 2 --nbits 4096 --num_population 128 --num_offspring 512 --num_gen 200 --ncpu 32 --objective gsk --restart

Note: the input file indicated by -i is a CSV file of seed molecules for an initial run, and a NumPy array of the population for a restarted run.
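To make the two input modes concrete, the sketch below mirrors the flags shown in the commands above with argparse. It is not the parser used by optimize_ga.py: the destination name `input_file`, the types, and the defaults (copied from the example command) are assumptions, and `input_kind` is a hypothetical helper.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser mirroring the flags from the example commands (illustrative)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the optimize_ga.py command-line flags")
    parser.add_argument("-i", dest="input_file", required=True,
                        help="seed CSV, or population .npy when restarting")
    parser.add_argument("--radius", type=int, default=2)
    parser.add_argument("--nbits", type=int, default=4096)
    parser.add_argument("--num_population", type=int, default=128)
    parser.add_argument("--num_offspring", type=int, default=512)
    parser.add_argument("--num_gen", type=int, default=200)
    parser.add_argument("--ncpu", type=int, default=32)
    parser.add_argument("--objective", default="gsk")
    parser.add_argument("--restart", action="store_true")
    return parser


def input_kind(args: argparse.Namespace) -> str:
    """Describe how the -i file is interpreted for this run."""
    if args.restart:
        return "population array (.npy)"
    return "seed molecules (.csv)"


fresh = build_parser().parse_args(["-i", "path/to/zinc.csv"])
resumed = build_parser().parse_args(["-i", "path/to/population.npy", "--restart"])
print(input_kind(fresh), "|", input_kind(resumed))
```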

Train the model from scratch

Before training any models, you will first need to preprocess the set of reaction templates which you would like to use. You can use either a new set of reaction templates, or the provided Hartenfeller-Button (HB) set of reaction templates (see data/rxn_set_hb.txt). To preprocess a new dataset, you will need to:

Preprocess the data to identify applicable reactants for each reaction template
Generate the synthetic trees by random selection
Split the synthetic trees into training, testing, and validation splits
Featurize the nodes in the synthetic trees using molecular fingerprints
Prepare the training data for each of the four networks

Once you have preprocessed a training set, you can begin to train a model by training each of the four networks separately (the Action, First Reactant, Reaction, and Second Reactant networks).

After training a new model, you can then use the trained model to make predictions and construct synthetic trees for a given list of molecules.

You can also perform molecular optimization using a genetic algorithm.

Instructions for all of the aforementioned steps are described in detail below.

In addition to the aforementioned types of jobs, we also provide below instructions for (1) sketching synthetic trees and (2) calculating the mean reciprocal rank of reactant 1.

Processing the data: reaction templates and applicable reactants

Given a set of reaction templates and a list of buyable building blocks, we first need to assign applicable reactants for each template. Under synth_net/synth_net/data_generation/, run:

python process_rxn_mp.py

This will save the reaction templates and their corresponding building blocks in a JSON file. Then, run:

python filter_unmatch.py

This will filter out buyable building blocks that do not match any template.
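The filtering step can be pictured as follows. This is a sketch under assumed data shapes, not the code in filter_unmatch.py: the mapping from template name to matching reactants, the template names, and the function `filter_unmatched` are all hypothetical.

```python
from typing import Dict, List


def filter_unmatched(building_blocks: List[str],
                     template_matches: Dict[str, List[str]]) -> List[str]:
    """Keep only building blocks that appear as a reactant for some template."""
    matched = set()
    for reactants in template_matches.values():
        matched.update(reactants)
    # Preserve the original order of the building-block list.
    return [bb for bb in building_blocks if bb in matched]


blocks = ["CCO", "CC(=O)O", "c1ccccc1"]
matches = {
    "esterification": ["CCO", "CC(=O)O"],   # hypothetical template names
    "amide_coupling": ["CC(=O)O"],
}
kept = filter_unmatched(blocks, matches)
print(kept)
```

Here benzene (`c1ccccc1`) is dropped because no template lists it as an applicable reactant.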

Generating the synthetic path data by random selection

Under synth_net/synth_net/data_generation/, run:

python make_dataset_mp.py

This will generate synthetic path data saved in a JSON file. Then, to make the dataset more pharmaceutically relevant, we can change to synth_net/scripts/ and run:

python sample_from_original.py

This will filter out the samples where the root node QED is less than 0.5, or randomly with a probability less than 1 - QED/0.5.
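One reading of this rule is sketched below: trees whose root QED is at least 0.5 are always kept, and trees below that threshold are kept with probability QED/0.5 (that is, dropped with probability 1 - QED/0.5). This interpretation is an assumption and may differ from what sample_from_original.py actually does; `keep_tree` is a hypothetical helper and the QED values are supplied directly rather than computed.

```python
import random


def keep_tree(qed: float, rng: random.Random, threshold: float = 0.5) -> bool:
    """Decide whether to keep a tree given the QED of its root molecule."""
    if qed >= threshold:
        return True
    # Below the threshold: keep with probability qed / threshold.
    return rng.random() < qed / threshold


rng = random.Random(0)
# With QED = 0.25, about half of the trees should survive.
kept_fraction = sum(keep_tree(0.25, rng) for _ in range(10_000)) / 10_000
print(f"kept fraction at QED 0.25: {kept_fraction:.2f}")
```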

Splitting data into training, validation, and testing sets, and removing duplicates

Under synth_net/scripts/, run:

python st_split.py

The default split ratio is 6:2:2 for training, validation, and testing sets.
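A 6:2:2 split with duplicate removal might look like the sketch below. How st_split.py identifies duplicates and whether it shuffles are not stated here, so both are assumptions: items are compared as plain strings and shuffled with a fixed seed. `split_dataset` is a hypothetical name.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(items: Sequence[str],
                  ratios: Tuple[int, int, int] = (6, 2, 2),
                  seed: int = 0) -> Tuple[List[str], List[str], List[str]]:
    """Drop duplicates, shuffle, and split into train/validation/test."""
    unique = list(dict.fromkeys(items))   # keeps the first occurrence of each item
    random.Random(seed).shuffle(unique)
    total = sum(ratios)
    n_train = len(unique) * ratios[0] // total
    n_valid = len(unique) * ratios[1] // total
    train = unique[:n_train]
    valid = unique[n_train:n_train + n_valid]
    test = unique[n_train + n_valid:]     # remainder goes to the test set
    return train, valid, test


items = [f"tree{i}" for i in range(10)] + ["tree0", "tree1"]  # two duplicates
train, valid, test = split_dataset(items)
print(len(train), len(valid), len(test))
```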

Featurizing data

Under synth_net/scripts/, run:

python st2steps.py -r 2 -b 4096 -d train

This will featurize the synthetic tree data into step-by-step data which can be used for training. The flag -r indicates the fingerprint radius, -b indicates the number of bits to use for the fingerprints, and -d indicates which dataset split to featurize.

Preparing training data for each network

Under synth_net/synth_net/models/, run:

python prepare_data.py --radius 2 --nbits 4096

This will prepare the training data for the networks.

Each of the four network files in this directory (act.py, rt1.py, rxn.py, and rt2.py) is a training script and can be used as follows (using the action network as an example):

python act.py --radius 2 --nbits 4096

This will train the network and save the model parameters at the state with the best validation loss in a logging directory, e.g., act_hb_fp_2_4096_logs. One can use tensorboard to monitor the training and validation loss.

Sketching synthetic trees

To visualize the synthetic trees, run:

python scripts/sketch-synthetic-trees.py --file /pool001/whgao/data/synth_net/st_hb/st_train.json.gz --saveto ./ --nsketches 5 --actions 3

This will sketch 5 synthetic trees with 3 or more actions to the current ("./") directory. You can adjust these values or omit them to use the defaults.

Testing the mean reciprocal rank (MRR) of reactant 1

Under synth_net/scripts/, run:

python mrr.py --distance cosine
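For reference, the metric itself can be sketched in a few lines. This is not the code in mrr.py: the function names are hypothetical, ties are resolved in favor of the true reactant, and the embeddings are toy values. For each predicted embedding, the true first reactant is ranked among all candidate building blocks by cosine distance, and the reciprocal ranks are averaged.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def reciprocal_rank(predicted: Sequence[float],
                    candidates: Sequence[Sequence[float]],
                    true_index: int) -> float:
    distances = [cosine_distance(predicted, c) for c in candidates]
    # Rank = 1 + number of candidates strictly closer than the true one.
    rank = 1 + sum(d < distances[true_index] for d in distances)
    return 1.0 / rank


def mean_reciprocal_rank(predictions: Sequence[Sequence[float]],
                         candidates: Sequence[Sequence[float]],
                         true_indices: Sequence[int]) -> float:
    ranks = [reciprocal_rank(p, candidates, t)
             for p, t in zip(predictions, true_indices)]
    return sum(ranks) / len(ranks)


candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
predictions = [[0.9, 0.1], [0.9, 0.1]]
true_indices = [0, 2]   # the second prediction ranks its true reactant second
mrr = mean_reciprocal_rank(predictions, candidates, true_indices)
print(mrr)
```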

Comments

Unit Test Files
First off great work!

The unit tests reference files that are ignored in .gitignore

'./data/states_0_train.npz' './data/st_hb_test.json.gz' './data/building_blocks_matched.csv.gz'

can we add these to the repo so the unit tests can be run?
opened by lilleswing 4

Running optimize_ga.py

I'm trying to test everything is working in my setup by running

python optimize_ga.py --radius 2 --nbits 4096 --num_population 128 --num_offspring 512 --num_gen 200 --ncpu 48

It seems to run forever with the following output

Using backend: pytorch
Downloading gin_supervised_contextpred_pre_trained.pth from https://data.dgl.ai/dgllife/pre_trained/gin_supervised_contextpred.pth...
Pretrained model loaded
Downloading gin_supervised_contextpred_pre_trained.pth from https://data.dgl.ai/dgllife/pre_trained/gin_supervised_contextpred.pth...
Pretrained model loaded
Starting with 128 fps with 4096 bits
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
...
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
mat1 and mat2 shapes cannot be multiplied (1x12292 and 12288x1200)
Initial: 0.000 +/- 0.000
Scores: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0.]
Top-3 Smiles: [None, None, None]

How long should this run and is this output normal?

opened by tkram01 3

add docs on `compute_embedding.py` needed for inference

hello (again),

sorry that I am raising multiple issues. just want to make it easier for everyone else to start using this awesome work.

i didn't find a note about how one could compute molecular fingerprints / GNN embeddings for a dataset. only after some CTRL+F, i found that scripts/compute_embedding.py does it. https://github.com/wenhao-gao/SynNet/blob/master/scripts/compute_embedding.py

so, it would be a good idea to add this to the README. I believe we need to do this step before running any inference.

opened by linminhtoo 2

No mol_fp module in _mp_decode.py

I ran optimize_ga.py for my molecule optimization, but I got an error because there is no mol_fp attribute in _mp_decode.py.

Traceback (most recent call last):
  File "/home/sejeong/codes/SynNet/scripts/optimize_ga.py", line 207, in <module>
    [decode.mol_fp(smi, args.radius, args.nbits) for smi in starting_smiles]
  File "/home/sejeong/codes/SynNet/scripts/optimize_ga.py", line 207, in <listcomp>
    [decode.mol_fp(smi, args.radius, args.nbits) for smi in starting_smiles]
AttributeError: module 'scripts._mp_decode' has no attribute 'mol_fp'

So, I changed the call to use the mol_fp function from predict_utils.py.

from syn_net.utils.predict_utils import mol_fp

            population = np.array(
                [mol_fp(smi, args.radius, args.nbits) for smi in starting_smiles]
            )

Then, I got the error like below.

Traceback (most recent call last):
  File "/home/sejeong/codes/SynNet/scripts/optimize_ga.py", line 210, in <module>
    population = population.reshape((population.shape[0], population.shape[2]))
IndexError: tuple index out of range

Can you help me with this error?

opened by SejeongPark8354 2

Errors in creating env and running unit tests.
Hi,

I'd like to use SynNet in my work. I have followed the instructions in the README to setup my environment.

In the environment.yml file the name is rdkit, not synthenv. As a result, source activate synthenv as instructed in the README does not work. You may want to take a look at this.

When I ran the unit tests, I got a few errors. I think they originate from incorrect path specifications. One of the errors I got: FileNotFoundError: [Errno 2] No such file or directory: '/pool001/whgao/data/synth_net/st_hb/enamine_us_emb_gin.npy' I noticed that there are multiple paths like this, which might make it difficult to use in future computations without having to change each and every one of them.

Will you be able to help me with these? Thanks!
opened by geemi725 2
ZINC csv used by publication
hello wenhao & rocio,

I see that we have to provide path/to/zinc.csv to run the genetic algorithm (to replicate how it was done in the paper) https://github.com/wenhao-gao/SynNet#synthesizable-molecular-design-1

optimize_ga.py -i path/to/zinc.csv --radius 2 --nbits 4096 --num_population 128 --num_offspring 512 --num_gen 200 --ncpu 32 --objective gsk

is it possible to provide the exact zinc.csv that was used in the publication?

Seeds are randomly sampled from the ZINC database (Sterling & Irwin, 2015)
opened by linminhtoo 1
hardcoded paths in training `validation_step`
hello wenhao & rocio,

the unittests are great and give a great overview of how different modules should be run. however, I saw that in these lines, the path to the building block embeddings is hardcoded to the path on the HPC cluster. https://github.com/wenhao-gao/SynNet/blob/56917a668c1a6b633964e02eb53b717be0d1dd64/syn_net/models/mlp.py#L78-L89

so, I am unable to make pytest pass, specifically:

FAILED tests/test_Training.py::TestTraining::test_reactant1_network - UnboundLocalError: local variable 'kdtree' referenced before assignment
FAILED tests/test_Training.py::TestTraining::test_reactant2_network - UnboundLocalError: local variable 'kdtree' referenced before assignment

at least for the unittest, what should the correct path be? and would it be possible to make these paths user-passable arguments?

https://github.com/wenhao-gao/SynNet/blob/56917a668c1a6b633964e02eb53b717be0d1dd64/scripts/predict_multireactant_mp.py#L29 there's a similar hardcoding in this line, so I suppose we'll have to generate the .json.gz ourselves
opened by linminhtoo 1
predict_multireactant_mp.py error

I ran the code with my own data (more than 2000 SMILES), and the message below was printed. Can you tell me why the error occurs? I can't tell which list object causes it.

list index out of range

opened by SejeongPark8354 1

Error computing embeddings

When running the compute_embedding.py I get this error.

Using backend: pytorch
Downloading gin_supervised_contextpred_pre_trained.pth from https://data.dgl.ai/dgllife/pre_trained/gin_supervised_contextpred.pth...
Pretrained model loaded
Total data:  172988
  0%|                                                                                                                                                                                                            | 0/172988 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/home/ec2-user/SynNet/scripts/compute_embedding.py", line 143, in <module>
    embeddings.append(model(smi))
  File "/home/ec2-user/miniconda3/envs/rdkit/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
TypeError: forward() missing 2 required positional arguments: 'categorical_node_feats' and 'categorical_edge_feats'

When trying to run the compute_embedding_mp.py I get the following error

Using backend: pytorch
Downloading gin_supervised_contextpred_pre_trained.pth from https://data.dgl.ai/dgllife/pre_trained/gin_supervised_contextpred.pth...
Pretrained model loaded
Total data:  172988
Traceback (most recent call last):
  File "/home/ec2-user/SynNet/scripts/compute_embedding_mp.py", line 29, in <module>
    embeddings = pool.map(gin_embedding, data)
NameError: name 'gin_embedding' is not defined

I think this can be resolved by changing gin_embedding to model but that then results in the above error.

opened by tkram01 1

WIP: Syntree visualisation with graphviz
Visualize syntree with graphviz instead of mermaid.js.

Works by plotting all chemicals as pngs and then uses graphviz to create a single image of the entire syntree.

Reason for change:

Found it too clunky to render graphviz correctly in VS Code with mermaid preview extension and export them.

To do:

[ ] update readme

[ ] write (some) tests

[ ] figure out how to plot in higher resolution, ideally svg

[ ] add target node to plot (if present/available as smiles)

[ ] add wrapper or script to plot multiple syntrees with sane defaults

[ ] add direct display in jupyter notebooks

Inspiration from: https://github.com/MolecularAI/aizynthfinder/blob/9e44989213c11f1bb647a00b8756e0c76a8f4b52/aizynthfinder/utils/image.py
opened by chrulm 0
Refactor SynNet
This PR concludes some refactoring of SynNet.

New:

performance improvements

refactored scripts/modules (wip)

Breaking:

unittests (somewhat replaced by the INSTRUCTIONS.md file)

Graph neural net embeddings (not supported as of now)

Removed:

all code related to beam search
opened by chrulm 0
index 192158 is out of bounds for axis 0 with size 179821

Dear authors, we cannot run the 20-predict-targets.py file, although all of the previous files ran correctly.

Can you tell me what I should do to solve this? Thank you a lot!!

opened by JackAILab 0

Releases(v2.0.0)

v2.0.0(Oct 12, 2022)
Significant changes from the previous SynNet version.

New:

Performance improvements

Refactored scripts/modules (wip)

Improved documentation and instructions

Fixed all opened issues and bugs (predict_multireactant_mp.py error, hard-coded paths)

Removed:

All code related to beam search

Owner

Wenhao Gao

I'm currently a PhD student in ChemE at MIT. I'm interested in developing systematical molecular design and synthesis protocols.
