Implementation and replication of ProGen, Language Modeling for Protein Generation, in Jax

Last update: Dec 01, 2022

You might also like...

Implementation of the GVP-Transformer, which was used in the paper "Learning inverse folding from millions of predicted structures" for de novo protein design alongside Alphafold2

GVP Transformer (wip) Implementation of the GVP-Transformer, which was used in the paper Learning inverse folding from millions of predicted structure

19 May 6, 2022

A pytorch-version implementation codes of paper: "BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation"

BSN++: Complementary Boundary Regressor with Scale-Balanced Relation Modeling for Temporal Action Proposal Generation A pytorch-version implementation

11 Oct 8, 2022

🤗 Transformers: State-of-the-art Natural Language Processing for Pytorch, TensorFlow, and JAX.

State-of-the-art Natural Language Processing for Jax, PyTorch and TensorFlow 🤗 Transformers provides thousands of pretrained mo

77.2k Jan 2, 2023

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training

Predicting lncRNA–protein interactions based on graph autoencoders and collaborative training Code for our paper "Predicting lncRNA–protein interactio

1 Nov 29, 2022

Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction".

GNN_PPI Codes and models for the paper "Learning Unknown from Correlations: Graph Neural Network for Inter-novel-protein Interaction Prediction". Lear

2 Dec 14, 2022

RITA is a family of autoregressive protein models, developed by LightOn in collaboration with the OATML group at Oxford and the Debora Marks Lab at Harvard.

RITA: a Study on Scaling Up Generative Protein Sequence Models RITA is a family of autoregressive protein models, developed by a collaboration of Ligh

69 Dec 22, 2022

Generative Models for Graph-Based Protein Design

Graph-Based Protein Design This repo contains code for Generative Models for Graph-Based Protein Design by John Ingraham, Vikas Garg, Regina Barzilay

159 Dec 15, 2022

7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle

kaggle-hpa-2021-7th-place-solution Code for 7th place solution of Human Protein Atlas - Single Cell Classification on Kaggle. A description of the met

8 Jul 9, 2021

Graph-based community clustering approach to extract protein domains from a predicted aligned error matrix

Using a predicted aligned error matrix corresponding to an AlphaFold2 model, returns a series of lists of residue indices, where each list corresponds to a set of residues clustering together into a pseudo-rigid domain.

24 Nov 23, 2022

Comments

protein bert uniref90 dataset
(discussed in discord)

After running the first step (create_uniref_db) of https://github.com/nadavbra/protein_bert, I got a 24GB file, "uniref_proteins_and_annotations.db". It seems it could be useful for generating sequences for this project, so I am sharing the links here:

https://gitlab.com/rom1504/uniref data

A Colab to get the db and run a few queries: https://colab.research.google.com/drive/1BGYEBDmD0yToLNou2T-t-QbJV5wCtIBz#scrollTo=21U3PpCp-pxr

There are 135301051 records in the db, in a table that looks like this:

CREATE TABLE "protein_annotations" (
  "index" INTEGER,
  "tax_id" REAL,
  "uniprot_name" TEXT,
  "go_annotations" TEXT,
  "flat_go_annotations" TEXT,
  "n_go_annotations" INTEGER,
  "complete_go_annotation_indices" TEXT,
  "n_complete_go_annotations" INTEGER
);

A sample of the records looks like this:

| | index | tax_id | uniprot_name | go_annotations | flat_go_annotations | n_go_annotations | complete_go_annotation_indices | n_complete_go_annotations |
|---:|---:|---:|:---|:---|:---|---:|:---|---:|
| 0 | 0 | 1.57204e+06 | A0A5A9P0L4_9TELE | {"GO Molecular Function": ["GO:0003755", "GO:0005524", "GO:0004672", "GO:0005509"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0003755", "GO:0004672", "GO:0005509", "GO:0005524"] | 4 | [2761, 3561, 4193, 4205] | 4 |
| 1 | 1 | 648755 | UPI0016133188 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 2 | 2 | 1.93059e+06 | A0A410P257_9BACT | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 3 | 3 | 519421 | UPI0019403D63 | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 4 | 4 | 72004 | A0A6B0RPA5_9CETA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": []} | ["GO:0004672", "GO:0005524"] | 2 | [3561, 4205] | 2 |
| 5 | 5 | 375764 | A0A672ZWI7_9TELE | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 6 | 6 | 1.41558e+06 | A0A6P7YNV3_9AMPH | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 |
| 7 | 7 | 240159 | A0A4U5TZD8_COLLU | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0016021", "GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886", "GO:0016021"] | 4 | [3561, 4205, 4526, 10019] | 4 |
| 8 | 8 | 146911 | UPI00074FFD9C | {"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []} | [] | 0 | [] | 0 |
| 9 | 9 | 260995 | A0A6P8RG40_GEOSA | {"GO Molecular Function": ["GO:0005524", "GO:0004672"], "GO Biological Process": [], "GO Cellular Component": ["GO:0005886"]} | ["GO:0004672", "GO:0005524", "GO:0005886"] | 3 | [3561, 4205, 4526] | 3 |
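
For anyone who wants to try queries against this table without the Colab, here is a minimal sketch using Python's standard `sqlite3` and `json` modules. It assumes the schema quoted above and, instead of the real 24GB file, builds a small in-memory stand-in from two of the sample rows. The helper names `open_demo_db` and `annotated_proteins` are hypothetical and are not part of protein_bert or this project.

```python
import json
import sqlite3

# Schema copied from the comment above. The real data lives in the 24GB
# "uniref_proteins_and_annotations.db" file; an in-memory stand-in is used here.
SCHEMA = """
CREATE TABLE "protein_annotations" (
    "index" INTEGER,
    "tax_id" REAL,
    "uniprot_name" TEXT,
    "go_annotations" TEXT,
    "flat_go_annotations" TEXT,
    "n_go_annotations" INTEGER,
    "complete_go_annotation_indices" TEXT,
    "n_complete_go_annotations" INTEGER
)
"""

# Two rows taken from the sample table above (index 1 and index 4).
SAMPLE_ROWS = [
    (1, 648755.0, "UPI0016133188",
     '{"GO Molecular Function": [], "GO Biological Process": [], "GO Cellular Component": []}',
     "[]", 0, "[]", 0),
    (4, 72004.0, "A0A6B0RPA5_9CETA",
     '{"GO Molecular Function": ["GO:0005524", "GO:0004672"], '
     '"GO Biological Process": [], "GO Cellular Component": []}',
     '["GO:0004672", "GO:0005524"]', 2, "[3561, 4205]", 2),
]


def open_demo_db():
    """Hypothetical helper: create the table in memory and load the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO protein_annotations VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        SAMPLE_ROWS,
    )
    return conn


def annotated_proteins(conn, min_annotations=1):
    """Hypothetical helper: list (uniprot_name, tax_id, GO terms) for records
    that have at least `min_annotations` GO annotations."""
    cursor = conn.execute(
        "SELECT uniprot_name, tax_id, flat_go_annotations "
        "FROM protein_annotations "
        'WHERE n_go_annotations >= ? ORDER BY "index"',
        (min_annotations,),
    )
    results = []
    for name, tax_id, flat in cursor:
        # tax_id is stored as REAL, so cast it back to an integer when present.
        tax = int(tax_id) if tax_id is not None else None
        # The annotation columns hold JSON-encoded text.
        results.append((name, tax, json.loads(flat)))
    return results


if __name__ == "__main__":
    for row in annotated_proteins(open_demo_db()):
        print(row)
```

To run the same query against the downloaded file, replacing `":memory:"` with the path to the `.db` file (and skipping the insert step) should be enough, assuming the file matches the schema above.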
opened by rom1504 4

Releases(0.0.36)

0.0.36 (Aug 16, 2021)
0.0.35 (Aug 9, 2021)
0.0.34 (Jul 7, 2021)
0.0.33 (Jul 6, 2021)
0.0.32 (Jul 6, 2021)
0.0.29 (Jul 4, 2021)
0.0.28a (Jul 4, 2021)
0.0.27 (Jul 4, 2021)
0.0.26 (Jul 4, 2021)
0.0.25 (Jul 4, 2021)
0.0.24 (Jul 4, 2021)
0.0.23 (Jul 4, 2021)
0.0.21 (Jul 3, 2021)
0.0.20 (Jul 3, 2021)
0.0.19 (Jul 3, 2021)
0.0.18 (Jul 2, 2021)
0.0.17 (Jul 2, 2021)
0.0.16 (Jul 2, 2021)
0.0.14 (Jul 2, 2021)
0.0.12 (Jul 2, 2021)
0.0.11 (Jul 1, 2021)
0.0.10 (Jul 1, 2021)
0.0.9a (Jun 30, 2021)
0.0.8 (Jun 30, 2021)
0.0.7 (Jun 29, 2021)
0.0.6 (Jun 29, 2021)
0.0.5a (Jun 28, 2021)
0.0.5 (Jun 25, 2021)
0.0.3a (Jun 25, 2021)
0.0.2a (Jun 25, 2021)

Owner

Phil Wang

Working with Attention

GitHub Repository

This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction

H3DS Dataset This repository contains the code for using the H3DS dataset introduced in H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction Access

72 Dec 10, 2022

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral)

Not All Points Are Equal: Learning Highly Efficient Point-based Detectors for 3D LiDAR Point Clouds (CVPR 2022, Oral) This is the official implementat

259 Dec 25, 2022

This is an early in-development version of training CLIP models with hivemind.

A transformer that does not hog your GPU memory This is an early in-development codebase: if you want a stable and documented hivemind codebase, look

4 Nov 06, 2022

Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21'

Argument Extraction by Generation Code for paper "Document-Level Argument Extraction by Conditional Generation". NAACL 21' Dependencies pytorch=1.6 tr

87 Dec 26, 2022

ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels

ROCKET + MINIROCKET ROCKET: Exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge D

298 Dec 26, 2022

PantheonRL is a package for training and testing multi-agent reinforcement learning environments.

PantheonRL is a package for training and testing multi-agent reinforcement learning environments. PantheonRL supports cross-play, fine-tuning, ad-hoc coordination, and more.

57 Dec 28, 2022

UFPR-ADMR-v2 Dataset

UFPR-ADMR-v2 Dataset The UFPR-ADMRv2 dataset contains 5,000 dial meter images obtained on-site by employees of the Energy Company of Paraná (Copel), w

8 Sep 29, 2022

Voxel Transformer for 3D object detection

Voxel Transformer This is a reproduced repo of Voxel Transformer for 3D object detection. The code is mainly based on OpenPCDet. Introduction We provi

173 Dec 25, 2022

Make differentially private training of transformers easy for everyone

private-transformers This codebase facilitates fast experimentation of differentially private training of Hugging Face transformers. What is this? Why

73 Dec 28, 2022

Original Implementation of Prompt Tuning from Lester, et al, 2021

Prompt Tuning This is the code to reproduce the experiments from the EMNLP 2021 paper "The Power of Scale for Parameter-Efficient Prompt Tuning" (Lest

282 Dec 28, 2022

Train Yolov4 using NBX-Jobs

yolov4-trainer-nbox Train Yolov4 using NBX-Jobs. Use the powerful functionality available in nbox-SDK repo to train a tiny-Yolo v4 model on Pascal VO

1 Jan 12, 2022

Code for CoMatch: Semi-supervised Learning with Contrastive Graph Regularization

CoMatch: Semi-supervised Learning with Contrastive Graph Regularization (Salesforce Research) This is a PyTorch implementation of the CoMatch paper [B

107 Dec 14, 2022

This is the official github repository of the Met dataset

The Met dataset This is the official github repository of the Met dataset. The official webpage of the dataset can be found here. What is it? This cod

35 Dec 17, 2022

Transformer part of 12th place solution in Riiid! Answer Correctness Prediction

kaggle_riiid Transformer part of 12th place solution in Riiid! Answer Correctness Prediction. Please see here for more information. Execution You need

2 Apr 23, 2022

[CoRL 21'] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo

TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo Lukas Koestler1* Nan Yang1,2*,† Niclas Zeller2,3 Daniel Cremers1

744 Jan 04, 2023

Extremely simple and fast extreme multi-class and multi-label classifiers.

napkinXC napkinXC is an extremely simple and fast library for extreme multi-class and multi-label classification, that focus of implementing various m

43 Nov 14, 2022

Sign Language Transformers (CVPR'20)

Sign Language Transformers (CVPR'20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Sign Language Trans

164 Dec 30, 2022

Tensorflow implementation for "Improved Transformer for High-Resolution GANs" (NeurIPS 2021).

HiT-GAN Official TensorFlow Implementation HiT-GAN presents a Transformer-based generator that is trained based on Generative Adversarial Networks (GA

78 Oct 31, 2022

Use unsupervised and supervised learning to predict stocks

AIAlpha: Multilayer neural network architecture for stock return prediction This project is meant to be an advanced implementation of stacked neural n

1.5k Jan 06, 2023

Fashion Entity Classification

Fashion-Entity-Classification - Fashion-MNIST is a dataset of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grays

1 Jan 04, 2022
