Enzyme Activity Prediction of Sequence Variants on Novel Substrates using Improved Substrate Encodings and Convolutional Pooling

LMSE/Compound_Protein_Interac_Pred

Substrate Enzyme Interaction Prediction

This repository contains code for reproducing the work in the paper "Enzyme Activity Prediction of Sequence Variants on Novel Substrates using Improved Substrate Encodings and Convolutional Pooling" (https://proceedings.mlr.press/v165/xu22a.html). The paper proposes a new compound-protein interaction prediction pipeline and tests its performance on datasets obtained from "Machine learning modeling of family wide enzyme-substrate specificity screens" (arXiv:2109.03900v1, by S. Goldman and C. W. Coley). The pipeline is based on sequence embeddings generated by protein language models and count encodings of molecular fingerprints.
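To illustrate what "count encoding" means compared with the more common binary fingerprint, here is a minimal toy sketch. It is not the repository's code: the feature strings, the `fold_index` helper, and the vector length are all hypothetical, and a real pipeline would obtain substructure identifiers from a cheminformatics toolkit rather than from hand-written strings.

```python
import zlib

def fold_index(feature: str, n_bits: int) -> int:
    """Map a substructure identifier to a position in a fixed-length vector.

    Hypothetical helper: uses CRC32 so the mapping is deterministic across runs.
    """
    return zlib.crc32(feature.encode("utf-8")) % n_bits

def binary_encoding(features, n_bits=16):
    """Binary fingerprint: a position is 1 if any feature maps to it."""
    vec = [0] * n_bits
    for f in features:
        vec[fold_index(f, n_bits)] = 1
    return vec

def count_encoding(features, n_bits=16):
    """Count fingerprint: a position stores how many features map to it."""
    vec = [0] * n_bits
    for f in features:
        vec[fold_index(f, n_bits)] += 1
    return vec

# Hypothetical substructure identifiers for one molecule; "C-C" occurs three times.
features = ["C-C", "C-C", "C-C", "C-O", "C=O"]

binary_vec = binary_encoding(features)
count_vec = count_encoding(features)

print("binary:", binary_vec)
print("count: ", count_vec)
```

The binary vector only records that a substructure is present, while the count vector also keeps how often it occurs, so two substrates that differ only in the number of repeated groups remain distinguishable.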

The figure below shows the architecture of the prediction model.

We observed substantial improvements with the new pipeline when testing its predictions on multiple enzyme-substrate-activity datasets (e.g. aminotransferase, kinase, halogenase, phosphatase), as shown in the table below.
