GP_qsar

Wrapper around scikit-learn and GPyTorch Gaussian process models that enables their use as a scoring function in REINVENT and in active learning. Use in REINVENT4 is identical to that of serialised QSARtuna models.

Features:

Automates feature selection, hyperparameter tuning and kernel selection
Produces predictions from SMILES
Evaluates single-sample acquisition functions for use in active learning
Evaluates batch-selection acquisition functions for use in active learning

Example .toml config files for using these models in REINVENT4 are provided in /example_config.

Installation

Clone this repository

git clone https://github.com/cmwoodley/GP_qsar.git

Install GP_qsar

pip install .

Example Usage

from gp_qsar import GP_qsar
import numpy as np

# Toy dataset for simple example

smiles = np.array([
    "CCO", "C1CCCCC1", "O=C=O", "CC(C)C",
    "C1=CC=CC=C1", "CCN(CC)CC", "C1=CC(=O)NC(=O)N1", "CC(C)O",
    "C#N", "C=O", "O=C(O)C", "CC(C)CC",
    "NCCO", "CC(=O)O", "C1CC1", "O=S(=O)(O)O",
    "CNC", "C=CC", "CCOCC", "CCOC"
])

test_smiles = [
    "C1CCOC1",  # Tetrahydrofuran (THF)
    "N#CCN",    # Aminoacetonitrile
    "CC(C)CO",  # Isobutanol
    "C1=CC(=O)OC=C1",  # 2H-Pyran-2-one (2-pyrone)
    "C=C",      # Ethene (Ethylene)
]

y = [
    3.14, 2.718, 1.618, 0.577,
    6.022, 9.81, 1.414, 2.302,
    0.693, 4.669, 0.007, 299792.458,
    1.732, 42.0, 0.001, 8.314,
    1.96, 0.333, 0.618, 1.12
]

# Initialise model
model = GP_qsar(smiles, y)

# Generate predictions 
predictions = model.predict_from_smiles(test_smiles)
predictions_std = model.predict_from_smiles(test_smiles, uncert=True) # Generate with uncertainty

# Evaluate acquisition function
UCB = model.evaluate_acquisition_functions(test_smiles, "UCB")
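
# The "UCB" name above presumably refers to the upper confidence bound.
# The lines below are a minimal standalone sketch of how such a score is
# commonly computed from a predictive mean and standard deviation. It is
# not taken from the package: the helper name, the kappa parameter and the
# numbers are hypothetical, and GP_qsar's own formula or defaults may differ.

```python
def upper_confidence_bound(means, stds, kappa=2.0):
    """Score each candidate as mean + kappa * std (hypothetical helper)."""
    if len(means) != len(stds):
        raise ValueError("means and stds must have the same length")
    return [m + kappa * s for m, s in zip(means, stds)]

# Made-up predictive means and standard deviations for three candidates
example_means = [1.0, 2.0, 1.5]
example_stds = [0.5, 0.1, 1.0]

example_scores = upper_confidence_bound(example_means, example_stds, kappa=2.0)

# Index of the candidate with the highest score
best_index = max(range(len(example_scores)), key=example_scores.__getitem__)
print(example_scores, best_index)
```

# A larger kappa favours uncertain candidates (exploration), while a
# smaller kappa favours candidates with a high predicted value (exploitation).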

To do

Add teach functionality to re-train models with newly acquired data
Implement metadata to show model performance
Store the names of selected features in a meaningful way
Improve the testing framework
Add an install option that makes gpytorch an optional dependency, because the CUDA libraries it pulls in are large
