Hansen solubility parameter (HSP) predictions

XGBoost and GNN training code and trained models for predicting Hansen solubility parameters, as used in https://doi.org/10.1016/j.chemolab.2024.105168. Dependencies:

  • Python packages: deepchem, mordred, pandas, hyperopt, rdkit, sklearn, xgboost
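
A quick way to confirm the environment is ready is to check that each listed package can be found by Python. The sketch below uses only the standard library; the helper name `missing_packages` is hypothetical and is not part of this repository.

```python
from importlib.util import find_spec

# Import names of the packages listed above.
REQUIRED = ["deepchem", "mordred", "pandas", "hyperopt", "rdkit", "sklearn", "xgboost"]


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be located in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```

This only checks that the packages are installed; it does not verify versions.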

Folders:

  • data: folder with all data used in the paper
  • trained_models: folder with trained models. For now only the XGBoost models are available on GitHub; the GNN models are not included because of their size. To use the trained GNN models, download them from this link.

Files:

XGBOOST related

  • XGBOOST_feature_generation.ipynb - Jupyter notebook with the code for generating descriptors (features) for the molecules, along with their initial filtering
  • XGBOOST_training.ipynb - Jupyter notebook for training and testing XGBoost models
  • XGBOOST_new_data_predictions.ipynb - Jupyter notebook for loading and applying/testing the XGBoost models on new data
  • SHAP_XGBOOST.ipynb - Jupyter notebook for SHAP plots
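
To give a feel for what an initial descriptor filtering step can look like, here is a minimal sketch using pandas. The rules (drop columns with missing or non-numeric values, then drop constant columns) and the function name `filter_descriptors` are assumptions for illustration; the actual criteria used in XGBOOST_feature_generation.ipynb may differ.

```python
import pandas as pd


def filter_descriptors(features: pd.DataFrame) -> pd.DataFrame:
    """Keep descriptor columns that are fully numeric and not constant (assumed rules)."""
    # Turn anything non-numeric (e.g. error placeholders) into NaN.
    numeric = features.apply(pd.to_numeric, errors="coerce")
    # Drop every column that has at least one missing value.
    complete = numeric.dropna(axis="columns", how="any")
    # Drop columns that take a single value for all molecules.
    return complete.loc[:, complete.nunique() > 1]


# Toy table with made-up values, one row per molecule.
toy = pd.DataFrame(
    {
        "MW": [18.0, 46.1, 78.1],
        "nRing": [0, 0, 1],
        "const": [1, 1, 1],
        "broken": [0.5, None, "err"],
    }
)
print(list(filter_descriptors(toy).columns))
```

In this toy table the `const` column carries no information and `broken` contains unusable entries, so only `MW` and `nRing` survive.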

GNN related

  • GNN_training_D/P/H.ipynb - Jupyter notebooks for training the GNN models for the D (dispersive component), P (polar component), or H (hydrogen-bond component) parameter. They are separate for readability, but the code is essentially the same in all three cases.
  • GNN_new_data_predictions.ipynb - Jupyter notebook for loading and applying/testing the GNN models on new data. Download the models from the link above first and put them in a convenient folder. The notebook uses trained_models/gnn, but any location works; just change the model_dir argument to the appropriate path.

Other

  • dataset_exploratory_analysis.ipynb - Jupyter notebook with the code and visualizations for exploratory analysis of the training and test datasets
  • plots.ipynb - Jupyter notebook with plots of the results (predictions)
  • sphere.py - Python script for drawing the Hansen sphere (a .py file rather than a notebook, so that the sphere can be rotated and adjusted for best visibility)
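
For readers unfamiliar with the Hansen sphere, the sketch below shows the standard distance used to decide whether a solvent falls inside a solute's sphere: Ra² = 4(δD₁ − δD₂)² + (δP₁ − δP₂)² + (δH₁ − δH₂)², with RED = Ra / R₀. The function names and the numeric values are illustrative assumptions, not code or data from sphere.py.

```python
import math


def hansen_distance(solute, solvent):
    """Hansen distance Ra between two (D, P, H) triples, in the same units as the inputs."""
    d1, p1, h1 = solute
    d2, p2, h2 = solvent
    # The dispersive term carries the conventional factor of 4.
    return math.sqrt(4.0 * (d1 - d2) ** 2 + (p1 - p2) ** 2 + (h1 - h2) ** 2)


def relative_energy_difference(solute, solvent, r0):
    """RED = Ra / R0; a value below 1 places the solvent inside the sphere of radius R0."""
    return hansen_distance(solute, solvent) / r0


# Made-up (D, P, H) values and radius, for illustration only.
solute = (18.0, 10.0, 7.0)
solvent = (16.0, 9.0, 5.0)
red = relative_energy_difference(solute, solvent, r0=8.0)
print(f"RED = {red:.2f}, inside sphere: {red < 1.0}")
```

Predicted D, P, and H values from the models can be passed in as the triples in the same way.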