Skip to content

Latest commit

 

History

History
102 lines (74 loc) · 3.83 KB

README.md

File metadata and controls

102 lines (74 loc) · 3.83 KB

Molecular Supervised Learning with GNNs and Bayesian Learning

Source codes for "A benchmark study on reliable molecular supervised learning via Bayesian learning" and "Comprehensive study on molecular supervised learning with graph neural networks"

Setup Python environment

for GPU usage, DGL requires CUDA 10.0.

# Install python environment
# For CPU usage,
conda env create -f environment_cpu.yml   
# For GPU usage,
conda env create -f environment_gpu.yml

# Activate environment
conda activate reliable_gnn

Download datasets

# At the root of the project
cd data/ 
bash script_download_datasets.sh

Reproducibility

1. Usage

1.1 In terminal

# Run the main file (at the root of the project)
# To run GIN on BACE dataset,
python main_classification.py --config configs/gin_bace.json # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json # for GPU

# To run GCN on BACE dataset,
python main_classification.py --config configs/gcn_bace.json # for CPU
python main_classification.py --gpu_id 0 --config configs/gcn_bace.json # for GPU

# To run GIN on BBBP dataset,
python main_classification.py --config configs/gin_bbbp.json # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bbbp.json # for GPU

# To run GIN on HIV dataset,
python main_classification.py --config configs/gin_hiv.json # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_hiv.json # for GPU

# To run GIN on TOX21 dataset,
python main_tox21.py --config configs/gin_tox21.json # for CPU
python main_tox21.py --gpu_id 0 --config configs/gin_tox21.json # for GPU

# To run GIN on BACE dataset with MCDropout,
python main_classification.py --config configs/gin_bace.json --mcdropout True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --mcdropout True # for GPU

# To run GIN on BACE dataset with SWA,
python main_classification.py --config configs/gin_bace.json --swa True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --swa True # for GPU

# To run GIN on BACE dataset with SWAG,
python main_classification.py --config configs/gin_bace.json --swag True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --swag True # for GPU

# To run GIN on BACE dataset with pSGLD,
python main_classification.py --config configs/gin_bace.json --psgld True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --psgld True # for GPU

# To run GIN on BACE dataset with Bayes By Backprop,
python main_classification.py --config configs/gin_bace.json --bbp True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --bbp True # for GPU

# To run GIN on BACE dataset with Checkpoint saved,
python main_classification.py --config configs/gin_bace.json --save_params True # for CPU
python main_classification.py --gpu_id 0 --config configs/gin_bace.json --save_params True # for GPU

The training and network parameters for each dataset and network is stored in a json file in the configs/ directory.

1.2 In jupyter notebook

# Run the notebook file (at the root of the project)
conda activate reliable_gnn 
jupyter notebook

Please use main_tox21.ipynb to run models on TOX21. Else, use main_classification.ipynb.

2. Output, checkpoints

Output results are located in the folder defined by the variable out_dir in the corresponding config file (eg. configs/molecules_graph_regression_GatedGCN_ZINC.json file).
If out_dir = 'out/bace_classification/', then

To see checkpoints and results,

  1. Go toout/bace_classification/results to view all result text files.
  2. Directory out/bace_classification/checkpoints contains model checkpoints.