This figure represents: Electron Clouds; Protein-Ligand Interactions; Latent Diffusion Process
conda env create -f ecloudgen_env.yml
conda activate ecloudgen
This environment has been successfully tested on CUDA==12.1
# numpy<2 is recommended
mamba create -n ecloudgen pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 plyfile pyg rdkit biopython easydict jupyter ipykernel lmdb mamba moleculekit openbabel scikit-learn scipy omegaconf einops accelerate h5py wandb xtb ignite gpytorch altair -c pytorch -c nvidia -c pyg -c conda-forge -c acellera
conda activate ecloudgen
# optional
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.2.0+cu121.html
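After installation, a quick check can confirm that the environment matches the notes above (numpy<2, CUDA 12.1). The following is a minimal sketch, not part of the ECloudGen code; the helper names `major_version`, `numpy_is_compatible`, and `report_environment` are hypothetical.

```python
from importlib import metadata


def major_version(version: str) -> int:
    """Return the leading integer of a version string such as '1.26.4'."""
    return int(version.split(".")[0])


def numpy_is_compatible(version: str) -> bool:
    """True when the version satisfies the numpy<2 recommendation."""
    return major_version(version) < 2


def report_environment() -> None:
    """Print installed versions of numpy and torch, plus CUDA details if available."""
    for pkg in ("numpy", "torch"):
        try:
            print(f"{pkg}: {metadata.version(pkg)}")
        except metadata.PackageNotFoundError:
            print(f"{pkg}: not installed")

    try:
        np_version = metadata.version("numpy")
        if not numpy_is_compatible(np_version):
            print("warning: numpy>=2 detected; numpy<2 is recommended")
    except metadata.PackageNotFoundError:
        pass

    try:
        import torch  # only available inside the ecloudgen environment
    except ImportError:
        return
    # torch.version.cuda is the CUDA version PyTorch was built against (None for CPU builds).
    print(f"torch built with CUDA: {torch.version.cuda}")
    print(f"CUDA device available: {torch.cuda.is_available()}")


if __name__ == "__main__":
    report_environment()
```

Run it inside the activated `ecloudgen` environment; a CPU-only build or numpy>=2 would show up directly in the printed report.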
You can download the raw data as provided in ResGen. You can also download the processed protein-ligand pairs from this link.
Note: index.pkl and split_by_name.pt are automatically downloaded with the SurfGen code. index.pkl stores the information of each protein-ligand pair, while split_by_name.pt stores the train-test split of the dataset.
tar -xzvf crossdocked_pocket10.tar.gz
# Then follow ./dataset/readme.md to process the protein-ligand dataset from scratch.
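Before processing, it can help to look at what index.pkl and split_by_name.pt actually contain. The sketch below makes no assumption about their internal layout beyond index.pkl being a standard pickle and split_by_name.pt being loadable with `torch.load`; the function names are hypothetical and the file locations in the usage note are placeholders.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and return (type name, number of entries or None).

    Only unpickle files from a source you trust.
    """
    with open(path, "rb") as fh:
        obj = pickle.load(fh)
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


def summarize_split(path):
    """Load a .pt file and report the size of each split, if it is a dict."""
    import torch  # imported lazily; requires the ecloudgen environment

    split = torch.load(path)
    if isinstance(split, dict):
        # Assumption: each value is a sized collection of pair names.
        return {key: len(value) for key, value in split.items()}
    return type(split).__name__
```

For example, `summarize_pickle("path/to/index.pkl")` and `summarize_split("path/to/split_by_name.pt")` report how many pairs are indexed and how they are divided between the splits.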
See ECloudGen/dataset/02_generate_ligecloud_data.py.
You can download the pretrained checkpoints and put them in ECloudGen/model_ckpts.
# modify the data path and batch_size in the ./configs/eclouddiff.yml
python generate_from_pdb.py --pdb_file ./play_around/peptide_example/7ux5_protein.pdb --lig_file ./play_around/peptide_example/7ux5_peptide.sdf --outputs_dir results
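To run the same command over several protein-ligand pairs, the call can be wrapped in a small driver. This is a sketch under assumptions: it uses only the three flags shown above (`--pdb_file`, `--lig_file`, `--outputs_dir`), the helper names are hypothetical, and writing each target to its own output subdirectory is a choice made here, not something the script requires.

```python
import subprocess
import sys
from pathlib import Path


def build_command(pdb_file, lig_file, outputs_dir):
    """Build the argument list for one generate_from_pdb.py run."""
    return [
        sys.executable, "generate_from_pdb.py",
        "--pdb_file", str(pdb_file),
        "--lig_file", str(lig_file),
        "--outputs_dir", str(outputs_dir),
    ]


def run_all(pairs, root="results", dry_run=True):
    """Run (or just print, when dry_run is True) one generation per pair.

    `pairs` is an iterable of (name, pdb_file, lig_file) tuples.
    """
    commands = []
    for name, pdb_file, lig_file in pairs:
        cmd = build_command(pdb_file, lig_file, Path(root) / name)
        commands.append(cmd)
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops the loop on the first failing run.
            subprocess.run(cmd, check=True)
    return commands


# The example pair shipped in play_around/peptide_example.
EXAMPLE_PAIRS = [
    ("7ux5",
     "./play_around/peptide_example/7ux5_protein.pdb",
     "./play_around/peptide_example/7ux5_peptide.sdf"),
]
```

Calling `run_all(EXAMPLE_PAIRS)` from the repository root prints the command without executing it; pass `dry_run=False` to launch the runs.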
Generate molecules from an ECloud. The ECloud can come from the ECloudDiff module or from hit ligands.
python generate_from_ecloud.py --input_ecloud play_around/example/BRD4_gen_ecloud.npy --model model_ckpts/ecloud_smiles_67.pkl --num_gen 100 --batch_size 32 --noise 0.6 --output output/test.sdf
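A quick sanity check on the input and output of this step can be sketched as follows. It assumes only that the input ECloud is a plain NumPy array saved as .npy and that the output is a standard SDF file, in which every record ends with a `$$$$` line; the function names are hypothetical and no particular grid shape is assumed.

```python
def count_sdf_records(sdf_path):
    """Count molecule records in an SDF file.

    Each record in the SDF format is terminated by a line containing only '$$$$'.
    """
    count = 0
    with open(sdf_path, "r", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            if line.strip() == "$$$$":
                count += 1
    return count


def describe_ecloud(npy_path):
    """Return (shape, dtype) of an ECloud array stored as .npy."""
    import numpy as np  # imported lazily so the SDF helper works without NumPy

    grid = np.load(npy_path)
    return grid.shape, str(grid.dtype)
```

For instance, `describe_ecloud("play_around/example/BRD4_gen_ecloud.npy")` shows the grid dimensions before generation, and `count_sdf_records("output/test.sdf")` can be compared with `--num_gen` afterwards; the two numbers may differ if some generated molecules are filtered out or fail to decode.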
The model-agnostic optimizer can be found at ECloudGen/play_around/model_agnostic_optimizer.ipynb
The training process is released as train.py; the following commands are an example of how to train a model.
# modify the data path and batch_size in the ./configs/eclouddiff.yml
python train_eclouddiff.py
python train_eclouddecipher.py