SMA-MD is a procedure for sampling the conformational space of molecules. SMA-MD first leverages Deep Generative Models to enhance the sampling of slow degrees of freedom. Then, the generated ensemble undergoes statistical reweighting, followed by short simulations.
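The reweighting step can be illustrated with self-normalized importance weights. The sketch below is a generic illustration and not the SMA-MD implementation. It assumes each generated conformer has a potential energy in kJ/mol and, optionally, a log-density under the generative model; the function names, the temperature, and the uniform-proposal fallback are all assumptions made for the example.

```python
import math

# k_B * T in kJ/mol at roughly 300 K (assumed temperature, not taken from the repository).
KT_KJ_PER_MOL = 2.494


def importance_weights(energies, log_q=None, kt=KT_KJ_PER_MOL):
    """Return normalized weights proportional to exp(-U_i / kT) / q_i.

    energies: potential energies U_i of the generated conformers (kJ/mol).
    log_q:    optional log-densities of the conformers under the generative
              model; if omitted, a uniform proposal is assumed.
    """
    if log_q is None:
        log_q = [0.0] * len(energies)  # assumption: uniform proposal density
    log_w = [-u / kt - lq for u, lq in zip(energies, log_q)]
    shift = max(log_w)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(lw - shift) for lw in log_w]
    total = sum(unnormalized)
    return [w / total for w in unnormalized]


def effective_sample_size(weights):
    """Kish effective sample size of a set of normalized weights."""
    return 1.0 / sum(w * w for w in weights)


if __name__ == "__main__":
    w = importance_weights([0.0, 2.5, 10.0])
    print("weights:", w)
    print("effective sample size:", effective_sample_size(w))
```

A low effective sample size relative to the number of conformers indicates that a few low-energy structures dominate the reweighted ensemble.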
The following versions of SMA-MD exist in this repository:
- v1.b is the label corresponding to the first beta version.
- Anaconda or Miniconda with Python 3.9.
- CUDA-enabled GPU.
SMA-MD uses two conda environments, e3nn-env and openmm-env. e3nn-env is used for training and sampling the generative model, and openmm-env is used for molecular dynamics related tasks. The environments have complicated dependencies, and direct installation can take a long time. Therefore, we recommend following the installation steps described below.
To set up e3nn-env, use ./environments/e3nn-env.yml and run:
conda env create -f environments/e3nn-env.yml
conda activate e3nn-env
pip3 install torch torchvision torchaudio
pip install torch_geometric
pip install pyg_lib torch_scatter torch_sparse torch_cluster torch_spline_conv -f https://data.pyg.org/whl/torch-2.1.0+cu121.html
pip install rdkit==2022.09.4 seaborn mdtraj pyaml networkx h5py e3nn
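Once the commands above have finished, a quick import check can catch a broken install early. The following is a minimal sketch to run inside the activated e3nn-env; the list of import names is an assumption derived from the packages installed above, and the helper `missing_modules` is hypothetical, not part of the repository.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
E3NN_ENV_MODULES = [
    "torch", "torch_geometric", "torch_scatter", "torch_sparse",
    "torch_cluster", "e3nn", "rdkit", "mdtraj", "h5py", "networkx", "seaborn",
]


def missing_modules(names):
    """Return the subset of module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(E3NN_ENV_MODULES)
    print("missing:", ", ".join(missing) if missing else "none")
```

The check only locates the modules and does not import them, so it will not detect a CUDA or binary mismatch between torch and the PyG wheels.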
To set up openmm-env, use ./environments/openmm-env.yml and run:
conda env create -f environments/openmm-env.yml
Then, in the SMA-MD directory, run:
git clone https://github.com/noegroup/reform.git
cd reform
pip install .
Before performing training or inference, preprocessing.py needs to be run. The surrogate model (torsional diffusion) can be trained with the script train.py using e3nn-env. Sampling from a trained model can be done with sample.py, also using e3nn-env. For a complete sampling procedure, one also needs to run energy_evaluation.py and md_finetuning.py with openmm-env.
The (hyper-)parameters and dataset paths/indexes can be specified in ./parameters.py.
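The order of the steps above can be sketched as a small driver script. This is a sketch under stated assumptions and not a script shipped with the repository: it assumes each script runs without command-line arguments (reading its settings from parameters.py), that preprocessing.py runs in e3nn-env, and that `conda run` is available on the PATH. The names `PIPELINE`, `build_commands`, and `run_pipeline` are hypothetical.

```python
import subprocess

# (conda environment, script) pairs in the order described above.
# Running preprocessing.py in e3nn-env is an assumption.
PIPELINE = [
    ("e3nn-env", "preprocessing.py"),
    ("e3nn-env", "train.py"),
    ("e3nn-env", "sample.py"),
    ("openmm-env", "energy_evaluation.py"),
    ("openmm-env", "md_finetuning.py"),
]


def build_commands(pipeline=PIPELINE):
    """Build one `conda run` command per step, as argument lists."""
    return [["conda", "run", "-n", env, "python", script]
            for env, script in pipeline]


def run_pipeline(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_pipeline(dry_run=True)
```

The dry-run default prints the commands without executing them, which makes it easy to confirm the environment and script pairing before starting a long training run.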
The MDQM9-nc dataset is available at https://github.com/olsson-group/mdqm9-nc-loaders. It contains mdqm9-nc.sdf, an SDF file with the molecules, and mdqm9-nc.hdf5, which holds the conformational data. Random splits are also provided.
@JuanViguera and @psolsson.
Contributions are welcome in the form of issues or pull requests. To report a bug, please submit an issue. Thank you to everyone who has used the code and provided feedback thus far.
If you use SMA-MD in your research, please reference our paper.
The references in BibTeX format are available below:
@article{Viguera_Diez_2024,
doi = {10.1088/2632-2153/ad3b64},
url = {https://dx.doi.org/10.1088/2632-2153/ad3b64},
year = {2024},
month = {apr},
publisher = {IOP Publishing},
volume = {5},
number = {2},
pages = {025010},
author = {Juan Viguera Diez and Sara Romeo Atance and Ola Engkvist and Simon Olsson},
title = {Generation of conformational ensembles of small molecules via surrogate model-assisted molecular dynamics},
journal = {Machine Learning: Science and Technology},
abstract = {The accurate prediction of thermodynamic properties is crucial in various fields such as drug discovery and materials design. This task relies on sampling from the underlying Boltzmann distribution, which is challenging using conventional approaches such as simulations. In this work, we introduce surrogate model-assisted molecular dynamics (SMA-MD), a new procedure to sample the equilibrium ensemble of molecules. First, SMA-MD leverages deep generative models to enhance the sampling of slow degrees of freedom. Subsequently, the generated ensemble undergoes statistical reweighting, followed by short simulations. Our empirical results show that SMA-MD generates more diverse and lower energy ensembles than conventional MD simulations. Furthermore, we showcase the application of SMA-MD for the computation of thermodynamical properties by estimating implicit solvation free energies.}
}
@misc{jing2023torsional,
title={Torsional Diffusion for Molecular Conformer Generation},
author={Bowen Jing and Gabriele Corso and Jeffrey Chang and Regina Barzilay and Tommi Jaakkola},
year={2023},
eprint={2206.01729},
archivePrefix={arXiv},
primaryClass={physics.chem-ph}
}
@Article {Boltzmann_gen,
author = {No{\'e}, Frank and Olsson, Simon and K{\"o}hler, Jonas and Wu, Hao},
title = {Boltzmann generators: Sampling equilibrium states of many-body systems with deep learning},
volume = {365},
number = {6457},
elocation-id = {eaaw1147},
year = {2019},
doi = {10.1126/science.aaw1147},
publisher = {American Association for the Advancement of Science},
issn = {0036-8075},
URL = {https://science.sciencemag.org/content/365/6457/eaaw1147},
eprint = {https://science.sciencemag.org/content/365/6457/eaaw1147.full.pdf},
journal = {Science}
}
SMA-MD is licensed under the MIT license and is free and provided as-is.