RDKit Conversion Issue #756

Closed
zachcp opened this issue Feb 14, 2025 · 3 comments

zachcp commented Feb 14, 2025

Hi @padix-key,

As a follow-up to zachcp/moltite#1, I just tried the new feature and the conversion gives me an empty stack. I can track this down if needed, but could you confirm whether this is reproducible on your end?

import rdkit
import biotite

print(biotite.__version__)
print(rdkit.__version__)
# 1.1.1.dev43+g4c7204af
# 2024.09.5


from rdkit import Chem
from biotite.interface import rdkit as irdk

# Example SMILES string
smiles = "CC(=O)OC1=CC=CC=C1"  # Phenyl acetate (aspirin without its carboxylic acid group)

# Convert SMILES to mol object
mol = Chem.MolFromSmiles(smiles)

biotite_mol = irdk.from_mol(mol)
biotite_mol

# the conversion gives me an empty stack...
# stack([ ])

zachcp commented Feb 14, 2025

I can confirm the above with this minimal environment, using Python 3.13.

# pixi.toml
[project]
authors = ["Zachary Charlop-Powers <>"]
channels = ["conda-forge"]
name = "test"
platforms = ["osx-arm64"]
version = "0.1.0"

[tasks]

[dependencies]
rdkit = ">=2024.9.5,<2025"
ipython = ">=8.32.0,<9"

[pypi-dependencies]
biotite = { git = "https://github.com/biotite-dev/biotite", rev = "4c7204afd07c489117243fd8ad4a67f787a4a820" }
# Then, in a shell, start IPython in this environment:
pixi run ipython
# paste code from above.


padix-key commented Feb 14, 2025

Hi, thanks for reporting. @Croydon-Brixton also encountered this problem (#741 (comment)) and has already prepared a fix (padix-key#13). The reason is that the AtomArrayStack contains one model per conformer, and a Mol freshly created from a SMILES string does not contain any conformer, so the resulting stack has zero models. With the fix from @Croydon-Brixton, the AtomArrayStack will instead have a coordinate array full of NaN values in this case.
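The missing conformer can be checked on the RDKit side before any conversion. The following is only a minimal sketch that uses RDKit alone; the `randomSeed` value is an arbitrary choice for reproducibility and is not part of the original report.

```python
from rdkit import Chem
from rdkit.Chem.rdDistGeom import EmbedMolecule

# A Mol parsed from SMILES carries only the molecular graph, no coordinates
mol = Chem.MolFromSmiles("CC(=O)OC1=CC=CC=C1")
n_before = mol.GetNumConformers()

# Adding hydrogen atoms and embedding creates a single 3D conformer
mol_h = Chem.AddHs(mol)
# randomSeed is an arbitrary value chosen here for reproducibility
conformer_id = EmbedMolecule(mol_h, randomSeed=42)
n_after = mol_h.GetNumConformers()

print(f"conformers before embedding: {n_before}")
print(f"conformers after embedding:  {n_after}")
```

`EmbedMolecule` returns -1 if the embedding fails, so checking `conformer_id` before converting is a cheap safeguard.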

To get actual coordinates, you have to predict them and (optionally) add hydrogen atoms.
For example, as in this snippet taken from the tutorial on the main branch:

    import biotite.interface.rdkit as rdkit_interface
    from rdkit.Chem import MolFromSmiles
    from rdkit.Chem.rdDistGeom import EmbedMolecule
    from rdkit.Chem.rdForceFieldHelpers import UFFOptimizeMolecule
    from rdkit.Chem.rdmolops import AddHs

    ERTAPENEM_SMILES = "C[C@@H]1[C@@H]2[C@H](C(=O)N2C(=C1S[C@H]3C[C@H](NC3)C(=O)NC4=CC=CC(=C4)C(=O)O)C(=O)O)[C@@H](C)O"

    mol = MolFromSmiles(ERTAPENEM_SMILES)
    # RDKit uses implicit hydrogen atoms by default, but Biotite requires explicit ones
    mol = AddHs(mol)
    # Create a 3D conformer
    conformer_id = EmbedMolecule(mol)
    UFFOptimizeMolecule(mol)
    ertapenem = rdkit_interface.from_mol(mol, conformer_id)
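To illustrate the "one model per conformer" layout described above, here is a rough sketch that uses only RDKit and NumPy. Ethanol, the number of conformers and the seed are arbitrary choices for illustration. The array is built by hand here and is not produced by Biotite; it only mirrors the (models, atoms, 3) shape of a stack's coordinates.

```python
import numpy as np
from rdkit import Chem
from rdkit.Chem.rdDistGeom import EmbedMultipleConfs

# Ethanol with explicit hydrogen atoms (arbitrary example molecule)
mol = Chem.AddHs(Chem.MolFromSmiles("CCO"))

# Generate several conformers; numConfs and randomSeed are arbitrary here
conformer_ids = list(EmbedMultipleConfs(mol, numConfs=3, randomSeed=42))

# Collect the coordinates of each conformer into one array,
# with one entry along the first axis per conformer
coord = np.stack(
    [mol.GetConformer(cid).GetPositions() for cid in conformer_ids]
)
print(coord.shape)
```

Based on the explanation above, converting such a `Mol` without selecting a single conformer would be expected to give a stack with one model per conformer, while a `Mol` without conformers gives none.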


zachcp commented Feb 14, 2025

Beautiful. Thank you.
