add notice for the atom map id in the rxn. #36

Merged
merged 5 commits into from
Jul 1, 2020

@autodataming autodataming commented Jun 28, 2020

The map ids in the rxn should be consecutive!

Issue #, if available: #33

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@mufeili mufeili self-requested a review June 28, 2020 07:41
The map ids in the rxn should be consecutive; otherwise you will hit [the molAtomMapNumber issue](https://github.com/awslabs/dgl-lifesci/issues/33).

To avoid the problem, use RDKit to convert the raw rxn SMILES with explicit hydrogen atoms to rxn SMILES without hydrogen atoms before adding map ids to the rxn SMILES.
Contributor

Can we give some examples of how to relabel atom mapping numbers using consecutive integers?

Contributor Author

For example, suppose the raw rxn SMILES is "[H]C([H])([H])Oc1ccc(CCNC=O)cc1OC([H])([H])[H]>>[H]C([H])([H])Oc1cc2c(cc1OC([H])([H])[H])CCN=C2". If you add atom maps to it directly with the RDT software, the mapped rxn SMILES is:

[CH2:1]([CH2:2][NH:10][CH:11]=[O:21])[c:8]1[cH:7][cH:6][c:5]([O:4][CH3:3])[c:12]([cH:9]1)[O:13][CH3:14]>>[CH:11]1=[N:10][CH2:2][CH2:1][c:8]2[cH:9][c:12]([O:13][CH3:14])[c:5]([cH:6][c:7]12)[O:4][CH3:3]

The oxygen atom is labelled 21 even though the molecule has only 15 heavy atoms, so the map ids 15 to 20 are skipped.
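A quick way to spot such a gap is to collect the map ids on the reactant side and list the integers that are missing. The snippet below is only a sketch: the helper name `missing_map_ids` is made up for illustration, it uses a plain regular expression rather than RDKit, and it assumes the reaction is written in the `reactants>>products` form.

```python
import re

# Matches the atom map id inside a bracket atom, e.g. the "21" in "[O:21]".
MAP_ID = re.compile(r":(\d+)\]")


def missing_map_ids(rxn_smiles):
    """Return the integers in 1..max that are not used as reactant map ids.

    Hypothetical helper for illustration; not part of dgl-lifesci or RDKit.
    """
    reactants = rxn_smiles.split(">>")[0]
    ids = {int(m) for m in MAP_ID.findall(reactants)}
    ids.discard(0)  # map id 0 is treated as "unmapped"
    if not ids:
        return []
    return [i for i in range(1, max(ids) + 1) if i not in ids]


mapped = (
    "[CH2:1]([CH2:2][NH:10][CH:11]=[O:21])[c:8]1[cH:7][cH:6][c:5]"
    "([O:4][CH3:3])[c:12]([cH:9]1)[O:13][CH3:14]"
    ">>[CH:11]1=[N:10][CH2:2][CH2:1][c:8]2[cH:9][c:12]([O:13][CH3:14])"
    "[c:5]([cH:6][c:7]12)[O:4][CH3:3]"
)
print(missing_map_ids(mapped))  # prints [15, 16, 17, 18, 19, 20]
```

An empty list means the reactant map ids already run from 1 to the largest id without gaps.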

First, convert the raw rxn SMILES to rxn SMILES without explicit hydrogen atoms.

```python
from rdkit import Chem


def canonicalize_smi(smi):
    """Return the canonical SMILES; parsing with RDKit drops explicit hydrogen atoms."""
    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError('Invalid SMILES: %s' % smi)
    return Chem.MolToSmiles(mol)


def canon_reaction(rxnstring):
    """Canonicalize every reactant and product of a 'reactants>>products' SMILES."""
    r, p = rxnstring.split('>>')
    rscans = [canonicalize_smi(reactant) for reactant in r.split('.')]
    pscans = [canonicalize_smi(product) for product in p.split('.')]
    # Reactants are sorted for a stable order; products keep their input order.
    rscan = '.'.join(sorted(rscans))
    pscan = '.'.join(pscans)
    return '%s>>%s' % (rscan, pscan)


rxnstring = '[H]C([H])([H])Oc1ccc(CCNC=O)cc1OC([H])([H])[H]>>[H]C([H])([H])Oc1cc2c(cc1OC([H])([H])[H])CCN=C2'
print(canon_reaction(rxnstring))
```

Then, add atom maps to the canonicalized reaction SMILES:

[O:15]=[CH:1][NH:2][CH2:3][CH2:4][c:5]1[cH:6][cH:7][c:8]([O:9][CH3:10])[c:11]([O:12][CH3:13])[cH:14]1>>[CH:1]1=[N:2][CH2:3][CH2:4][c:5]2[cH:14][c:11]([O:12][CH3:13])[c:8]([O:9][CH3:10])[cH:7][c:6]12

The oxygen atom is now labelled 15, so the map ids run consecutively from 1 to 15.
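If a reaction has already been mapped with gaps, another option is to renumber the existing map ids with consecutive integers instead of remapping. The sketch below is an assumption-laden illustration, not the approach taken in this PR: `relabel_map_ids` is a hypothetical helper that rewrites the ids with a regular expression, in order of first appearance from left to right, and reuses the same new id wherever an old id recurs so that the reactant and product atoms stay paired. Whether renumbering alone is enough for a given downstream tool has not been checked here.

```python
import re

# Matches the atom map id inside a bracket atom, e.g. the "21" in "[O:21]".
MAP_ID = re.compile(r":(\d+)\]")


def relabel_map_ids(rxn_smiles):
    """Renumber atom map ids as 1, 2, 3, ... in order of first appearance.

    Hypothetical helper for illustration; not part of dgl-lifesci or RDKit.
    """
    new_ids = {}

    def renumber(match):
        old = int(match.group(1))
        if old == 0:
            # Map id 0 is treated as "unmapped" and left untouched.
            return match.group(0)
        if old not in new_ids:
            new_ids[old] = len(new_ids) + 1
        return ":%d]" % new_ids[old]

    return MAP_ID.sub(renumber, rxn_smiles)


print(relabel_map_ids("[CH3:5][OH:9]>>[CH3:5][O-:9]"))
# prints [CH3:1][OH:2]>>[CH3:1][O-:2]
```

Because the substitution scans the string from left to right, reactant atoms receive the lowest ids, and any id that occurs only on the product side is numbered afterwards.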

Contributor

Thanks for the example. Is it possible for us to have a Python script that automatically performs:

  1. Canonicalize the rxn SMILES
  2. Add new atom mapping numbers

Contributor Author

Indigo supports a Python API, but its accuracy is worse.
RDT is better than Indigo, but it is a Java tool.
Other tools such as ChemAxon and rxnmapper need further evaluation.
I am not sure which tool is best for adding atom mapping numbers,
so I did not put "Add new atom mapping numbers" in the Python script.

@mufeili mufeili merged commit ec36fcb into awslabs:master Jul 1, 2020
@mufeili mufeili mentioned this pull request Jul 1, 2020