Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding resonance transitons for lone pairs #1348

Merged
merged 43 commits into from
Jun 3, 2018
Merged

Adding resonance transitons for lone pairs #1348

merged 43 commits into from
Jun 3, 2018

Conversation

alongd
Copy link
Member

@alongd alongd commented Apr 21, 2018

This PR adds new resonance transitions and introduces the resonance filtration function based on heuristics.

New resonance transitions added

lone_pair_multiple_bond_resonance_structures

Resonance structures formed by lone electron pair - multiple bond shifts.
Example: image image image

lone_pair_radical_multiple_bond_resonance_structures

Resonance structures formed by lone electron pair - radical - multiple bond shifts.
Example: image image image

N5dc_resonance_structures

Resonance structures formed by radical and lone pair shifts mediated by an N5dc atom (not to be confused with N5ddc_N5tc_resonance_structures).
Example: image image image
(in the above example, the two O atoms are "different" - this addresses issue #1293)

Other resonance modifications

  • adjacent_resonance_structures was renamed ally_delocalization_resonance_structures and elaborated for heteroatoms
  • N5dd_N5ts_resonance_structures was renamed N5ddc_N5tc_resonance_structures and elaborated for sulfur
  • lone_pair_radical_resonance_structures was elaborated for heteroatoms
  • Added respective resonance and pathfinder tests

Resonance filtration

This module contains methods for filtering a list of Molecules representing a single Species,
keeping only the representative structures.
The rules this module follows are (by order of importance):

  1. Minimum overall deviation from the Octet Rule (also elaborated for Dectet and Duodectet for hypervalance heteroatoms)
  2. Additional charge separation is only important for radical species, and is kept only if it makes a new radical site in the species
  3. If a structure must have charge separation, negative charges will be assigned to more electronegative atoms, and vice versa
  4. Opposite charges will be as close as possible to one another, and vice versa

Other modifications introduced by this PR

  • Added the reactive attribute to the Molecule class. If a non-representative molecule (one that would have been filtered out if formed by generate_resonance_structures()) is directly formed during the enlarge step, the non-representative Molecule is marked an reactive=False, and the respresentative structures are appended to the Molecule list. If, however, this species already exists (and doesn't contain the non-representative structure), the species are merged.
  • Added various tests
  • Added an electronegativity dictionary to the PeriodicSystem class
  • An iPython notebook for resonance generation was added

@alongd
Copy link
Member Author

alongd commented Apr 21, 2018

This PR fails on Travis due to aug-InChI issues

@alongd alongd force-pushed the resLP branch 2 times, most recently from e06389c to 91ee3d6 Compare April 27, 2018 13:25
@alongd alongd added the Status: Ready for Review PR is complete and ready to be reviewed label Apr 27, 2018
@alongd
Copy link
Member Author

alongd commented Apr 27, 2018

@mliu49, this is now ready for review.
Please advise regarding function deprecation (#1353)

@codecov
Copy link

codecov bot commented May 7, 2018

Codecov Report

Merging #1348 into master will decrease coverage by 0.03%.
The diff coverage is 30.76%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1348      +/-   ##
==========================================
- Coverage   41.54%   41.51%   -0.04%     
==========================================
  Files         173      174       +1     
  Lines       28537    28951     +414     
  Branches     5586     5716     +130     
==========================================
+ Hits        11857    12019     +162     
- Misses      15885    16121     +236     
- Partials      795      811      +16
Impacted Files Coverage Δ
rmgpy/molecule/atomtype.py 0% <ø> (ø) ⬆️
rmgpy/pdep/network.py 12.71% <0%> (-0.81%) ⬇️
rmgpy/molecule/inchi.py 0% <0%> (ø) ⬆️
rmgpy/molecule/translator.py 0% <0%> (ø) ⬆️
rmgpy/species.py 0% <0%> (ø) ⬆️
rmgpy/reaction.py 0% <0%> (ø) ⬆️
rmgpy/molecule/element.py 0% <0%> (ø) ⬆️
rmgpy/molecule/pathfinder.py 0% <0%> (ø) ⬆️
rmgpy/molecule/group.py 0% <0%> (ø) ⬆️
rmgpy/molecule/resonance.py 0% <0%> (ø) ⬆️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 60d5e53...d291869. Read the comment docs.

@alongd alongd force-pushed the resLP branch 3 times, most recently from d7426f2 to 1ede57d Compare May 9, 2018 02:06
Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall comments:

  1. In for loops, enumerate() is generally cleaner and more pythonic than xrange(len()). I commented on a few cases, but I think every instance you added could potentially be replaced by enumerate.
  2. The filtration algorithm seems pretty involved. Have you had a chance to evaluate the overall effect this PR has on performance of generate_resonance_structures?
  3. What's the final status of augmented InChI? Is it still broken/deprecated?
  4. How do you think Chemical identity comparison for Molecules and Species #1329 would affect the changes in this PR, if at all?
  5. Could you run RMG-tests?

Thanks for the awesome work!

@@ -147,6 +147,7 @@ def deflateReaction(rxn, molDict):
object are populated with the value of the dictionary, either an
integer index, or either a Species object.
"""
rxn.generatePairs()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pair generation is typically done by KineticsFamily.getReactionPairs(), except for bimolecular families which are not hard-coded, in which case we recently allowed falling back to Reaction.generatePairs() which is done in CERM.makeNewReaction().

Because Reaction.generatePairs() resets the pairs, I think it's not advisable to call that here since it will overwrite the pairs generated by the kinetics family. I think it might be better to simply catch the error (which I assume occurs when rxn.pairs is empty or none?) and save a more appropriate value.

if ((atom1.isNitrogen() and atom1.radicalElectrons >= 1 and atom1.lonePairs in [0, 1])
or (atom1.isOxygen() and atom1.radicalElectrons >= 1 and atom1.lonePairs in [1, 2])
or (atom1.isSulfur() and atom1.radicalElectrons >= 1 and atom1.lonePairs in [0, 1, 2])):
for atom2, _ in atom1.edges.items():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use keys() or iterkeys() here if you only need the atoms.


def generate_adjacent_resonance_structures(mol):
def generate_ally_delocalization_resonance_structures(mol):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Allyl is misspelled (unless you used ally intentionally).

@@ -879,7 +879,7 @@ def sortAtoms(self):
for index, vertex in enumerate(self.vertices):
vertex.sortingLabel = index

def update(self):
def update(self, log_species_while_updating_atom_types=True):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps just naming it log_species would be sufficient? This seems a bit long.

logging.debug('A net charged species was formed when reacting {0} to form {1} in'
' reaction family {2}. Not generating this reaction.'.format(
reactant_net_charge,product_net_charge,self.label))
return None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this section could be better integrated with the section at line 1322 to avoid repetition.

# Reverse direction (the direction in which kinetics is not defined)
reactionList.extend(self.__generateReactions(reactants, products=products, forward=False, prod_resonance=prod_resonance))

if all([mol.reactive for mol in reactants]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? It's checking that all the molecules are reactive. Also, this method is intended to accept either a list of molecules or a list of lists of molecules, so it would be nice if that still worked.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this is wrong and will be addressed in a fixup

},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAEMAAAAZCAIAAADlvQS1AAAABmJLR0QA/wD/AP+gvaeTAAADqUlE\nQVRYhd2YT0jbUBzHX2VuZKWSmtB2soN/wGKFzlpQpChIFaSReRp4EQZjiJdimZBDT4KTHAseXA5F\nxLIi9SBFT7WgYkVklYk6dVgQUbSS2uJoamv17ZCZxvpnmU3d2OfU3/e993u/b/L6S1oZhBD8FxT8\nxb2dTmckEpEq219zkkgkNjc3dTrd0NBQOp2WICN8dFiW7e/vx3H84OBga2vLYrFotdrp6ekc0z62\nE6/XW1ZWVlNTQ5Lk2dkZJ/p8Pp1O19LSsrGx8eDMj+dkZWWlqanJYDDMzc3t7u52dHRUVFRMTk5y\no6lUyuFw4DhutVpjsdgD8j+GE4ZhrFarSqVyOBzpdJrX/X6/Xq9vbm5eXV29f6YY8uvkt1f64uJi\ndHRUrVZ3dXUdHx9zYjAYbGxsrK2tnZ+fF79XHp3wp399ff3+mScnJyRJqtVqiqKSySQner3e0tLS\n9vb2cDgsZru8ONne3iYIorKycmpqSvyqm32MZVmXyyXymF1zUl1dPTY2xodut1ur1d61cmJior6+\nHkEQHMctFksgEOD0nZ2drKv7R/h8vqqqqtv72OfP8NUr+OwZVKng+/cwEpHACU3TCoWCpun9/f1o\nNOrxeLq7u7mh8/NzkefhLvhvF0mSGfXTJ4ii0O2GsRj89g22tUGDAV718Qc6YVkWRVGapoXi5eVl\nLtXfhGEYu93+K0gkoFIJnc7McDwONRooqOEhbyvBYDAWi3V2dgpFmUwmwRuHAAzDBgYGfgVfvoBo\nFLx5kxl+/hy8fg18Pl54krW+p6ent7eX+5xKpUpKSm7uEYlE5HJ5UVGRpJXfC8MAuRwoFNfEFy/A\n9+98lH1PBgcHv15BURQnLiwsNDQ0mEwmm80GAMAwLB6Pn56e5rd6ITgO4nHw48c18fAQ4DgfZTtR\nKpUvryguLubE8vLy2dnZQCCwt7e3trZmNBpRFB0fHxcuhHn9nWM0AhQFHk9GYVng9QKzmReyT9et\n8GessLCwoKAAQRCKovr6+mQyGUEQCIL4/f6ZmZnh4WFJyxeAIODjR/DhA5DLQVsbODwENhtQqcDb\nt5k5wnZxf+9aXl62WCx86PF46urquOcJQRCLi4vS9q5bcLmgXg+fPoU4Dt+9gwwjHBT7jA+HwyaT\n6ejoKA8FSoMoJ8lksrW1NRgM5ruaXBDlZGRkRKPRmM1ms9m8tLTE66FQKBQK5a22P0MGc+g5GIYB\nACT8VyEXRPWuu7Db7VLVkTs53ZN/ip+ygYFFORXIAQAAAABJRU5ErkJggg==\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. I'm not sure if this should be part of RMG-Py.
  2. If we decide to add it, could you clear the output cells before committing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm clearing the output cells.
I don't mind not including this notebook (tell me what you think, I'm OK with both ways). I think we previously discussed adding a resonance generation feature to the website, let's discuss it and decide whether or not to ha a filtration switch (or filter by default). It could be an excellent replacement for this notebook.

if (((atom.isNitrogen() or atom.isSulfur()) and atom.lonePairs in [1, 2, 3])
or (atom.isOxygen() and atom.lonePairs in [2, 3])):
return True
return False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could directly return the value of your boolean expression (also for gain_lone_pair).

Also, loose should be lose.

# Remake resonance structures with new labels
if resonance:
species.generate_resonance_structures(keepIsomorphic=True)
if all([mol.reactive for mol in species.molecule]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are potential cases where there might be an unreactive molecules when calling this method? Independent atom IDs are pretty important for degeneracy, so depending on the situation, a more comprehensive fix might be better.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Molecules with a reactive=False flag get generated, but are not allowed to further react in families. However, we do calculate degeneracy for these reactions. My concern here was a line below that resets the molecule list (species.molecule = [mol]), removing all unreactive molecules. Even if we re-generate resonance structures, we won't regenerate these unreactive molecule back. I modified this that condition so that unreactive molecules are appended if needed.

elif resonance:
# IDs are already independent, generate resonance structures if needed
for species in input_species:
species.generate_resonance_structures(keepIsomorphic=True)
if len(species.molecule) == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is already done by Species.generate_resonance_structures().

@alongd
Copy link
Member Author

alongd commented May 17, 2018

Thanks for the helpful comments, @mliu49 !
I addressed them in fixup commits.

  1. I'm new to autosquash, so the xrange(len()) to enumerate() changes already got squashed in by mistake.
  2. I think that the filtration algorithm doesn't have a significant effect on timings, I'll give some figures soon.
  3. Aug-InChI was solved via 024b080. Please see the commit message. Briefly, InChI represents delocalized species, so there's no need to generate resonance structures for it.
  4. Do you have particular concerns re Chemical identity comparison for Molecules and Species #1329? Let's try rebasing them on a dummy branch and check.
  5. I ran the tests, will add the report
  6. Regarding deflateReaction in rmg/react,py: I implemented an error capture with rxn.generatePairs() in it, please let me know if and how to implement KineticsFamily.getReactionPairs().

@alongd
Copy link
Member Author

alongd commented May 18, 2018

Here's a comparison of number of species and resonance structure generation time for this brunch (w/ and w/o filtration) vs. master:

Species number of structures before filtration time (ms) number of structures after filtration time inc. filtration (ms) number of structures on master time on master (ms)
NO2 8 1.03 2 1.34 2 0.23
N2O 4 0.73 2 0.80 2 0.16
N(N)[N+](=O)[O-] 4 1.05 1 1.13 1 0.32
[CH2]N=O 7 1.53 3 1.68 3 0.32
C=[C]C(=O)O[O] 3 1.52 2 1.68 2 0.25
[CH]1C=NC=N1 7 5.15 3 5.3 3 1.92
C1=C[CH]CC1([O]) 2 1.62 2 1.87 2 1.0
HSO 2 1.13 2 1.31 1 0.03
[NH]N=S=O 62 21.97 4 23.3 2 0.2

The overall time is indeed higher (about a factor of 5), but most of it is spent in generating the new structures (using new transitions not available on current master). The filtration time (time inc. filtration - time in the table) looks reasonable. For the extreme case of [NH]N=S=O (with lots of structures to filter, but also generate) it's ~6% of the overall time.

Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fixes!

  • It looks like filtration doesn't take that long. The increase in time for resonance structure generation overall is a bit concerning though. I'm not sure that there's much we can do about it though.
  • Regarding augmented InChI, the fact that we were augmenting them with electron positions was already going against the philosophy of InChI. I think Nick used resonance structure generation to try and get canonical numbering for the electrons. I don't think removing it should have a significant impact though.
  • Regarding Chemical identity comparison for Molecules and Species #1329, I was mainly thinking about the checkForExistingSpecies method. In that PR, I completely replace it with InChI comparison. I think that could greatly simplify the changes necessary to that method in this PR.

rxn.pairs = [(molDict[reactant.molecule[0]], molDict[product.molecule[0]]) for reactant, product in rxn.pairs]
except ValueError:
rxn.generatePairs()
rxn.pairs = [(molDict[reactant.molecule[0]], molDict[product.molecule[0]]) for reactant, product in rxn.pairs]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was actually thinking that you could just set rxn.pairs = None if you encounter an error, since the pairs will be generated later if they don't exist (here).

for atom in self.vertices:
if atom.isHydrogen() and atom.label == '':
hydrogens.append(atom)
hydrogens = filter(lambda at: at.isHydrogen(), self.vertices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this loses the check on the atom label. Was that intentional?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unintentional. I'm dropping this small commit

def addReverseAttribute(self, rxn):
reactants = [reactant if isinstance(reactant, list) else [reactant] for reactant in reactants]
for reactant in reactants:
if not all([mol.reactive for mol in reactant]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might not have been clear about the list of lists. In the typical case, this takes reactants as [reactantA, reactantB]. However, it can also properly handle [[A_res1, A_res2], [B_res1, B_res2]], where the sublists are lists of resonance structures for each reactant. I think I was also confused in my original comment about this part; it was correct to check that all the reactants are reactive.

Now that I think about it more though, it may be unnecessary do the check here at all. Since you already check in __generateReactions() where the actual generation is done, I don't think you need to change this method at all.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I'll revert the original change

rmgpy/species.py Outdated
for species in species_list:
if isinstance(species, Species):
if self.isIsomorphic(species):
return True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could also do return self.isisomorphic(species).

rmgpy/species.py Outdated
return True
else:
raise ValueError('Unexpected value "{0!r}" for species_list parameter;'
' should be a List of Species objects.'.format(species))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use TypeError?

@alongd
Copy link
Member Author

alongd commented May 19, 2018

Thanks for the comments @mliu49! I pushed additional fixups.
Currently re-running tests

@mliu49
Copy link
Contributor

mliu49 commented May 30, 2018

Could you convert the two assertions Codacy flagged into exceptions as well? Then I think you can go ahead and squash all of the fixups.

@alongd
Copy link
Member Author

alongd commented May 30, 2018

Thanks - dealt with the two assertions, squashed and rebased

if isinstance(struc, Molecule):
struc.update()
else:
struc.resetRingMembership()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you added this following my earlier comment; could you also move Xiaorui's additions up to here and remove the section at line 1334?

@@ -1526,7 +1554,7 @@ def addReverseAttribute(self, rxn):
self.forbidden = ForbiddenStructures() # Initialize with empty one
try:
reactionList = self.__generateReactions([spc.molecule for spc in rxn.products],
products=rxn.reactants, forward=True)
products=rxn.reactants, forward=True, react_non_reactive=react_non_reactive)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really minor, but I think it would be better to put the react_non_reactive argument on the next line and keep the indentation inline with the open parenthesis like before.

else:
# For noncyclics, default to original algorithm of ordering thermo based on the most stable enthalpy
H298 = numpy.array([t.getEnthalpy(298.) for t in thermoDataList])
indices = H298.argsort()
indices = numpy.array([indices[i] for i in xrange(len(indices)) if species.molecule[indices[i]].reactive] +\
[indices[i] for i in xrange(len(indices)) if not species.molecule[indices[i]].reactive])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like you need xrange at all, and could just do for i in indices.

for mol in filtered_list:
logging.info('Structure: {0}\n{1}Reactive: {2}'.format(mol.toSMILES(),mol.toAdjacencyList(),mol.reactive))
logging.info('\n')
raise AssertionError('Each species must have at least one reactive structure. Something probably went wrong'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it would be a good idea to create a ResonanceError class to be used here.

@@ -376,6 +376,8 @@ def _generate_minimum_resonance_isomer(mol):
Next, we return the candidate with the lowest unpaired electrons metric.

The metric is a sorted list with indices of the atoms that bear an unpaired electron

This function is currently depreciated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spelling of deprecated. Also, it would be good to mention why.

I think you can add a deprecation warning similar to f7dd2f1 for now.

return True, True, spec

# As a last resort, check using molecule.fingerprint if the obj matches any existing species,
# and if it does, generate resonance structures w/o filtration and check for isomorphism
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like you're checking fingerprints as the comment says.

@alongd alongd self-assigned this May 30, 2018
@alongd
Copy link
Member Author

alongd commented May 31, 2018

@mliu49, I agree with all of your comments, I addressed them in fixup commits
Thanks!

indices = numpy.array([indices[i] for i in xrange(len(indices)) if species.molecule[indices[i]].reactive] +\
[indices[i] for i in xrange(len(indices)) if not species.molecule[indices[i]].reactive])
indices = numpy.array([i for i in indices if species.molecule[indices[i]].reactive] +\
[i for i in indices if not species.molecule[indices[i]].reactive])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think indices[i] should be replaced by i as well.

alongd added 10 commits June 2, 2018 16:24
If resonance structures were already generated for `new_item`, they may
contain non-reactive structures that we'd like to keep. Re-generating
resonance structures could remove them. This makes sure that resonance
structures would only be generated for Species that contain only
reactive structure.
If `saturateH` is `True`, then saturate `mol` with hydrogens before
generating the resonance structures and remove the hydrogens before
returning `isomorphic_isomers`. This is useful when resonance structures
are generated for molecules in which all hydrogens were intentionally
removed as in generating augInChI. Otherwise, RMG will probably get many
of the lonePairs and partial charges in a molecule wrong.
@alongd
Copy link
Member Author

alongd commented Jun 2, 2018

Here are the test results for this branch:
resLP.txt

There were no changes for eg1, eg3, eg6, eg7, and solvent_hexane


In eg5, the order in which networks were generated was different, so we get things like:

Non-identical kinetics!
original:
rxn: C=C(16) + C=CC(30) => [CH2]CC([CH2])C(66)		origin: PDepNetwork #40
tested:
rxn: C=CC(30) + C=C(16) => [CH2]CC([CH2])C(66)		origin: PDepNetwork #40
k(1bar)|300K   |400K   |500K   |600K   |800K   |1000K  |1500K  |2000K  

k(T):  | -29.36| -21.27| -16.40| -13.13|  -8.99|  -6.47|  -3.07|  -1.35
k(T):  | -58.17| -42.17| -32.52| -26.04| -17.83| -12.83|  -6.08|  -2.69

Kinetics: Chebyshev(coeffs=[[-9.141,2.483,-0.01289,-0.007827],[14.85,0.01142,0.008461,0.005105],[0.1403,0.0008223,0.0006267,0.0003947],[0.03315,0.0005011,0.0003743,0.0002288],[0.001036,0.0003139,0.0002346,0.0001435],[-0.007855,0.0002016,0.0001507,9.213e-05]], kunits='cm^3/(mol*s)', Tmin=(300,'K'), Tmax=(3000,'K'), Pmin=(0.001,'atm'), Pmax=(98.692,'atm'))
Kinetics: Chebyshev(coeffs=[[-23.02,-0.0164,-0.0122,-0.007413],[29.41,0.01176,0.008717,0.005266],[0.2732,0.001358,0.001026,0.0006379],[0.05007,0.0007712,0.0005766,0.0003528],[-0.007379,0.0003885,0.0002909,0.0001783],[-0.02062,0.0001704,0.0001278,7.853e-05]], kunits='cm^3/(mol*s)', Tmin=(300,'K'), Tmax=(3000,'K'), Pmin=(0.001,'atm'), Pmax=(98.692,'atm'))
Identical kinetics comments

Non-identical kinetics!
original:
rxn: C=C(16) + C=CC(30) => [CH2]CC([CH2])C(66)		origin: PDepNetwork #39
tested:
rxn: C=CC(30) + C=C(16) => [CH2]CC([CH2])C(66)		origin: PDepNetwork #39
k(1bar)|300K   |400K   |500K   |600K   |800K   |1000K  |1500K  |2000K  

k(T):  | -58.17| -42.17| -32.52| -26.04| -17.83| -12.83|  -6.08|  -2.69
k(T):  | -29.36| -21.27| -16.40| -13.13|  -8.99|  -6.47|  -3.07|  -1.35

Kinetics: Chebyshev(coeffs=[[-23.02,-0.0164,-0.0122,-0.007413],[29.41,0.01176,0.008717,0.005266],[0.2732,0.001358,0.001026,0.0006379],[0.05007,0.0007712,0.0005766,0.0003528],[-0.007379,0.0003885,0.0002909,0.0001783],[-0.02062,0.0001704,0.0001278,7.853e-05]], kunits='cm^3/(mol*s)', Tmin=(300,'K'), Tmax=(3000,'K'), Pmin=(0.001,'atm'), Pmax=(98.692,'atm'))
Kinetics: Chebyshev(coeffs=[[-9.141,2.483,-0.01289,-0.007827],[14.85,0.01142,0.008461,0.005105],[0.1403,0.0008223,0.0006268,0.0003947],[0.03315,0.0005011,0.0003743,0.0002288],[0.001035,0.0003139,0.0002346,0.0001435],[-0.007855,0.0002016,0.0001507,9.213e-05]], kunits='cm^3/(mol*s)', Tmin=(300,'K'), Tmax=(3000,'K'), Pmin=(0.001,'atm'), Pmax=(98.692,'atm'))
Identical kinetics comments

@mliu49, I was wrong yesterday to think that only the first network reaction instance gets included in the model, RMG seems to be doing the right thing here.


In the NC test, this branch corrected the representative structure of NNOH, so the test treats them as different species.


The methane test (with reactive N2) had some discrepancies, with 4 species not included in the new model, and 2 additional species. Some kinetics are different due to two reasons:

  • different sorting of resonance structures, e.g.:
Non-identical kinetics!
original:
rxn: CH2CHO(49) + oxygen(3) <=> HO2(8) + CH2CO(25)		origin: Disproportionation
tested:
rxn: CH2CHO(49) + oxygen(3) <=> CH2CO(25) + HO2(8)		origin: Disproportionation
k(1bar)|300K   |400K   |500K   |600K   |800K   |1000K  |1500K  |2000K  

k(T):  |   1.48|   2.55|   3.20|   3.63|   4.17|   4.49|   4.92|   5.13
k(T):  |  -3.79|  -1.32|   0.16|   1.15|   2.38|   3.12|   4.11|   4.60

Kinetics: Arrhenius(A=(6.02974e+11,'cm^3/(mol*s)'), n=0, Ea=(5.908,'kcal/mol'), T0=(1,'K'), comment="""Estimated using template [O2b;XH_s_Rrad] for rate rule [O2b;COpri_Csrad]
Kinetics: Arrhenius(A=(1.2044e+12,'cm^3/(mol*s)'), n=0, Ea=(13.55,'kcal/mol'), T0=(1,'K'), comment="""Estimated using template [O2b;Cdpri_Rrad] for rate rule [O2b;Cdpri_Orad]
kinetics: Estimated using template [O2b;XH_s_Rrad] for rate rule [O2b;COpri_Csrad]
kinetics: Estimated using template [O2b;Cdpri_Rrad] for rate rule [O2b;Cdpri_Orad]
  • Different reaction enthalpy change when a more representative resonance structure was found with lower H298, e.g:
Non-identical kinetics!
original:
rxn: O(T)(4) + N2O(34) <=> [O-][N+]=[N+][O-](105)		origin: R_Addition_MultipleBond
tested:
rxn: N2O(34) + O(T)(4) <=> [O][N]N=O(105)		origin: R_Addition_MultipleBond
k(1bar)|300K   |400K   |500K   |600K   |800K   |1000K  |1500K  |2000K  

k(T):  |  -6.42|  -4.08|  -2.80|  -2.03|  -1.22|  -0.86|  -0.64|  -0.75
k(T):  | -42.05| -30.80| -24.18| -19.85| -14.58| -11.55|  -7.77|  -6.09

Kinetics: Arrhenius(A=(9.99e+25,'cm^3/(mol*s)'), n=-5.73, Ea=(16.784,'kcal/mol'), T0=(1,'K'), comment="""Estimated using an average for rate rule [N3t_R;O_atom_triplet]
Kinetics: Arrhenius(A=(9.99e+25,'cm^3/(mol*s)'), n=-5.73, Ea=(65.694,'kcal/mol'), T0=(1,'K'), comment="""Estimated using an average for rate rule [N3t_R;O_atom_triplet]
kinetics: Estimated using an average for rate rule [N3t_R;O_atom_triplet]
kinetics: Estimated using an average for rate rule [N3t_R;O_atom_triplet]

The updated thermo, however, isn't alwas better. e.g.:

Non-identical thermo!
original:	[O-][N+]=[N+][O-]
tested:	[O][N]N=O
Hf(300K)  |S(300K)   |Cp(300K)  |Cp(400K)  |Cp(500K)  |Cp(600K)  |Cp(800K)  |Cp(1000K) |Cp(1500K) 
     57.60|     78.45|     16.64|     17.44|     18.01|     18.75|     21.18|     23.73|     25.71
    104.32|     27.82|      4.01|      3.87|      4.01|      4.23|      4.46|      4.37|      4.55
thermo: Thermo group additivity estimation: group(O2s-CsCs) + group(O2s-CsCs) + group(N3s-CsHH) + group(N3s-CsHH) + radical(NHJ_C) + radical(NHJ_C)
thermo: Thermo group additivity estimation: group(N3s-CsHH) + radical(CsOJ) + radical(NHJ_C)

[O-][N+]=[N+][O-] turned into a more representative form, [O][N]N=O, but now only one N group is recognized. It's a step in the right direction, but GAV must be update.
There were many changes to the edge.


The Methylcyclohexane (MCH) test had some difference in species, probably due to the only core reaction kinetics difference:

Non-identical kinetics!
original:
rxn: [CH2]C=CC=C[CH2](406) <=> C=CC1C=CC1(530)		origin: Birad_recombination
tested:
rxn: [CH2]C=CC=C[CH2](406) <=> C=CC1C=CC1(530)		origin: Birad_recombination
k(1bar)|300K   |400K   |500K   |600K   |800K   |1000K  |1500K  |2000K  

k(T):  |  11.29|  11.62|  11.82|  11.95|  12.11|  12.21|  12.34|  12.41
k(T):  |  10.31|  10.64|  10.82|  10.94|  11.08|  11.16|  11.25|  11.29

Kinetics: Arrhenius(A=(4e+12,'s^-1'), n=0, Ea=(1.8,'kcal/mol'), T0=(1,'K'), comment="""Estimated using template [Rn;C_rad_out_H/OneDe;Cpri_rad_out_2H] for rate rule [R4_SDS;C_rad_out_H/OneDe;Cpri_rad_out_2H]
Kinetics: Arrhenius(A=(3.24e+12,'s^-1'), n=-0.305, Ea=(1.98,'kcal/mol'), T0=(1,'K'), comment="""Estimated using template [R4;C_rad_out_2H;Cpri_rad_out_single] for rate rule [R4_SDS;C_rad_out_2H;Cpri_rad_out_H/OneDe]
kinetics: Estimated using template [Rn;C_rad_out_H/OneDe;Cpri_rad_out_2H] for rate rule [R4_SDS;C_rad_out_H/OneDe;Cpri_rad_out_2H]
kinetics: Estimated using template [R4;C_rad_out_2H;Cpri_rad_out_single] for rate rule [R4_SDS;C_rad_out_2H;Cpri_rad_out_H/OneDe]

Seems like labels *1 and *2 switched places here. I'm not too familiar with this family, but this is perhaps due to different sorting of the resonance structures.


The new H2S test crashed because of incorrect parsing of library names, I'll try to deal with it separately.

@mliu49
Copy link
Contributor

mliu49 commented Jun 3, 2018

Thanks for looking at the tests! I think this is ready to be merged. The final thing (which I almost forgot about) is whether or not to keep generateResonanceStructures.ipynb. The main reason that I'm hesitant is that I don't think it's necessarily useful for users. How would you feel about relocating it to the Work-In-Progress_scripts repo?

@alongd
Copy link
Member Author

alongd commented Jun 3, 2018

No problem relocating it.
Eventually (if you'll have time) it would be great to add this feature to the website (with and without filtration)

alongd and others added 7 commits June 2, 2018 21:49
As clearly stated in the InChI paper:
"Mesomerism is effectively eliminated in InChI. Mesomers have the same
InChIs (this is true for all possible InChI layouts of layers).
Actually, this is very natural. Mesomeric structures of a molecular
entity have the same basic connectivity but differ in bond orders, and
maybe by having atomic charges on different atoms. InChI does not use
bond orders and does not place charges on particular atoms; the
placement of hydrogen atoms in a mesomeric system, which would be
important for InChI, is always the same."
Source: InChI, the IUPAC International Chemical Identifier,
J. Cheminform 2015, 7, 23, doi: 10.1186/s13321-015-0068-4
Because training reactions are often entered with a specific
resonance form, this prevents changes to resonance structure
generation from affecting addition of rules from training sets.
@alongd alongd removed the Status: Ready for Review PR is complete and ready to be reviewed label Jun 3, 2018
@alongd
Copy link
Member Author

alongd commented Jun 3, 2018

Notebook removed. It should be ready to go now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants