Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed the solvation GAV heuristic #1832

Merged
merged 3 commits into from
Dec 3, 2019
Merged

fixed the solvation GAV heuristic #1832

merged 3 commits into from
Dec 3, 2019

Conversation

yunsiechung
Copy link
Contributor

Motivation or Problem

Solvation group additivity was estimating solute data by averaging over all molecules in species.molecule list (so it was averaging over all resonance structure). However, when the solvation solute groups were fitted from the experimental values, they were fitted using the SMILES of the most stable resonance structure in gas phase. Therefore, averaging over all resonance structures is not a correct way. Instead, it should be estimating solute data from only the most stable structure found by gas-phase.

Extra Note:
Once might be concerned that the most stable resonance structure might change in liquid phase. But the change in the resonance structures when a species is solvated is already reflected in the experimental solvation energies, which are measured for a certain species in general, not for its particular resonance structures. Since we fitted the solute group to these experimental solvation energies using the SMILES of the most stable gas phase resonance structures, we should be consistent with using the same SMILES when we estimate the solute data.

Description of Changes

In def get_solute_data_from_groups, instead of iterating over all molecules and averaging, it only calculates the solute data for the first item in the molecule list, species.molecule[0] as it is the SMILES of the most stable gas phase resonance structure found.

Testing

I did not add any test since the unittest for get_solute_data_from_groups already exists.
I can add more unittests if necessary.

Reviewer Tips

Suggestions for verifying that this PR works or other notes for the reviewer.

@codecov
Copy link

codecov bot commented Nov 26, 2019

Codecov Report

Merging #1832 into master will decrease coverage by 0.04%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1832      +/-   ##
==========================================
- Coverage   43.98%   43.94%   -0.05%     
==========================================
  Files          83       83              
  Lines       21581    21565      -16     
  Branches     5653     5652       -1     
==========================================
- Hits         9492     9476      -16     
+ Misses      11043    11029      -14     
- Partials     1046     1060      +14
Impacted Files Coverage Δ
rmgpy/data/solvation.py 63.78% <100%> (-1.34%) ⬇️
arkane/kinetics.py 12.14% <0%> (ø) ⬆️
rmgpy/data/statmech.py 42.2% <0%> (ø) ⬆️
rmgpy/rmg/pdep.py 12.21% <0%> (ø) ⬆️
rmgpy/data/kinetics/family.py 48.35% <0%> (ø) ⬆️
rmgpy/statmech/ndTorsions.py 59.78% <0%> (ø) ⬆️
rmgpy/yml.py 15.71% <0%> (ø) ⬆️
arkane/sensitivity.py 10% <0%> (ø) ⬆️
rmgpy/data/kinetics/database.py 50.61% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a87d79f...cbf446f. Read the comment docs.

Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me. Does the existing unit test check values? Do you expect substantial changes in the resulting solvation corrections with this change?

@yunsiechung
Copy link
Contributor Author

yunsiechung commented Nov 27, 2019

@mliu49 The existing unit test checks the value, but the species in the test does not have any resonance structure, so the value did not change with this code.
When I compared the previous code with this code by comparing their estimated solvation energy values, they do change but the changes do not appear to be substantial in general.

I addded a unit test that estimates the solvation energy of a species using its various resonance structures in molecule list and check that the one calculated by the get_solute_data_from_groups method matches the solvation energy of the first molecule

@yunsiechung yunsiechung force-pushed the fix_solvation_GAV branch 2 times, most recently from c20f207 to e6a704f Compare November 27, 2019 15:41
For solute data group additivity, instead of averaging over all
resonance structures in species.molecule list, only get the solute data
estimation for the first item in species.molecule list as it corresponds
to the most stable resonance structure found by gas-phase thermo
estimation.
This heutristic is chosen so the solvation GAV method is consistent with
the gas phase thermo GAV method.
I simplified some lines in the existing solvation unit test
A unit test is added to check that the solute data are estimated
correctly using the GAV for the species with multiple resonance
structures. For the species with resonance structures, the
estimated solute data / solvation energy should match those estimated
for the first item in the species's molecule list.
Copy link
Contributor

@mliu49 mliu49 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I think this looks good.

@mliu49 mliu49 merged commit e8d9795 into master Dec 3, 2019
@mliu49 mliu49 deleted the fix_solvation_GAV branch December 3, 2019 21:33
@mliu49 mliu49 mentioned this pull request Dec 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants