Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scaling zpe #1619

Merged
merged 5 commits into from
Jun 17, 2019
Merged

Scaling zpe #1619

merged 5 commits into from
Jun 17, 2019

Conversation

alongd
Copy link
Member

@alongd alongd commented Jun 7, 2019

Motivation or Problem

Previously we used the frequency scaling factor to also scale the zero point energy correction in Arkane. Principally, these should be two different factors. Luckily, there's a simple ratio between them, a factor of 1.014 (+/- 0.002, see https://pubs.acs.org/doi/10.1021/ct100326h section 3.1.3) - largely independent of the model chemistry. So only of the factors one should be known, the second can be easily determined. This erroneous behaviour causes a systematic error of about 1.5 kJ/mol(!) - or 0.35 kcal/mol - to H298 (see below).

Thanks @dranasinghe for helping me understand what I'm doing :)

Description of Changes

Dividing the freq scaling factor by 1.014 to obtain the ZPEC scaling factor when scaling the ZPEC.

Here are some species I ran (purposely selected ones with no torsions) prior and after this change at ccsd(t)-f12/cc-pvtz-f12//wb97xd/6-311++g(d,p), energies are in kJ/mol:

Species H298, master H298, this branch H298, ATcT
CH4 -71.95 -73.56 -74.525 +/- 0.056
NH3 -41.81 -43.06 -45.558 +/- 0.030
CO2 -385.07 -385.49 -393.475 +/- 0.015
C2H4 55.97 54.13 52.39 +/- 0.12
C2H2 232.61 231.63 228.27 +/- 0.13

Before the changes we had an average deviation from ATcT values of 4.5 kJ/mol (for these species at this level), now it is 3.3 kJ/mol (not accounting for CO2, the deviation went down from 3.6 kJ/mol to 2.1 kJ/mol). If these species could be representative, the implication is that we're improving Arkane's H298 by about 1.2 to 1.5 kJ/mol, at least for this level.

@alongd alongd added the Arkane label Jun 7, 2019
@alongd alongd requested a review from cgrambow June 7, 2019 03:02
@alongd alongd self-assigned this Jun 7, 2019
Copy link

@cgrambow cgrambow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before I comment on all lines where you made changes, I'll ask some questions first.

Your second commit messages implies that E0 should always contain ZPE. I generally agree, but E0 is just an arbitrary symbol that one can choose however one would like to choose it. What prompted you to make this change?

Can I propose that you change ZPEC everywhere you use it to ZPE? I wouldn't really call ZPE a "correction"; it just is.

Have you verified that all frequency scale factors hard-coded in Arkane are actual frequency scale factors and not ZPE scale factors?

@codecov
Copy link

codecov bot commented Jun 7, 2019

Codecov Report

Merging #1619 into master will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1619      +/-   ##
==========================================
+ Coverage   41.47%   41.49%   +0.01%     
==========================================
  Files         176      176              
  Lines       29314    29314              
  Branches     6035     6035              
==========================================
+ Hits        12158    12163       +5     
+ Misses      16303    16299       -4     
+ Partials      853      852       -1
Impacted Files Coverage Δ
rmgpy/data/kinetics/family.py 52.9% <0%> (+0.23%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12c16bf...20b0d3e. Read the comment docs.

@alongd
Copy link
Member Author

alongd commented Jun 7, 2019

Thanks for the insightful Q's, it already helped (I re-organized the code a bit)

The terminology for E0 is take from both a conversation with Bill as well as from Arkane's documentation (http://reactionmechanismgenerator.github.io/RMG-Py/users/arkane/input.html#option-2-directly-enter-molecular-properties). Our code wasn't consistent with this terminology. I also modified the documentation a bit more in this PR.

Yes, I agree. I changed ZPEC back to ZPE.

I just went through all scaling factors, we indeed had a few that were actually the ZPE factors. I also added a few more, and now they are all consistent. Note that the CBS-QB3 scaling factor changed as well. Surprisingly, I did not observe any changes in H298 before and after changing it (which is "good" but strange). I'll check it a bit more.

@alongd
Copy link
Member Author

alongd commented Jun 8, 2019

An interesting observation:

From https://cccbdb.nist.gov/vibscalejust.asp we have the following frequencies scaling factors:

'ccsd(t)/aug-cc-pvdz': 0.963,
'ccsd(t)/aug-cc-pvtz': 0.970,
'ccsd(t)/aug-cc-pvqz': 0.975,

But from https://comp.chem.umn.edu/freqscale/version3b2.htm we have:

'ccsd(t)/aug-cc-pvtz': 1.001,

In this PR I mixed both, with was probably not a great idea, but the data shows significant inconsistencies... Perhaps eventually we'd like to just refit all the scaling factor after implementing the Truhlar algorithm in ARC (tag @dranasinghe).

@dranasinghe
Copy link
Contributor

dranasinghe commented Jun 8, 2019 via email

@alongd
Copy link
Member Author

alongd commented Jun 13, 2019

@cgrambow , could you take a look?

Copy link

@cgrambow cgrambow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Is the plan to wait with merging until @dranasinghe derives new parameters or do we go ahead as is?

elif '\\ZeroPoint=' in line:
line = line.strip() + f.readline().strip()
start = line.find('\\ZeroPoint=') + 11
end = line.find('\\', start)
scaledZPE = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor
scaled_zpe = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're passing your newly defined zpe_scale_factor to this method in statmech.py but here we're removing the scaled ZPE from the composite E0. Are we sure that should be the ZPE scale factor and not the frequency scale factor?

Or maybe the more important question is, why are we actually doing this? As far as I can tell, \ZeroPoint gets printed in the archive string at the end of Gaussian jobs, but if it's a composite method, the last occurrence of ZPE will be from the E(ZPE) string instead. Therefore, I am wondering if the ZPE as defined on line 288 ever gets used. Do you know? If it does get used, then why are we multiplying it by the scale factor? Shouldn't it be the same as the statement on line 283?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should use the zpe scale factor here. It makes sense, since we're scaling the parsed ZPE, not freqs.

Here's an example from a CBS-QB3 run for SO2OO:
E(ZPE)= 0.014845
\ZeroPoint=0.0149945\

So we get:
0.014845 / 0.0149945 = 0.9900 (which is the ZPE scale factor of CBS-QB3, see https://doi.org/10.1063/1.477924 on p. 2826 just before the conclusions)

I guess this answers the last questions re why we don't multiply by the scale factor on line 283, but we do on line 288.

I also think we don't have to have both lines 283 and 288, but it seems more robust (I don't know if one of these sometimes does not gets printed)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. This clears up my confusion.

elif energy[2] == 'E0-ZPE':
energy_temp = Quantity(energy[:2])
E0_withZPE = energy_temp.value_si # in J/mol
if energy[2].lower() in ['e_elect', 'e_electronic', 'electronic_energy']:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we try to make the naming consistent so that there aren't three different variable names that all correspond to the same thing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

else:
E0 = E0 * constants.E_h * constants.Na # Hartree/particle to J/mol
e_electronic = e_electronic * constants.E_h * constants.Na # convert Hartree/particle into J/mol

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can shorten to e_electronic *= constants.E_h * constants.Na

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member Author

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comments! They were addressed in fixups.
We shouldn't wait for better values, I think this PR has an important bug fix as pointed out by @dranasinghe. Once merged, users can theoretically pass their own freq scaling factors, and we'll finally treat them right. The values we have could (and will) be improved, but getting new values shouldn't delay merging this.

elif '\\ZeroPoint=' in line:
line = line.strip() + f.readline().strip()
start = line.find('\\ZeroPoint=') + 11
end = line.find('\\', start)
scaledZPE = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor
scaled_zpe = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should use the zpe scale factor here. It makes sense, since we're scaling the parsed ZPE, not freqs.

Here's an example from a CBS-QB3 run for SO2OO:
E(ZPE)= 0.014845
\ZeroPoint=0.0149945\

So we get:
0.014845 / 0.0149945 = 0.9900 (which is the ZPE scale factor of CBS-QB3, see https://doi.org/10.1063/1.477924 on p. 2826 just before the conclusions)

I guess this answers the last questions re why we don't multiply by the scale factor on line 283, but we do on line 288.

I also think we don't have to have both lines 283 and 288, but it seems more robust (I don't know if one of these sometimes does not gets printed)

elif energy[2] == 'E0-ZPE':
energy_temp = Quantity(energy[:2])
E0_withZPE = energy_temp.value_si # in J/mol
if energy[2].lower() in ['e_elect', 'e_electronic', 'electronic_energy']:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure

else:
E0 = E0 * constants.E_h * constants.Na # Hartree/particle to J/mol
e_electronic = e_electronic * constants.E_h * constants.Na # convert Hartree/particle into J/mol
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

'ccsd(t)/aug-cc-pvtz': 1.001, # [3]
'ccsd(t)/aug-cc-pvqz': 0.975, # [2]
'ccsd(t)/cc-pv(t+d)z': 0.965, # [2]
'ccsd(t)-f12/cc-pvdz-f12': 0.979, # [2], taken as 'CCSD(T)/cc-pVDZ'

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can update this line and the remaining lines using the newest version of Truhlar's database. It has data for CCSD(T)-F12/cc-pVDZ-F12, CCSD(T)-F12/cc-pVTZ-F12 (already have this), and CCSD(T)-F12/cc-pVQZ-F12 (two different versions but they're super close to each other). The CCSD(T)-F12/aug-cc-pVnZ scale factors can be based on the CCSD(T)-F12/cc-pVnZ-F12 factors because the cc-pVnZ-F12 and aug-cc-pVnZ basis sets are very similar to each other.

@alongd
Copy link
Member Author

alongd commented Jun 14, 2019

Thanks! Let me know when to squash

@alongd
Copy link
Member Author

alongd commented Jun 14, 2019

Squashed. Should be good to go!

@alongd alongd force-pushed the scaling_zpe branch 2 times, most recently from 6238ddf to 6c13ab7 Compare June 16, 2019 00:23
alongd added 5 commits June 15, 2019 21:03
The 1.014 factor represents the relationship between the harmonic
frequencies scaling factor and the zero point energy scaling factor,
see https://pubs.acs.org/doi/10.1021/ct100326h Section 3.1.3.
Previously we used E0 for the electronic energy, and E0_withZPE for the
corrected energy.
The updated terminology is E0 = e_electronic + ZPE
(E0 includes the ZPE)
- Corrected scaling factors which were for ZPE instead of freq
- Organized the references
- Removed commented out levels
- Added some more levels (one of which calculated by Duminda, as noted)
@alongd
Copy link
Member Author

alongd commented Jun 17, 2019

@cgrambow, rebased again. Should be good to go

@cgrambow cgrambow merged commit d9205b5 into master Jun 17, 2019
@cgrambow cgrambow deleted the scaling_zpe branch June 17, 2019 14:39
@mliu49 mliu49 mentioned this pull request Jul 23, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants