Scaling zpe #1619

alongd · 2019-06-07T03:02:41Z

Motivation or Problem

Previously we used the frequency scaling factor to also scale the zero point energy correction in Arkane. Principally, these should be two different factors. Luckily, there's a simple ratio between them, a factor of 1.014 (+/- 0.002, see https://pubs.acs.org/doi/10.1021/ct100326h section 3.1.3) - largely independent of the model chemistry. So only of the factors one should be known, the second can be easily determined. This erroneous behaviour causes a systematic error of about 1.5 kJ/mol(!) - or 0.35 kcal/mol - to H298 (see below).

Thanks @dranasinghe for helping me understand what I'm doing :)

Description of Changes

Dividing the freq scaling factor by 1.014 to obtain the ZPEC scaling factor when scaling the ZPEC.

Here are some species I ran (purposely selected ones with no torsions) prior and after this change at ccsd(t)-f12/cc-pvtz-f12//wb97xd/6-311++g(d,p), energies are in kJ/mol:

Species	H298, master	H298, this branch	H298, ATcT
CH4	-71.95	-73.56	-74.525 +/- 0.056
NH3	-41.81	-43.06	-45.558 +/- 0.030
CO2	-385.07	-385.49	-393.475 +/- 0.015
C2H4	55.97	54.13	52.39 +/- 0.12
C2H2	232.61	231.63	228.27 +/- 0.13

Before the changes we had an average deviation from ATcT values of 4.5 kJ/mol (for these species at this level), now it is 3.3 kJ/mol (not accounting for CO2, the deviation went down from 3.6 kJ/mol to 2.1 kJ/mol). If these species could be representative, the implication is that we're improving Arkane's H298 by about 1.2 to 1.5 kJ/mol, at least for this level.

cgrambow

Before I comment on all lines where you made changes, I'll ask some questions first.

Your second commit messages implies that E0 should always contain ZPE. I generally agree, but E0 is just an arbitrary symbol that one can choose however one would like to choose it. What prompted you to make this change?

Can I propose that you change ZPEC everywhere you use it to ZPE? I wouldn't really call ZPE a "correction"; it just is.

Have you verified that all frequency scale factors hard-coded in Arkane are actual frequency scale factors and not ZPE scale factors?

codecov · 2019-06-07T04:37:00Z

Codecov Report

Merging #1619 into master will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1619      +/-   ##
==========================================
+ Coverage   41.47%   41.49%   +0.01%     
==========================================
  Files         176      176              
  Lines       29314    29314              
  Branches     6035     6035              
==========================================
+ Hits        12158    12163       +5     
+ Misses      16303    16299       -4     
+ Partials      853      852       -1

Impacted Files	Coverage Δ
rmgpy/data/kinetics/family.py	`52.9% <0%> (+0.23%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 12c16bf...20b0d3e. Read the comment docs.

alongd · 2019-06-07T15:09:54Z

Thanks for the insightful Q's, it already helped (I re-organized the code a bit)

The terminology for E0 is take from both a conversation with Bill as well as from Arkane's documentation (http://reactionmechanismgenerator.github.io/RMG-Py/users/arkane/input.html#option-2-directly-enter-molecular-properties). Our code wasn't consistent with this terminology. I also modified the documentation a bit more in this PR.

Yes, I agree. I changed ZPEC back to ZPE.

I just went through all scaling factors, we indeed had a few that were actually the ZPE factors. I also added a few more, and now they are all consistent. Note that the CBS-QB3 scaling factor changed as well. Surprisingly, I did not observe any changes in H298 before and after changing it (which is "good" but strange). I'll check it a bit more.

alongd · 2019-06-08T14:10:11Z

An interesting observation:

From https://cccbdb.nist.gov/vibscalejust.asp we have the following frequencies scaling factors:

'ccsd(t)/aug-cc-pvdz': 0.963,
'ccsd(t)/aug-cc-pvtz': 0.970,
'ccsd(t)/aug-cc-pvqz': 0.975,

But from https://comp.chem.umn.edu/freqscale/version3b2.htm we have:

'ccsd(t)/aug-cc-pvtz': 1.001,

In this PR I mixed both, with was probably not a great idea, but the data shows significant inconsistencies... Perhaps eventually we'd like to just refit all the scaling factor after implementing the Truhlar algorithm in ARC (tag @dranasinghe).

dranasinghe · 2019-06-08T15:53:35Z

I looked at CCCBDB, they use frequency scaling instead of ZPE as Truhlar did. I have my doubts regarding the accuracy of cccbdb results; I find it strange that ccsd(t)/tz ,qz results are off by ~.0.96-097. I expect them to be much closer to 0.99. further, if you take a look at the section "Poor results" there are lots of bad frequency calculations. I agree with the idea that we should refit the frequency scaling factors with Truhlar's algorithm. Furthermore, we should use ZPE to fit frequency scaling, if you use frequencies to fit scaling factors we need to worry about the symmetry of the vibrational mode; calculated vibrational modes might have different symmetry labels than the experiment. We can avoid all of it by fitting to ZPE. Question is should we use 15 molecules Truhlar used or expand the test set? Truhlar uses two test sets one with 9 molecules and other with 15 looks like both give similar results. So I think we will be fine if you just use the same test set. I will implement Truhlar's method in ARC this weekend. Duminda

…

On Sat, Jun 8, 2019 at 10:10 AM Alon Grinberg Dana ***@***.***> wrote: An interesting observation: From https://cccbdb.nist.gov/vibscalejust.asp we have the following frequencies scaling factors: 'ccsd(t)/aug-cc-pvdz': 0.963,'ccsd(t)/aug-cc-pvtz': 0.970,'ccsd(t)/aug-cc-pvqz': 0.975, But from https://comp.chem.umn.edu/freqscale/version3b2.htm we have: 'ccsd(t)/aug-cc-pvtz': 1.001, In this PR I mixed both, with was probably not a great idea, but the data shows significant inconsistencies... Perhaps eventually we'd like to just refit all the scaling factor after implementing the Truhlar algorithm in ARC (tag @dranasinghe <https://github.com/dranasinghe>). — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1619?email_source=notifications&email_token=AF763KKCAPS7GSKVHFJVO2DPZO4ULA5CNFSM4HVPFD72YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODXHVKUY#issuecomment-500127059>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AF763KOWYOP5D7N4SX336CTPZO4ULANCNFSM4HVPFD7Q> .

alongd · 2019-06-13T19:10:30Z

@cgrambow , could you take a look?

cgrambow

Looks good to me. Is the plan to wait with merging until @dranasinghe derives new parameters or do we go ahead as is?

cgrambow · 2019-06-13T21:36:01Z

arkane/gaussian.py

                elif '\\ZeroPoint=' in line:
                    line = line.strip() + f.readline().strip()
                    start = line.find('\\ZeroPoint=') + 11
                    end = line.find('\\', start)
-                    scaledZPE = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor
+                    scaled_zpe = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor


You're passing your newly defined zpe_scale_factor to this method in statmech.py but here we're removing the scaled ZPE from the composite E0. Are we sure that should be the ZPE scale factor and not the frequency scale factor?

Or maybe the more important question is, why are we actually doing this? As far as I can tell, \ZeroPoint gets printed in the archive string at the end of Gaussian jobs, but if it's a composite method, the last occurrence of ZPE will be from the E(ZPE) string instead. Therefore, I am wondering if the ZPE as defined on line 288 ever gets used. Do you know? If it does get used, then why are we multiplying it by the scale factor? Shouldn't it be the same as the statement on line 283?

Yes, we should use the zpe scale factor here. It makes sense, since we're scaling the parsed ZPE, not freqs.

Here's an example from a CBS-QB3 run for SO2OO:
E(ZPE)= 0.014845
\ZeroPoint=0.0149945\

So we get:
0.014845 / 0.0149945 = 0.9900 (which is the ZPE scale factor of CBS-QB3, see https://doi.org/10.1063/1.477924 on p. 2826 just before the conclusions)

I guess this answers the last questions re why we don't multiply by the scale factor on line 283, but we do on line 288.

I also think we don't have to have both lines 283 and 288, but it seems more robust (I don't know if one of these sometimes does not gets printed)

Thanks for the explanation. This clears up my confusion.

cgrambow · 2019-06-13T21:39:52Z

arkane/statmech.py

-            elif energy[2] == 'E0-ZPE':
-                energy_temp = Quantity(energy[:2])
-                E0_withZPE = energy_temp.value_si  # in J/mol
+            if energy[2].lower() in ['e_elect', 'e_electronic', 'electronic_energy']:


Can we try to make the naming consistent so that there aren't three different variable names that all correspond to the same thing?

cgrambow · 2019-06-13T21:41:52Z

arkane/statmech.py

            else:
-                E0 = E0 * constants.E_h * constants.Na  # Hartree/particle to J/mol
+                e_electronic = e_electronic * constants.E_h * constants.Na  # convert Hartree/particle into J/mol


Can shorten to e_electronic *= constants.E_h * constants.Na

alongd

Thanks for your comments! They were addressed in fixups.
We shouldn't wait for better values, I think this PR has an important bug fix as pointed out by @dranasinghe. Once merged, users can theoretically pass their own freq scaling factors, and we'll finally treat them right. The values we have could (and will) be improved, but getting new values shouldn't delay merging this.

alongd · 2019-06-13T23:50:42Z

arkane/gaussian.py

                elif '\\ZeroPoint=' in line:
                    line = line.strip() + f.readline().strip()
                    start = line.find('\\ZeroPoint=') + 11
                    end = line.find('\\', start)
-                    scaledZPE = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor
+                    scaled_zpe = float(line[start:end]) * constants.E_h * constants.Na * frequencyScaleFactor


Yes, we should use the zpe scale factor here. It makes sense, since we're scaling the parsed ZPE, not freqs.

Here's an example from a CBS-QB3 run for SO2OO:
E(ZPE)= 0.014845
\ZeroPoint=0.0149945\

So we get:
0.014845 / 0.0149945 = 0.9900 (which is the ZPE scale factor of CBS-QB3, see https://doi.org/10.1063/1.477924 on p. 2826 just before the conclusions)

I guess this answers the last questions re why we don't multiply by the scale factor on line 283, but we do on line 288.

I also think we don't have to have both lines 283 and 288, but it seems more robust (I don't know if one of these sometimes does not gets printed)

alongd · 2019-06-14T00:21:30Z

arkane/statmech.py

-            elif energy[2] == 'E0-ZPE':
-                energy_temp = Quantity(energy[:2])
-                E0_withZPE = energy_temp.value_si  # in J/mol
+            if energy[2].lower() in ['e_elect', 'e_electronic', 'electronic_energy']:


alongd · 2019-06-14T00:22:12Z

arkane/statmech.py

            else:
-                E0 = E0 * constants.E_h * constants.Na  # Hartree/particle to J/mol
+                e_electronic = e_electronic * constants.E_h * constants.Na  # convert Hartree/particle into J/mol


cgrambow · 2019-06-14T16:13:19Z

arkane/statmech.py

+                 'ccsd(t)/aug-cc-pvtz': 1.001,  # [3]
+                 'ccsd(t)/aug-cc-pvqz': 0.975,  # [2]
+                 'ccsd(t)/cc-pv(t+d)z': 0.965,  # [2]
+                 'ccsd(t)-f12/cc-pvdz-f12': 0.979,  # [2], taken as 'CCSD(T)/cc-pVDZ'


You can update this line and the remaining lines using the newest version of Truhlar's database. It has data for CCSD(T)-F12/cc-pVDZ-F12, CCSD(T)-F12/cc-pVTZ-F12 (already have this), and CCSD(T)-F12/cc-pVQZ-F12 (two different versions but they're super close to each other). The CCSD(T)-F12/aug-cc-pVnZ scale factors can be based on the CCSD(T)-F12/cc-pVnZ-F12 factors because the cc-pVnZ-F12 and aug-cc-pVnZ basis sets are very similar to each other.

alongd · 2019-06-14T19:39:24Z

Thanks! Let me know when to squash

alongd · 2019-06-14T20:20:16Z

Squashed. Should be good to go!

The 1.014 factor represents the relationship between the harmonic frequencies scaling factor and the zero point energy scaling factor, see https://pubs.acs.org/doi/10.1021/ct100326h Section 3.1.3.

Previously we used E0 for the electronic energy, and E0_withZPE for the corrected energy. The updated terminology is E0 = e_electronic + ZPE (E0 includes the ZPE)

- Corrected scaling factors which were for ZPE instead of freq - Organized the references - Removed commented out levels - Added some more levels (one of which calculated by Duminda, as noted)

alongd · 2019-06-17T13:46:26Z

@cgrambow, rebased again. Should be good to go

alongd added the Arkane label Jun 7, 2019

alongd requested a review from cgrambow June 7, 2019 03:02

alongd self-assigned this Jun 7, 2019

cgrambow reviewed Jun 7, 2019

View reviewed changes

alongd force-pushed the scaling_zpe branch from 25a0f8c to 7dd98bb Compare June 7, 2019 15:02

alongd force-pushed the scaling_zpe branch from 7dd98bb to 459c445 Compare June 8, 2019 02:42

alongd force-pushed the scaling_zpe branch from 459c445 to 41e87d4 Compare June 13, 2019 19:09

cgrambow requested changes Jun 13, 2019

View reviewed changes

alongd commented Jun 14, 2019

View reviewed changes

cgrambow reviewed Jun 14, 2019

View reviewed changes

alongd force-pushed the scaling_zpe branch from 9d937c9 to 1fac484 Compare June 14, 2019 19:39

alongd force-pushed the scaling_zpe branch from 1fac484 to e5e0184 Compare June 14, 2019 20:17

alongd force-pushed the scaling_zpe branch 2 times, most recently from 6238ddf to 6c13ab7 Compare June 16, 2019 00:23

alongd added 5 commits June 15, 2019 21:03

Changed opening files to "with open() as f" in Arkane Logs

5448290

Implement ZPE scaling factor

ce8707a

The 1.014 factor represents the relationship between the harmonic frequencies scaling factor and the zero point energy scaling factor, see https://pubs.acs.org/doi/10.1021/ct100326h Section 3.1.3.

Corrected the energy (e_elect, zpe, and E0) terminology in Arkane

2c9d0c8

Previously we used E0 for the electronic energy, and E0_withZPE for the corrected energy. The updated terminology is E0 = e_electronic + ZPE (E0 includes the ZPE)

Updated the documentation to reflect the E0 parsing terminology

bc242dc

Updated the frequency scale factor assignemt

20b0d3e

- Corrected scaling factors which were for ZPE instead of freq - Organized the references - Removed commented out levels - Added some more levels (one of which calculated by Duminda, as noted)

alongd force-pushed the scaling_zpe branch from 6c13ab7 to 20b0d3e Compare June 16, 2019 01:03

cgrambow approved these changes Jun 17, 2019

View reviewed changes

cgrambow merged commit d9205b5 into master Jun 17, 2019

cgrambow deleted the scaling_zpe branch June 17, 2019 14:39

mliu49 mentioned this pull request Jul 23, 2019

RMG-Py v2.4.1 release #1664

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scaling zpe #1619

Scaling zpe #1619

alongd commented Jun 7, 2019

cgrambow left a comment

codecov bot commented Jun 7, 2019 •

edited

Loading

alongd commented Jun 7, 2019

alongd commented Jun 8, 2019

dranasinghe commented Jun 8, 2019 via email

alongd commented Jun 13, 2019

cgrambow left a comment

cgrambow Jun 13, 2019

alongd Jun 13, 2019

cgrambow Jun 14, 2019

cgrambow Jun 13, 2019

alongd Jun 14, 2019

cgrambow Jun 13, 2019

alongd Jun 14, 2019

alongd left a comment

alongd Jun 13, 2019

alongd Jun 14, 2019

alongd Jun 14, 2019

cgrambow Jun 14, 2019

alongd commented Jun 14, 2019

alongd commented Jun 14, 2019

alongd commented Jun 17, 2019 •

edited

Loading

Scaling zpe #1619

Scaling zpe #1619

Conversation

alongd commented Jun 7, 2019

Motivation or Problem

Description of Changes

cgrambow left a comment

Choose a reason for hiding this comment

codecov bot commented Jun 7, 2019 • edited Loading

Codecov Report

alongd commented Jun 7, 2019

alongd commented Jun 8, 2019

dranasinghe commented Jun 8, 2019 via email

alongd commented Jun 13, 2019

cgrambow left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alongd left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

alongd commented Jun 14, 2019

alongd commented Jun 14, 2019

alongd commented Jun 17, 2019 • edited Loading

codecov bot commented Jun 7, 2019 •

edited

Loading

alongd commented Jun 17, 2019 •

edited

Loading