Arkane output fixes #1607

goldmanm · 2019-05-29T18:32:51Z

Motivation or Problem

Arkane makes calculating rate coefficients easy, but suffers from not outputting all information needed in publications and also sometimes has unexpected behavior due to poor structuring.

Description of Changes

This PR modularizes core methods, like loading statmech, kinetics, and thermo jobs; fixes the --plot option; uses species labels in chemkin files; adds try-except statments to prevent output from erroring; creates methods to parse T1 diagnostics, levels of theory, and geometries; allows for termolecular products to be output by pdep; outputs more quantum information necessary for publications; and makes chemkin output and merge models more robust.

Testing

Many of the new methods have unit tests to ensure proper functioning. I have run this version for large pressure dependent networks and highp kinetics and they function properly. I have not tested the explorer functionality, though the test methods for it continue to work.

Reviewer Tips

Test the explorer functionality. Ensure code makes sense and is written in a usable and expandable way.

alongd · 2019-05-29T20:13:42Z

@goldmanm, thanks!
Before the review, could you take a look at the RMG-Tests?

codecov · 2019-05-30T03:12:57Z

Codecov Report

Merging #1607 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
- Coverage   41.88%   41.86%   -0.02%     
==========================================
  Files         176      176              
  Lines       29368    29368              
  Branches     6059     6059              
==========================================
- Hits        12301    12296       -5     
- Misses      16188    16192       +4     
- Partials      879      880       +1

Impacted Files	Coverage Δ
rmgpy/quantity.py	`0% <ø> (ø)`	⬆️
rmgpy/data/kinetics/family.py	`52.67% <0%> (-0.24%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adfa67b...841c8a0. Read the comment docs.

mliu49 · 2019-06-03T19:42:18Z

FYI, to re-trigger RMG-tests, restart the travis push build.

alongd

Thanks for this great contribution!
I really appreciate the time and effort it seems to have taken, thanks for organizing Arkane's behaviour.
Since we recently had a PEP style clean-up in Arkane, I added many styling comments. You could instead use some auto-tool to point these and other things out (and auto-fix them, if you'd like). So overall this PR ended with many comments, but many of them are tiny style-related.

General questions:

The rotor loading from text file is being deprecated. I think we should add a method to load it from the csv file instead (or did I miss it?)
We'll soon implement 2D / ND rotors. Could you add a placeholder attribute in the csv file to specify the type of rotor, and default to 1D-HR?

arkane/main.py

rmgpy/chemkin.pyx

rmgpy/pdep/configuration.pyx

arkane/statmech.py

rmgpy/tools/merge_models.py

arkane/gaussian.py

arkane/gaussianTest.py

arkane/gaussian.py

goldmanm · 2019-06-10T00:00:42Z

I am a bit confused about what RMG-tests is telling me. It gives an error on superminimal of

Regression input file not found. Not running a regression test.
The command "./run.sh rmg/superminimal no" exited with 0.

My intuition is that this is saying something errored since it couldn't find the input file for regression testing, and I am confused how my changes in arkane caused this issue. Also when I try superminimal on my own computer, it works as expected.

@mjohnson541 and @mliu49, y'all were the last ones to modify check.sh, so y'all might know what I can do about this error.

mliu49 · 2019-06-10T00:05:07Z

It's simply saying that it's not running the regression test (which compares model outputs) because there isn't an input file for it. This is separate from the RMG input file.

The failure is from the liquid oxidation example, which had a problem reading comments from the Chemkin file.

goldmanm · 2019-06-10T00:10:37Z

I don't know why it's liquid oxidation given the error message has 'superminimal' in it's description.

mliu49 · 2019-06-10T00:18:30Z

The error is on line 1538: https://travis-ci.org/ReactionMechanismGenerator/RMG-tests/jobs/543479094#L1538

goldmanm · 2019-06-10T00:22:04Z

Thank you so much @mliu49! I'll figure it out now and get it fixed.

goldmanm · 2019-06-10T02:08:26Z

I think I fixed the issue which dealt with parsing string comments. I took the file generated from the oxidation example, which contained the warning:

! This reaction has length 53 which is 1 longer than Chemkin allows.
! Consider shortening species names to run chemkin.

And tested it on chemkin, which gives no error. I even added 16 more characters, and still no error. I originally got the limit of 52 characters from an RMG post arguing we can only have 10 character long species because the reaction length cannot be longer than 52. I checked the recent version of chemkin documentation on Raineir 'input.pdf' and it gives no such restriction on character length.

Maybe in the last decade or so chemkin has become more flexible at parsing reactions, and it is no longer necessary to have this warning I wrote (or limit chemkin identifiers to 10 characters; we can use the full 16 characters). What do y'all think? I can post the extending chemkin identifiers to 16 characters as an issue and (for this PR) remove the warnings shown above?

mliu49 · 2019-06-10T15:49:40Z

We currently have Chemkin 17.0, so the rmgpy.chemkin module is severely outdated...

That aside, it seems like the 16 character limit on species identifiers is still valid. However, reaction strings don't seem to have a limit as you mentioned.

I think it's fine to no longer check reaction line lengths. I'm still not sure how I feel about increasing species identifier lengths past 16 though...

rmgpy/chemkin.pyx

goldmanm · 2019-06-10T17:06:57Z

I fixed the suggestions and added 2 more commits (one setting chemkin identifier length to 16 and one creating the --no-plot option)

mliu49 · 2019-06-10T19:34:24Z

rmgpy/chemkin.pyx

    """
    lines = entry.splitlines()
-    species = str(lines[0][0:18].split()[0].strip())
+    species = str(lines[0][0:22].split()[0].strip())


The thermo specification has columns 1-16 for the species name and columns 19-24 for a comment.

Was there a reason you chose 22? I think it could probably be safely changed to 24.

Just to clarify a bit, if we're reading external chemkin files, we should still be able to parse the identifier and comment properly. If we're reading RMG chemkin files where we might have an extra long identifier, we might as well use the full 24 characters. If we're writing chemkin files, using 22 characters already breaks chemkin compatibility, so it won't matter if we use 24. So this can be changed at lines 1470 and 1968 as well.

Good idea. I modified it to allow longer strings here.

mliu49 · 2019-06-10T21:58:35Z

Just to double-check, would you like to include this in the release? (I'm currently aiming for Friday.)

If we're not confident, it could be held off until the next one. I think we'll likely need a small release (e.g. 2.4.1) before the Arkane paper anyway, for incoming features/fixes.

Also, seems like there are new issues with RMG-tests related to species identifiers in Cantera...

goldmanm · 2019-06-11T01:16:49Z

I would like for it to get incorporated in this release for my butanol paper, though I don't think it is worth it if all of us are not confident in it.

mliu49 · 2019-06-11T14:48:19Z

I see. I think we can include it then, once we fix the RMG-tests build.

mliu49 · 2019-06-11T15:48:14Z

I looked into why RMG-tests is failing. I think it's simply a bug related to how RMG-tests works.

The tests are failing on the regression test, which is run by an RMG-Py script. The script runs Cantera simulations using RMG-Py wrappers, which involves calling getSpeciesIdentifer at multiple points.

RMG-tests uses the testing version of RMG-Py to run this test (changed recently, but I think it would fail even if we were still using the benchmark version). Thus, when we try to simulate the benchmark model (with old identifiers) and call the new getSpeciesIdentifier function, we get a different identifier which does not match one in the model.

Therefore, I think we can actually ignore the RMG-tests failure for this PR. If everything else is ok, then I think we can merge this.

I only looked at the changes to chemkin.pyx, which I think is good now. @alongd, could you approve based on whether you think the Arkane changes are ok?

goldmanm · 2019-06-12T12:35:54Z

I was thinking that was the issue with RMG-Tests but couldn't find exactly where the comparison was happening. I wonder if the cantera species & molecules are built from the rmg objects, instead of using the cti file, if this would prevent this error from appearing when changing getSpeciesIdentifier.

When I ran the regression test with RMG-Py (using the new version for both, which is redundant, i know), it worked perfectly fine.

alongd · 2019-06-12T13:00:10Z

@goldmanm, I'm testing out this branch. Ran the Arkane species Benzyl example, and got:

(rmg_env) alongd@alongd:~/Code/RMG-Py/examples/arkane/species/Benzyl$ python $rmgpy/Arkane.py input.py
Using Theano backend.
Traceback (most recent call last):
  File "/home/alongd/Code/RMG-Py//Arkane.py", line 57, in <module>
    arkane.parseCommandLineArguments()
  File "/home/alongd/Code/RMG-Py/arkane/main.py", line 140, in parseCommandLineArguments
    self.plot = args.plot
AttributeError: 'Namespace' object has no attribute 'plot'

Were you able to run it smoothly?
It's strange, since our tests should run all of Arkane's examples

Previously, having termolecular products would break Arkane since it doesn't properly account for the reverse reaction.

This commit changes the default behavior of Arkane to generate plots, and integrates this change into the default documentation.

This commit changes the default behavior of a high-p kinetics job to only draw the PES when the user specifies that they would like plots, by using the '-p' option when running arkane. This allows users who want automatically generated graphs to have them, whereas those that would not like those graphs will not obtain extra files or run extra code.

This commit refactors arkane thermo output by splitting `save` into `write_output` and `write_chemkin`, adding try, except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

This commit refactors arkane statmech output by renaming `save` to `write_output`, adding try/except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

This commit refactors arkane kinetics output by splitting `save` into `write_output`, `write_yaml` and `write_chemkin`, adding try/except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

Previous commit got job.execute to take in the output directory instead of the `output.py` file. This commit gets commonTest.py to use this new input style.

This commit allows Log objects to parse out the T1 diagnostic, so it can be automatically found and returned for Arkane users. The method was also added to the QchemLog class.

Previously, it was not possible to retrieve and save the point group used for symmetry calculations in Arkane's output data for supporting information. This commit refactors a Log method to also output point group information, which involved renaming it to the more general `get_symmetry_properties`.

This commit refactors where the supporting information is saved to be at the end of StatMechJob.load method instead of partially in the __init__ method and throughout the load method. This commit also adds the symmetry, optical isomers, energies (with and without zero-point energy), T1 diagnostic, and geometry to the exported csv file. This commit also prevents from creating the supporting_information.csv file when no species will be output, simplifying the process for other users of Arkane. Try/except statements are added to reduce potential for errors. This commit renamed the modified e_electronic to e_electronic_with_corrections to allow for the base electronic energy to be output. This commit also renames the unused e_0 to e0

This commit prevents plotHinderedRotors from being called unless the user uses the `--plot` argument. It also has a try except statement to prevent plotting from stopping a job.

Previously, hindered rotor data was extracted from log data and saved into the directory where the log file was containted. This decentralized the information making it more difficult to compare and plot. This commit places this data inside `hindered_rotor_scan_data.csv` for all species and transition states in the arkane run, which facilitates future usage of the data.

This commit adds methods for Arkane to automatically read hindered rotor scan information, like which dihedral is being pivoted, which dihedrals are being fixed, and outputs it to the hindered rotor csv file for easier analysis.

Previously, plotHinderedRotor was called under the `load` method, which gave limited flexibility as to where the hindered rotor plot would be saved and was also unintuitive since it was not in a method labeled `output`. This commit refactors the plotHinderedRotor method by separating it into two separate methods, one for internally storing the data, and a second for writing the plot to disk. This should reduce any potential errors in writing from impacting the loading and creating of a model. The refactor also saves the plots in the Arkane output folder, which should be more intuitive for users.

This commit updates Arkane documentation to indicate the label given species is used in outputs of Arkane, indicate that symmetry and optical isomers are no longer required inputs, as well as update the hindered rotor section to indicate that symmetry is not required as well.

If not provided, it will read from the frequencies file.

Added method to get the D1 diagnostic from coupled cluster calculations with molpro.

This commit outputs the D1 diagnostic for coupled cluster calculations in the supporting_information.py when running Arkane calculations.

This commit gets StatmechJob to use getatom_correction and get_bac, so that the corrected energies from both methods can be included in future statmech analysis (for example, not including bond corrections for kinetics calculations, but including them in thermo calculations). Since apply_energy_correction is no longer used, a deprecation warning has been added.

The documentation previously said the order of atoms in the bond correction labels is NOT important. This is entirely incorrect. This commit corrected this by copying info from the comments in applyBondEnergyCorrections.

This commit uses the molecule.enuerate_bonds to generate the bonds for bond energy corrections if the user did not specify the bonds.

Previously statmech.py used `isinstance(x, GaussianLog, QchemLog, MolproLog))` when determining if the logfile has the proper methods implemented. This may cause issues when adding more subclasses, if these lines are not properly modified. This commit reduces that potential error by replacing that command with `isinstance(x, Log) and type(obj).__name__ != 'Log'`

goldmanm · 2019-07-27T06:01:56Z

@alongd I improved the docstrings for the methods in the thermo, kinetics, and statmech files for their execute, save_output and save_chemkin methods. Examples below:

        Execute the thermodynamics job, saving the results within
        the `output_directory`.

        If `plot` is true, then plots of the raw and fitted values for heat
        capacity, entropy, enthalpy, gibbs free energy, and hindered rotors
        will be saved.

        Save the results of the thermodynamics job to the `output.py` file located
        in `output_directory`.

        Appends the thermo block to `chem.inp` and species name to
        `species_dictionary.txt` within the `outut_directory` specified

Thank you for such extensive help improving this PR.

alongd

Great, thanks!!

goldmanm requested review from alongd and mjohnson541 May 29, 2019 18:32

goldmanm force-pushed the arkane_output_fixes branch from 4e55259 to 82d511c Compare May 30, 2019 17:57

alongd reviewed Jun 4, 2019

View reviewed changes

goldmanm force-pushed the arkane_output_fixes branch 3 times, most recently from f27be77 to cc0cd59 Compare June 9, 2019 19:47

goldmanm force-pushed the arkane_output_fixes branch from cc0cd59 to fa91e2e Compare June 10, 2019 01:31

mliu49 reviewed Jun 10, 2019

View reviewed changes

rmgpy/chemkin.pyx Outdated Show resolved Hide resolved

mliu49 reviewed Jun 10, 2019

View reviewed changes

rmgpy/chemkin.pyx Outdated Show resolved Hide resolved

goldmanm force-pushed the arkane_output_fixes branch from fa91e2e to c3c892e Compare June 10, 2019 17:04

mliu49 reviewed Jun 10, 2019

View reviewed changes

goldmanm force-pushed the arkane_output_fixes branch from c3c892e to 3d9738c Compare June 11, 2019 01:51

goldmanm added 5 commits July 27, 2019 01:08

Support termolecular products in Arkane pdep

bb67d46

Previously, having termolecular products would break Arkane since it doesn't properly account for the reverse reaction.

Set default of Arkane to generate plots.

eac22a7

This commit changes the default behavior of Arkane to generate plots, and integrates this change into the default documentation.

Refactor Arkane thermo output

f61c1d9

This commit refactors arkane thermo output by splitting `save` into `write_output` and `write_chemkin`, adding try, except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

Refactor Arkane statmech output

89691a3

This commit refactors arkane statmech output by renaming `save` to `write_output`, adding try/except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

goldmanm force-pushed the arkane_output_fixes branch from 9511243 to b48c0f9 Compare July 27, 2019 05:53

goldmanm added 17 commits July 27, 2019 02:01

Refactor Arkane kinetics output

74eb863

This commit refactors arkane kinetics output by splitting `save` into `write_output`, `write_yaml` and `write_chemkin`, adding try/except statements on outputs, and passing the output directory instead of the `output.py` file path to each method.

Make execute() calls use directory in commonTest

dde36b6

Previous commit got job.execute to take in the output directory instead of the `output.py` file. This commit gets commonTest.py to use this new input style.

Add get_T1_diagnostic method to log file parsers

955533a

This commit allows Log objects to parse out the T1 diagnostic, so it can be automatically found and returned for Arkane users. The method was also added to the QchemLog class.

prevent errors from plotHinderedRotors

6f8a64b

This commit prevents plotHinderedRotors from being called unless the user uses the `--plot` argument. It also has a try except statement to prevent plotting from stopping a job.

Read hindered rotor scan information and output to csv

045ffd3

This commit adds methods for Arkane to automatically read hindered rotor scan information, like which dihedral is being pivoted, which dihedrals are being fixed, and outputs it to the hindered rotor csv file for easier analysis.

Make geometry input for Arkane.statmech optional

cef40c2

If not provided, it will read from the frequencies file.

Add method get_D1_diagnostic

9d9be9b

Added method to get the D1 diagnostic from coupled cluster calculations with molpro.

Output D1 diagnostic for arkane in supporting_info.csv

f7300c3

This commit outputs the D1 diagnostic for coupled cluster calculations in the supporting_information.py when running Arkane calculations.

Correct documentation error for bond correction labels

efa7c5a

The documentation previously said the order of atoms in the bond correction labels is NOT important. This is entirely incorrect. This commit corrected this by copying info from the comments in applyBondEnergyCorrections.

Automatically generate bonds for bond energy corrections

97c31e8

This commit uses the molecule.enuerate_bonds to generate the bonds for bond energy corrections if the user did not specify the bonds.

goldmanm force-pushed the arkane_output_fixes branch from b48c0f9 to 841c8a0 Compare July 27, 2019 06:01

alongd approved these changes Jul 27, 2019

View reviewed changes

alongd merged commit 1369f9e into master Jul 27, 2019

alongd deleted the arkane_output_fixes branch July 27, 2019 15:22

mliu49 mentioned this pull request Nov 26, 2019

RMG v3.0.0 Release Planning #1830

Closed

mliu49 mentioned this pull request Dec 16, 2019

RMG-Py v3.0.0 Release #1852

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Arkane output fixes #1607

Arkane output fixes #1607

goldmanm commented May 29, 2019

alongd commented May 29, 2019

codecov bot commented May 30, 2019 •

edited

Loading

mliu49 commented Jun 3, 2019

alongd left a comment

goldmanm commented Jun 10, 2019 •

edited

Loading

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 Jun 10, 2019

mliu49 Jun 10, 2019

goldmanm Jun 11, 2019

mliu49 commented Jun 10, 2019 •

edited

Loading

goldmanm commented Jun 11, 2019

mliu49 commented Jun 11, 2019

mliu49 commented Jun 11, 2019

goldmanm commented Jun 12, 2019

alongd commented Jun 12, 2019 •

edited

Loading

goldmanm commented Jul 27, 2019 •

edited

Loading

alongd left a comment

Arkane output fixes #1607

Arkane output fixes #1607

Conversation

goldmanm commented May 29, 2019

Motivation or Problem

Description of Changes

Testing

Reviewer Tips

alongd commented May 29, 2019

codecov bot commented May 30, 2019 • edited Loading

Codecov Report

mliu49 commented Jun 3, 2019

alongd left a comment

Choose a reason for hiding this comment

goldmanm commented Jun 10, 2019 • edited Loading

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 commented Jun 10, 2019

goldmanm commented Jun 10, 2019

mliu49 Jun 10, 2019

Choose a reason for hiding this comment

mliu49 Jun 10, 2019

Choose a reason for hiding this comment

goldmanm Jun 11, 2019

Choose a reason for hiding this comment

mliu49 commented Jun 10, 2019 • edited Loading

goldmanm commented Jun 11, 2019

mliu49 commented Jun 11, 2019

mliu49 commented Jun 11, 2019

goldmanm commented Jun 12, 2019

alongd commented Jun 12, 2019 • edited Loading

goldmanm commented Jul 27, 2019 • edited Loading

alongd left a comment

Choose a reason for hiding this comment

codecov bot commented May 30, 2019 •

edited

Loading

goldmanm commented Jun 10, 2019 •

edited

Loading

mliu49 commented Jun 10, 2019 •

edited

Loading

alongd commented Jun 12, 2019 •

edited

Loading

goldmanm commented Jul 27, 2019 •

edited

Loading