Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Arkane output fixes #1607

Merged
merged 23 commits into from
Jul 27, 2019
Merged

Arkane output fixes #1607

merged 23 commits into from
Jul 27, 2019

Conversation

goldmanm
Copy link
Contributor

Motivation or Problem

Arkane makes calculating rate coefficients easy, but suffers from not outputting all information needed in publications and also sometimes has unexpected behavior due to poor structuring.

Description of Changes

This PR modularizes core methods, like loading statmech, kinetics, and thermo jobs; fixes the --plot option; uses species labels in chemkin files; adds try-except statments to prevent output from erroring; creates methods to parse T1 diagnostics, levels of theory, and geometries; allows for termolecular products to be output by pdep; outputs more quantum information necessary for publications; and makes chemkin output and merge models more robust.

Testing

Many of the new methods have unit tests to ensure proper functioning. I have run this version for large pressure dependent networks and highp kinetics and they function properly. I have not tested the explorer functionality, though the test methods for it continue to work.

Reviewer Tips

Test the explorer functionality. Ensure code makes sense and is written in a usable and expandable way.

@goldmanm goldmanm requested review from alongd and mjohnson541 May 29, 2019 18:32
@alongd
Copy link
Member

alongd commented May 29, 2019

@goldmanm, thanks!
Before the review, could you take a look at the RMG-Tests?

@codecov
Copy link

codecov bot commented May 30, 2019

Codecov Report

Merging #1607 into master will decrease coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1607      +/-   ##
==========================================
- Coverage   41.88%   41.86%   -0.02%     
==========================================
  Files         176      176              
  Lines       29368    29368              
  Branches     6059     6059              
==========================================
- Hits        12301    12296       -5     
- Misses      16188    16192       +4     
- Partials      879      880       +1
Impacted Files Coverage Δ
rmgpy/quantity.py 0% <ø> (ø) ⬆️
rmgpy/data/kinetics/family.py 52.67% <0%> (-0.24%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adfa67b...841c8a0. Read the comment docs.

@goldmanm goldmanm force-pushed the arkane_output_fixes branch from 4e55259 to 82d511c Compare May 30, 2019 17:57
@mliu49
Copy link
Contributor

mliu49 commented Jun 3, 2019

FYI, to re-trigger RMG-tests, restart the travis push build.

Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great contribution!
I really appreciate the time and effort it seems to have taken, thanks for organizing Arkane's behaviour.
Since we recently had a PEP style clean-up in Arkane, I added many styling comments. You could instead use some auto-tool to point these and other things out (and auto-fix them, if you'd like). So overall this PR ended with many comments, but many of them are tiny style-related.

General questions:

  • The rotor loading from text file is being deprecated. I think we should add a method to load it from the csv file instead (or did I miss it?)
  • We'll soon implement 2D / ND rotors. Could you add a placeholder attribute in the csv file to specify the type of rotor, and default to 1D-HR?

arkane/main.py Outdated Show resolved Hide resolved
rmgpy/chemkin.pyx Outdated Show resolved Hide resolved
rmgpy/pdep/configuration.pyx Outdated Show resolved Hide resolved
rmgpy/pdep/configuration.pyx Outdated Show resolved Hide resolved
rmgpy/pdep/configuration.pyx Outdated Show resolved Hide resolved
arkane/statmech.py Outdated Show resolved Hide resolved
rmgpy/tools/merge_models.py Outdated Show resolved Hide resolved
arkane/gaussian.py Outdated Show resolved Hide resolved
arkane/gaussianTest.py Outdated Show resolved Hide resolved
arkane/gaussian.py Outdated Show resolved Hide resolved
@goldmanm goldmanm force-pushed the arkane_output_fixes branch 3 times, most recently from f27be77 to cc0cd59 Compare June 9, 2019 19:47
@goldmanm
Copy link
Contributor Author

goldmanm commented Jun 10, 2019

I am a bit confused about what RMG-tests is telling me. It gives an error on superminimal of

Regression input file not found. Not running a regression test.
The command "./run.sh rmg/superminimal no" exited with 0.

My intuition is that this is saying something errored since it couldn't find the input file for regression testing, and I am confused how my changes in arkane caused this issue. Also when I try superminimal on my own computer, it works as expected.

@mjohnson541 and @mliu49, y'all were the last ones to modify check.sh, so y'all might know what I can do about this error.

@mliu49
Copy link
Contributor

mliu49 commented Jun 10, 2019

It's simply saying that it's not running the regression test (which compares model outputs) because there isn't an input file for it. This is separate from the RMG input file.

The failure is from the liquid oxidation example, which had a problem reading comments from the Chemkin file.

@goldmanm
Copy link
Contributor Author

I don't know why it's liquid oxidation given the error message has 'superminimal' in it's description.

@mliu49
Copy link
Contributor

mliu49 commented Jun 10, 2019

@goldmanm
Copy link
Contributor Author

Thank you so much @mliu49! I'll figure it out now and get it fixed.

@goldmanm goldmanm force-pushed the arkane_output_fixes branch from cc0cd59 to fa91e2e Compare June 10, 2019 01:31
@goldmanm
Copy link
Contributor Author

I think I fixed the issue which dealt with parsing string comments. I took the file generated from the oxidation example, which contained the warning:

! This reaction has length 53 which is 1 longer than Chemkin allows.
! Consider shortening species names to run chemkin.

And tested it on chemkin, which gives no error. I even added 16 more characters, and still no error. I originally got the limit of 52 characters from an RMG post arguing we can only have 10 character long species because the reaction length cannot be longer than 52. I checked the recent version of chemkin documentation on Raineir 'input.pdf' and it gives no such restriction on character length.

Maybe in the last decade or so chemkin has become more flexible at parsing reactions, and it is no longer necessary to have this warning I wrote (or limit chemkin identifiers to 10 characters; we can use the full 16 characters). What do y'all think? I can post the extending chemkin identifiers to 16 characters as an issue and (for this PR) remove the warnings shown above?

@mliu49
Copy link
Contributor

mliu49 commented Jun 10, 2019

We currently have Chemkin 17.0, so the rmgpy.chemkin module is severely outdated...

That aside, it seems like the 16 character limit on species identifiers is still valid. However, reaction strings don't seem to have a limit as you mentioned.

I think it's fine to no longer check reaction line lengths. I'm still not sure how I feel about increasing species identifier lengths past 16 though...

rmgpy/chemkin.pyx Outdated Show resolved Hide resolved
rmgpy/chemkin.pyx Outdated Show resolved Hide resolved
@goldmanm goldmanm force-pushed the arkane_output_fixes branch from fa91e2e to c3c892e Compare June 10, 2019 17:04
@goldmanm
Copy link
Contributor Author

I fixed the suggestions and added 2 more commits (one setting chemkin identifier length to 16 and one creating the --no-plot option)

"""
lines = entry.splitlines()
species = str(lines[0][0:18].split()[0].strip())
species = str(lines[0][0:22].split()[0].strip())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thermo specification has columns 1-16 for the species name and columns 19-24 for a comment.

Was there a reason you chose 22? I think it could probably be safely changed to 24.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to clarify a bit, if we're reading external chemkin files, we should still be able to parse the identifier and comment properly. If we're reading RMG chemkin files where we might have an extra long identifier, we might as well use the full 24 characters. If we're writing chemkin files, using 22 characters already breaks chemkin compatibility, so it won't matter if we use 24. So this can be changed at lines 1470 and 1968 as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. I modified it to allow longer strings here.

@mliu49
Copy link
Contributor

mliu49 commented Jun 10, 2019

Just to double-check, would you like to include this in the release? (I'm currently aiming for Friday.)

If we're not confident, it could be held off until the next one. I think we'll likely need a small release (e.g. 2.4.1) before the Arkane paper anyway, for incoming features/fixes.

Also, seems like there are new issues with RMG-tests related to species identifiers in Cantera...

@goldmanm
Copy link
Contributor Author

I would like for it to get incorporated in this release for my butanol paper, though I don't think it is worth it if all of us are not confident in it.

@goldmanm goldmanm force-pushed the arkane_output_fixes branch from c3c892e to 3d9738c Compare June 11, 2019 01:51
@mliu49
Copy link
Contributor

mliu49 commented Jun 11, 2019

I see. I think we can include it then, once we fix the RMG-tests build.

@mliu49
Copy link
Contributor

mliu49 commented Jun 11, 2019

I looked into why RMG-tests is failing. I think it's simply a bug related to how RMG-tests works.

The tests are failing on the regression test, which is run by an RMG-Py script. The script runs Cantera simulations using RMG-Py wrappers, which involves calling getSpeciesIdentifer at multiple points.

RMG-tests uses the testing version of RMG-Py to run this test (changed recently, but I think it would fail even if we were still using the benchmark version). Thus, when we try to simulate the benchmark model (with old identifiers) and call the new getSpeciesIdentifier function, we get a different identifier which does not match one in the model.

Therefore, I think we can actually ignore the RMG-tests failure for this PR. If everything else is ok, then I think we can merge this.

I only looked at the changes to chemkin.pyx, which I think is good now. @alongd, could you approve based on whether you think the Arkane changes are ok?

@goldmanm
Copy link
Contributor Author

I was thinking that was the issue with RMG-Tests but couldn't find exactly where the comparison was happening. I wonder if the cantera species & molecules are built from the rmg objects, instead of using the cti file, if this would prevent this error from appearing when changing getSpeciesIdentifier.

When I ran the regression test with RMG-Py (using the new version for both, which is redundant, i know), it worked perfectly fine.

@alongd
Copy link
Member

alongd commented Jun 12, 2019

@goldmanm, I'm testing out this branch. Ran the Arkane species Benzyl example, and got:

(rmg_env) alongd@alongd:~/Code/RMG-Py/examples/arkane/species/Benzyl$ python $rmgpy/Arkane.py input.py
Using Theano backend.
Traceback (most recent call last):
  File "/home/alongd/Code/RMG-Py//Arkane.py", line 57, in <module>
    arkane.parseCommandLineArguments()
  File "/home/alongd/Code/RMG-Py/arkane/main.py", line 140, in parseCommandLineArguments
    self.plot = args.plot
AttributeError: 'Namespace' object has no attribute 'plot'

Were you able to run it smoothly?
It's strange, since our tests should run all of Arkane's examples

goldmanm added 5 commits July 27, 2019 01:08
Previously, having termolecular products would break Arkane
since it doesn't properly account for the reverse reaction.
This commit changes the default behavior of Arkane to generate
plots, and integrates this change into the default documentation.
This commit changes the default behavior of a high-p kinetics
job to only draw the PES when the user specifies that they would
like plots, by using the '-p' option when running arkane. This
allows users who want automatically generated graphs to have them,
whereas those that would not like those graphs will not obtain
extra files or run extra code.
This commit refactors arkane thermo output by splitting `save`
into `write_output` and `write_chemkin`, adding try, except
statements on outputs, and passing the output directory instead
of the `output.py` file path to each method.
This commit refactors arkane statmech output by renaming `save`
to `write_output`, adding try/except
statements on outputs, and passing the output directory instead
of the `output.py` file path to each method.
@goldmanm goldmanm force-pushed the arkane_output_fixes branch from 9511243 to b48c0f9 Compare July 27, 2019 05:53
goldmanm added 17 commits July 27, 2019 02:01
This commit refactors arkane kinetics output by splitting `save`
into `write_output`, `write_yaml` and `write_chemkin`, adding try/except
statements on outputs, and passing the output directory instead
of the `output.py` file path to each method.
Previous commit got job.execute to take in the output directory
instead of the `output.py` file. This commit gets commonTest.py
to use this new input style.
This commit allows Log objects to parse out the T1 diagnostic,
so it can be automatically found and returned for Arkane users.

The method was also added to the QchemLog class.
Previously, it was not possible to retrieve and save the
point group used for symmetry calculations in Arkane's
output data for supporting information.

This commit refactors a Log method to also output point
group information, which involved renaming it to the
more general `get_symmetry_properties`.
This commit refactors where the supporting information is saved
to be at the end of StatMechJob.load method instead of partially
in the __init__ method and throughout the load method.

This commit also adds the symmetry, optical isomers,
energies (with and without zero-point energy), T1 diagnostic,
and geometry to the exported csv file.

This commit also prevents from creating the supporting_information.csv
file when no species will be output, simplifying the process for
other users of Arkane.

Try/except statements are added to reduce potential for errors.

This commit renamed the modified e_electronic to e_electronic_with_corrections
to allow for the base electronic energy to be output.

This commit also renames the unused e_0 to e0
This commit prevents plotHinderedRotors from being called
unless the user uses the `--plot` argument. It also has a try
except statement to prevent plotting from stopping a job.
Previously, hindered rotor data was extracted from log data
and saved into the directory where the log file was containted.
This decentralized the information making it more difficult to
compare and plot.

This commit places this data inside `hindered_rotor_scan_data.csv`
for all species and transition states in the arkane run, which
facilitates future usage of the data.
This commit adds methods for Arkane to automatically read
hindered rotor scan information, like which dihedral is being
pivoted, which dihedrals are being fixed, and outputs it to
the hindered rotor csv file for easier analysis.
Previously, plotHinderedRotor was called under the `load` method,
which gave limited flexibility as to where the hindered rotor plot
would be saved and was also unintuitive since it was not in a
method labeled `output`.

This commit refactors the plotHinderedRotor method by separating
it into two separate methods, one for internally storing the data,
and a second for writing the plot to disk. This should reduce
any potential errors in writing from impacting the loading and
creating of a model.

The refactor also saves the plots in the Arkane output folder,
which should be more intuitive for users.
This commit updates Arkane documentation to indicate the label
given species is used in outputs of Arkane, indicate that symmetry
and optical isomers are no longer required inputs, as well as
update the hindered rotor section to indicate that symmetry is not
required as well.
If not provided, it will read from the frequencies file.
Added method to get the D1 diagnostic from coupled cluster
calculations with molpro.
This commit outputs the D1 diagnostic for coupled cluster calculations
in the supporting_information.py when running Arkane calculations.
This commit gets StatmechJob to use getatom_correction
and get_bac, so that the corrected energies
from both methods can be included in future statmech analysis
(for example, not including bond corrections for kinetics
calculations, but including them in thermo calculations).

Since apply_energy_correction
is no longer used, a deprecation warning has been added.
The documentation previously said the order of atoms in the bond
correction labels is NOT important. This is entirely incorrect.
This commit corrected this by copying info from the comments in
applyBondEnergyCorrections.
This commit uses the molecule.enuerate_bonds to generate the bonds
for bond energy corrections if the user did not specify the bonds.
Previously statmech.py used `isinstance(x, GaussianLog, QchemLog, MolproLog))`
when determining if the logfile has the proper methods implemented.
This may cause issues when adding more subclasses, if these lines
are not properly modified. This commit reduces that potential error
by replacing that command with

`isinstance(x, Log) and type(obj).__name__ != 'Log'`
@goldmanm goldmanm force-pushed the arkane_output_fixes branch from b48c0f9 to 841c8a0 Compare July 27, 2019 06:01
@goldmanm
Copy link
Contributor Author

goldmanm commented Jul 27, 2019

@alongd I improved the docstrings for the methods in the thermo, kinetics, and statmech files for their execute, save_output and save_chemkin methods. Examples below:

        Execute the thermodynamics job, saving the results within
        the `output_directory`.

        If `plot` is true, then plots of the raw and fitted values for heat
        capacity, entropy, enthalpy, gibbs free energy, and hindered rotors
        will be saved.
        Save the results of the thermodynamics job to the `output.py` file located
        in `output_directory`.
        Appends the thermo block to `chem.inp` and species name to
        `species_dictionary.txt` within the `outut_directory` specified

Thank you for such extensive help improving this PR.

Copy link
Member

@alongd alongd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks!!

@alongd alongd merged commit 1369f9e into master Jul 27, 2019
@alongd alongd deleted the arkane_output_fixes branch July 27, 2019 15:22
@mliu49 mliu49 mentioned this pull request Dec 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants