Merge model improvements #1649

goldmanm · 2019-07-11T02:00:56Z

Motivation or Problem

merge_models has some unexpected behavior, such as throwing divide-by-zero errors when empty models are given and renaming the species in a Chemkin model.

Description of Changes

This PR cleans up some of the workings of the merge_models tool

Refactored the execute method to call two separate methods that perform different tasks, which allows for better unittesting.
Added a unittest to ensure proper behavior.
Prevented print statements from causing divide-by-zero errors for models without any reactions :)
Minimized reindexing to only those species in conflict.
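
As a rough illustration of the divide-by-zero fix listed above, a percentage printed in a summary line can be guarded when a model has no reactions. The print statements in merge_models.py are not shown in this PR page, so the helper name `percent_shared` and the message format below are hypothetical.

```python
def percent_shared(n_common, n_total):
    """Return the percentage of items in common, or 0.0 for an empty model.

    Hypothetical helper: it only illustrates guarding the division.
    """
    if n_total == 0:
        # An empty model has nothing in common; avoid dividing by zero.
        return 0.0
    return 100.0 * n_common / n_total


# An empty model no longer stops the summary from printing.
print('{0:.1f}% of reactions in common'.format(percent_shared(0, 0)))
print('{0:.1f}% of reactions in common'.format(percent_shared(1, 4)))
```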

Testing

Haha. I ran the new unittest. :) I also ran merge_models on the minimal and superminimal examples.

Reviewer Tips

Run merge_models on two of your favorite RMG-generated models :)

mliu49

I made a couple of comments. Could you also rebase this onto master without all of the commits from the arkane output fixes?

mliu49 · 2019-07-12T17:22:13Z

rmgpy/tools/merge_models.py

+    # ensure no species with same name and index
+    label_index_dict = {}
+    for s in final_model.species:
+        if s.label not in label_index_dict.keys():


This can just be if s.label not in label_index_dict. No need to call keys().

good to know! Just made the code simpler!
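
A quick check of the point above: a membership test on a dict already looks at its keys, so both forms give the same answer. The dictionary contents here are made up for illustration.

```python
# Illustrative data only: label -> list of indices seen so far.
label_index_dict = {'CH3': [15]}

# Membership on the dict itself tests the keys directly.
direct = 'CH3' in label_index_dict
via_keys = 'CH3' in label_index_dict.keys()
missing = 'H2O' in label_index_dict

print(direct, via_keys, missing)
```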

mliu49 · 2019-07-12T17:24:11Z

rmgpy/tools/merge_models.py

+        else:
+            if s.index in label_index_dict[s.label]:
+                # obtained a duplicate
+                s.index = max(label_index_dict[s.label]) + 1


Do we want to have unique indices? In other words, do we want to reindex starting from the largest index in the entire model + 1?

This code isn't guaranteeing unique indexes at all. To do that, we would pretty much have to renumber everything except the first model, which would make this PR much less useful.

From my point of view, if the user starts with an un-indexed mechanism, like AramcoMech, the output should also not have indices.

Is there a benefit to unique indices that I've overlooked? My main goal for this PR was to not have any Chemkin or Cantera strings conflict, while keeping those names as similar as possible to the input names.

To give an example of how this PR behaves: if model 1 has CH3(15) and model 2 has H2O(15), then neither of their names would be changed, since the label and index do not both match (only the index does).

I would think that for many users, avoiding changes to Chemkin or Cantera names would be preferable to ensuring every species has a unique number.

Let me know your thoughts on this
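
The behavior described above can be sketched as follows. This is a minimal sketch built around the diff fragments quoted in this thread; the `Species` stand-in class, the function name `resolve_conflicts`, and the lines that store and append indices are assumptions, not the code in merge_models.py.

```python
class Species:
    """Minimal stand-in for an RMG species: only a label and an index."""

    def __init__(self, label, index):
        self.label = label
        self.index = index


def resolve_conflicts(species_list):
    """Reindex a species only when its (label, index) pair is already taken."""
    label_index_dict = {}
    for s in species_list:
        if s.label not in label_index_dict:
            # First time this label appears: keep its index untouched.
            label_index_dict[s.label] = [s.index]
        else:
            if s.index in label_index_dict[s.label]:
                # Same label and same index: bump past the largest index
                # already used for this label.
                s.index = max(label_index_dict[s.label]) + 1
            label_index_dict[s.label].append(s.index)
    return species_list


merged = resolve_conflicts([
    Species('CH3', 15),  # from model 1
    Species('H2O', 15),  # from model 2: same index, different label
    Species('CH3', 15),  # from model 2: same label and index, a conflict
])
for s in merged:
    print('{0}({1})'.format(s.label, s.index))
```

In this sketch CH3(15) and H2O(15) keep their names, and only the second CH3(15) is renumbered.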

Ah, I forgot that we would also need to check for species which already have the same indices.

I think it would only matter if you were merging RMG models and wanted to load the merged model back into RMG, which would parse the indices from the labels. I'm not sure exactly what effect duplicate indices would have in that scenario though. It might not actually matter.

I'm ok with the behavior you implemented. I think the key desired outcome is that the resulting merged mechanism file is directly simulate-able in Chemkin or Cantera.
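
For context on parsing indices from labels, here is a minimal sketch of splitting a name such as CH3(15) into a label and an index. The function `split_label` and its regular expression are hypothetical and are not RMG's actual parser; they only show why two species with the same label and index would produce the same string.

```python
import re

# Hypothetical pattern: a label followed by a trailing "(digits)" group.
_INDEXED_NAME = re.compile(r'^(.*)\((\d+)\)$')


def split_label(name):
    """Split 'CH3(15)' into ('CH3', 15); return (name, None) if un-indexed."""
    match = _INDEXED_NAME.match(name)
    if match is None:
        return name, None
    return match.group(1), int(match.group(2))


print(split_label('CH3(15)'))
print(split_label('H2O'))
```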

mliu49 · 2019-07-12T17:24:50Z

rmgpy/rmg/model.py

+        #for spec in finalModel.species:
+        #    if spec.label not in ['Ar','N2','Ne','He']:
+        #        spec.index = speciesIndex + 1
+        #        speciesIndex += 1


Since you implemented a different indexing system, I would just remove this section instead of commenting it out.

codecov · 2019-07-16T02:02:05Z

Codecov Report

Merging #1649 into master will increase coverage by 0.2%.
The diff coverage is 82.85%.

@@            Coverage Diff            @@
##           master    #1649     +/-   ##
=========================================
+ Coverage   41.68%   41.89%   +0.2%     
=========================================
  Files         176      176             
  Lines       29341    29354     +13     
  Branches     6049     6052      +3     
=========================================
+ Hits        12232    12299     +67     
+ Misses      16239    16179     -60     
- Partials      870      876      +6

Impacted Files	Coverage Δ
rmgpy/rmg/model.py	`41.7% <ø> (+3.42%)`	⬆️
rmgpy/tools/merge_models.py	`46.91% <82.85%> (+32.62%)`	⬆️
rmgpy/data/kinetics/family.py	`52.9% <0%> (+0.23%)`	⬆️
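
The totals in the coverage diff above can be cross-checked by hand: hits, misses, and partials sum to the line count, and hits divided by lines gives the coverage. The sketch below assumes the percentage is truncated (not rounded) to two decimals, which is inferred from the reported numbers rather than from Codecov's documentation.

```python
import math


def coverage_percent(hits, misses, partials):
    """Return (total lines, coverage % truncated to two decimals)."""
    lines = hits + misses + partials
    # Truncation is an assumption that happens to reproduce the report.
    percent = math.floor(10000 * hits / lines) / 100
    return lines, percent


master = coverage_percent(12232, 16239, 870)
pr = coverage_percent(12299, 16179, 876)
print(master, pr)
```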

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 6274d70...a4f0320.

This commit replaces a complete reindex for merged models with a smart (as in 'smart speaker') algorithm that only reindexes a species when it shares both its label and its index with another species. This prevents conflicts and reduces unnecessary reindexing.

alongd requested a review from mliu49 July 11, 2019 18:40

mliu49 reviewed Jul 12, 2019

goldmanm force-pushed the merge_model_improvements branch from 0d32c24 to 3e4272a Compare July 16, 2019 02:02

goldmanm force-pushed the merge_model_improvements branch 2 times, most recently from 87f2160 to eb995cf Compare July 24, 2019 18:02

goldmanm added 4 commits July 24, 2019 15:06

Improve merge_model robustness

b6de3d5

Previously merge_models would error if no species or reactions found. This commit prevents the divide by zero error from stopping merge_models.

Refactor merge_model

490fe75

This commit refactors merge_model to separate out the merging and saving of the model so that it can be effectively unittested.

Add unittests for merge_model

0c5c468

This commit adds unittests to test merge_model functionality.

goldmanm force-pushed the merge_model_improvements branch from eb995cf to a4f0320 Compare July 24, 2019 19:06

mliu49 approved these changes Jul 24, 2019

mliu49 merged commit 2664c50 into master Jul 24, 2019

mliu49 deleted the merge_model_improvements branch July 24, 2019 20:01

mliu49 mentioned this pull request Nov 26, 2019

RMG v3.0.0 Release Planning #1830

Closed

mliu49 mentioned this pull request Dec 16, 2019

RMG-Py v3.0.0 Release #1852

Merged

kspieks mentioned this pull request Apr 25, 2020

Fix indexing while merging models #1933

Merged
