-
-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enable modification of user input data fields in Python #1031
Conversation
Codecov Report
@@ Coverage Diff @@
## main #1031 +/- ##
=======================================
Coverage 72.59% 72.60%
=======================================
Files 356 356
Lines 46508 46508
=======================================
+ Hits 33763 33765 +2
+ Misses 12745 12743 -2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have some thoughts later this week.
OK ... I mostly have a philosophical question about the implementation, which boils down to what When I first saw the term Going forward, I can see two options for how users interact with objects:
The PR as it stands is (1), and is a legitimate 'solution'; I mainly want to make sure that there may not be a potential use case for (2) as this would be the time to start. The implementation would be quite simple, where all that is needed is a second |
It occurred to me recently that with the addition of YAML serialization, even the phrase "input file" is a little bit weird, since now you can output an input file. I'm certainly open to suggestions for alternative terminology here, as I do see why the name I don't quite know what a potential use case for (2) would be, but as you said, this information can be retrieved by loading another |
Re "input file" vs. some other name, I've been using "mechanism file" in my own code, which I think works. The rebuttal question that immediately comes to mind, though: even if we are "outputting" a YAML file, is is only useful/a thing we care about to the extent that it will one day be imported as an "input" for some other, future simulation/numerical experiment? I actually think the answer is "no - it has other uses." i.e if you are fitting experimental data and adjusting the mechanism details on the fly, those rates, thermo values, etc, would be the result of the fitting, and something you would report out. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @speth. I don't have any strong opinions about the naming distinction, since as @decaluwe points out, these files can be used in many different contexts. "Mechanism file" implies to me that it involves some reaction mechanism, which also may not always be the case. "Input file" seems to be the best choice to me, but no strong opinion.
I do have several questions here about the implementation, probably because this is my first time really digging into the details. So if my questions/suggestions don't make any sense, please take them in the "please teach me something" spirit 😄
Clear any user-provided auxillary input data, so that the data given by | ||
`input_data` or `write_yaml` will only include values generated by Cantera. | ||
""" | ||
self.thermo.input().clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see how this distinguishes between data that the user has added using update_user_data
and the data coming from the YAML file that was used to create the Solution
. Can you clarify that for me? Thanks 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't distinguish - both are erased here. But once the object's been constructed, Cantera doesn't need a pristine copy of the data from the input file, so clearing this doesn't have any noticeable effect besides removing the auxiliary data, since the necessary fields will be calculated through the calls to getParameters
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in principle, this does not support true round-tripping of an input file, right? If all the data are cleared, and any data computed by Cantera is not quite identical to the original file, the resulting generated file wouldn't be identical?
This may be a pedantic thing, but since the docstring says one thing, and the code actually does something else, I foresee potential problems as a result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I think we're getting pretty far off the purpose of this PR, as this basically all about features that were introduced in #984 if not earlier, but I can try to provide a bit of an explanation.
We don't keep track of how an object has changed since it was instantiated from the input file, so any time you ask for the input_data
or call write_yaml
, that's being calculated on the fly. I don't think there's anything that promises to precisely round-trip an input file, and there are currently a variety of ways in which that isn't currently possible. Some of these will probably be addressed relatively soon, like the serialization of the file-level metadata (as requested in #1013), while others may not (like retaining comments in YAML files). The only promise that I think we should be trying to make regarding serialization is that you can generate an input file that will recreate an object in its current state, and I think that is currently the case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in principle, this does not support true round-tripping of an input file, right? If all the data are cleared, and any data computed by Cantera is not quite identical to the original file, the resulting generated file wouldn't be identical?
This may be a pedantic thing, but since the docstring says one thing, and the code actually does something else, I foresee potential problems as a result.
This is quite similar to what I had mentioned in one of my comments above (what does input data mean?). Either path forward is fine; I believe round-trip conversion would be simple to implement as stated earlier:
The implementation would be quite simple, where all that is needed is a second
AnyMap
member variable that holds information added after the objects are instantiated, plus some tweaks for logic in some of the getters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine if we consider changing/discussing the round-trip behavior out of scope for this PR. That said, I'd prefer if this docstring were a little more accurate to reflect what's actually happening. Since this very explicitly says that user-provided data will be cleared, it implies that's the only thing that's cleared. Maybe something like:
Clear auxiliary input data, including user-provided values. When
write_yaml
orinput_data
are called, Cantera will regenerate the input data set to reproduce the current object state.
I'm not sure if this is accurate though; I'm thinking of the case where a user may have manually, e.g., changed a species thermo entry (for example, to debug a discontinuous polynomial) and that data will be written out instead of the data that was originally used to create the Solution()
. Is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had been using "auxiliary" to mean data that is not used by Cantera, so there is no auxiliary input data that isn't user-provided. I guess this would be more precise:
Clear all saved input data, so that the data given by
input_data
orwrite_yaml
will only include values generated by Cantera based on the current object state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds great to me! Thanks!
Data specified this way is stored when writing to YAML files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @speth!
Changes proposed in this pull request
update_user_data
andclear_user_data
methods to Python objects with aninput_data
propertyThis allows custom data to be stored when writing to YAML files. Implementation of this was, as expected, 🍰 with the
dict_to_anymap
method already in place thanks to #1014.Checklist
scons build
&scons test
) and unit tests address code coverage