Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement recreation of SolutionArray objects from stored data #687

Merged
merged 5 commits into from
Oct 3, 2019

Conversation

ischoegl
Copy link
Member

@ischoegl ischoegl commented Aug 9, 2019

Please fill in the issue number this pull request is fixing:

Fixes #543

Changes proposed in this pull request:

This commit implements two new methods for SolutionArray:

  • restore_data can restore data previously exported by collect_data
  • read_csv restores data previously saved by write_csv
  • read_hdf restores data saved by write_hdf

Intended usage is:

# create a SolutionArray, do some manipulations and save results
states = ct.SolutionArray(gas)
...
states.write_csv('some_file.csv')

and, at a later point

# restore prior work in a new python session
states = ct.SolutionArray(gas)
states.read_csv('some_file.csv')

PS: once PR #680 is merged, an equivalent method for read_hdf will be added to this PR. Done.

@ischoegl ischoegl changed the title Implement recreation of SolutionArray from stored data Implement recreation of SolutionArray objects from stored data Aug 9, 2019
@codecov
Copy link

codecov bot commented Aug 9, 2019

Codecov Report

Merging #687 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #687      +/-   ##
==========================================
- Coverage   70.62%   70.62%   -0.01%     
==========================================
  Files         372      372              
  Lines       43575    43565      -10     
==========================================
- Hits        30777    30768       -9     
+ Misses      12798    12797       -1
Impacted Files Coverage Δ
src/oneD/OneDim.cpp 74.89% <0%> (-0.42%) ⬇️
src/equil/vcs_report.cpp 81.73% <0%> (+0.43%) ⬆️
src/thermo/Elements.cpp 96.2% <0%> (+0.69%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 97356a4...8562a94. Read the comment docs.

@ischoegl ischoegl force-pushed the add-read-to-solution-array branch 5 times, most recently from 0dd6946 to e84ec2d Compare August 11, 2019 17:57
@codecov
Copy link

codecov bot commented Aug 11, 2019

Codecov Report

Merging #687 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #687   +/-   ##
=======================================
  Coverage   70.63%   70.63%           
=======================================
  Files         372      372           
  Lines       43567    43567           
=======================================
  Hits        30773    30773           
  Misses      12794    12794
Impacted Files Coverage Δ
src/transport/GasTransport.cpp 90.58% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5184ebc...e84ec2d. Read the comment docs.

@bryanwweber
Copy link
Member

Moving this discussion here so if that commit is deleted, the discussion isn't lost.

where a user may save T, P, X histories for a phase transition (being able to reconstruct this history is a completely legitimate assumption).

I agree this is a legitimate case.

The proposed setter checks for inconsistent states, and picks the most suitable setter based on whether the state is or is not saturated.

I don't think the setter is correct. If, say, the quality is saved such that it's close to saturated vapor, T and p are not unique compared to the saturated liquid, so using T, p to set the state isn't unique. At least, that's how I'm reading the elif condition.

In view of existing setters for TP, TX and PX (all of which involve implicit assumptions), I feel like allowing for TPX import is a cleaner solution.

I'm not sure what assumptions you mean here. TP is only valid for single-phase states, while TX and PX are only valid for two-phase. I believe TP will raise an exception if T and p are not independent.

In any case, shouldn't it be possible to implement the check in the SolutionArray method? If the phase is a PureFluid, and X is given and is not None, use that plus another independent variable. Or, try using T, P and if it fails, try T, X.

@bryanwweber
Copy link
Member

bryanwweber commented Aug 12, 2019

Here is the rest of the conversation about the commit adding a TPX setter for PureFluid phases:

@bryanwweber:

I'd have to search through the commit history, but I believe this was explicitly removed previously. For pure fluids, t, p, and x overspecify the state.

@ischoegl:

This is true, but prevents an import of saved data being loaded back into a SolutionArray, where a user may save T, P, X histories for a phase transition (being able to reconstruct this history is a completely legitimate assumption). The proposed setter checks for inconsistent states, and picks the most suitable setter based on whether the state is or is not saturated.

It can be said that saving T, D (or other histories) would make this unnecessary. In view of existing setters for TP, TX and PX (all of which involve implicit assumptions), I feel like allowing for TPX import is a cleaner solution.

(Sorry about the close/reopen, I fat-fingered the buttons on my phone)

@bryanwweber bryanwweber reopened this Aug 12, 2019
@ischoegl
Copy link
Member Author

ischoegl commented Aug 12, 2019

I don't think the setter is correct. If, say, the quality is saved such that it's close to saturated vapor, T and p are not unique compared to the saturated liquid, so using T, p to set the state isn't unique. At least, that's how I'm reading the elif condition.

I believe the setter is correct. X is only a legitimate property when conditions are saturated, i.e. under a case when T and P are not uniquely defining the state during a phase change. Conversely, in a single phase T and P are not linked, and X either has to be 0 or 1. I.e., in a single phase, X becomes essentially meaningless as a property. Afaik, no other values are permissible for a simple compressible substance.

Edit: re-reading the scenario, we're actually on the same page. Note that this is only possible if P=P_sat(T), which is when the if condition catches (also: there already is a unit test for this).

In any case, shouldn't it be possible to implement the check in the SolutionArray method? If the phase is a PureFluid, and X is given and is not None, use that plus another independent variable. Or, try using T, P and if it fails, try T, X.

The current implementation uses a single setter once suitable properties are identified, i.e. implementing a TPX setter is much simpler. It would require convoluted logic to handle PureFluid TP/TX setting as a special case.

@ischoegl
Copy link
Member Author

ischoegl commented Aug 12, 2019

Looking over open issues, there are some repercussions for #595, which relocates restoring states to the C++ core. A proper solution would entail defining _full_states (or equivalent) for C++ objects rather than defining them in the Cython layer.

Edit: #644 is another example where setting of states is not handled consistently with phase definitions.

As mentioned, I believe that a list of legitimate setters should be generated within C++. Nevertheless, I do not think that SolutionArray.restore_data needs to defer state restoration to the underlying C++ objects. The data labeling for export/import is defined within Python, and there is no underlying data structure that exists within C++.

@ischoegl ischoegl closed this Aug 13, 2019
@ischoegl ischoegl reopened this Aug 13, 2019
@ischoegl ischoegl force-pushed the add-read-to-solution-array branch 2 times, most recently from 2e609b8 to 7331100 Compare August 13, 2019 19:10
@bryanwweber
Copy link
Member

@ischoegl See #307 and #308 where the TPX was "removed" (more specifically, defined only to have a getter and no setter). I'll have some more thoughts later, but I didn't want to lose those issue numbers! 😄

@ischoegl
Copy link
Member Author

ischoegl commented Sep 4, 2019

To summarize some thoughts on setting states within C++ and the python interface:

  • C++ should have well-defined independent parameters that specify the state; the choice of parameters may differ depending on the type of phase implemented (see save/restore ThermoPhase state assumes density is an independent variable #595)
  • While SolutionArray objects only exist in Python, they do utilize setters that are implemented in the Python layer and tie into C++. I.e. Python setters will have to be aware of what states need to be assigned
  • At the moment, most Python setters make use of ThermoPhase, with PureFluid being the only inheriting class that redefines setters.
  • Creating a more flexible C++ state definition will require a more differentiated approach for Python representations of different phase objects.

Copy link
Member

@speth speth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, this looks very good to me. The set of cases handled is quite comprehensive, and I appreciate the effort put into making any valid combination of properties work for setting the state.

The only one that I question is reintroducing the TPX setter to PureFluid. There are better choices easily available -- the default properties include the best one, temperature and density, so I don't think we need to create a fallback option that mostly works but has potential for weirdness for points that fall within the threshold of the isclose checks.

interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/test/test_thermo.py Outdated Show resolved Hide resolved
Ingmar Schoegl and others added 3 commits September 24, 2019 18:14
* setter for `PureFluid.TPX` property is added to allow for automatic
restoration of consistent thermodynamic data
* alternative to 'TP', 'TX' or 'PX' which may not uniquely describe
a valid thermodynamic state
* add unit test
This commit implements new methods for SolutionArray:
* `restore_data` can restore data previously exported by `collect_data`
* `read_csv` restores data previously saved by `write_csv`
* unit tests are added for SolutionArray's based on ThermoPhase and PureFluid
@ischoegl
Copy link
Member Author

While my initial instinct was that there was a way to get yourself into trouble with the TPX setter, the logic seems pretty solid and I haven't been able to figure out a way to get any surprising behavior out of it.

👍

@bryanwweber suggested a hidden setter, i.e. _TPX. Thoughts?

@bryanwweber
Copy link
Member

@speth Yeah, I was mis-remembering what a None value for a property meant in the setters. My original thought was that None should not set any value for that property, when in fact it is holding that property constant, and T=300, P=200000, and X=0.5 is indeed inconsistent and should raise the error. 🤦‍♂️

I still think the ValueError would be surprising behavior in many cases from the user perspective, even if it is the correct behavior. What about using _TPX as the setter for PureFluids? Then a user won't accidentally use it, but the logic doesn't have to be reimplemented in the restore functions. I guess the question is, is an attribute error better than a value error?

@ischoegl
Copy link
Member Author

ischoegl commented Sep 26, 2019

The hidden setter _TPX is a solution I hadn't thought of ... it won't create much overhead (except for adding a character in the mode string). I don't have a strong preference here though ...

@speth
Copy link
Member

speth commented Sep 26, 2019

Well, the TPX property exists in either case, so there's no way to prevent a user from trying to set it. Currently, they get an AttributeError that doesn't really explain why that value isn't writable, but with the implementation in this PR, they will get a pretty clear description of what is wrong.

@ischoegl
Copy link
Member Author

ischoegl commented Sep 26, 2019

@bryanwweber ...

What's nagging at me is there just doesn't seem to be a good reason for anyone to use TPX for PureFluids aside from the specific use case in this PR of restoring bunches of states.

True. At the same time, any user has to get pretty lucky to 'guess' a valid 3-property state without getting the ValueError that points to what's awry.

Is there any way to detect that the user is calling the setter vs. the functions added here?

Could we resolve this by applying PEP 20 (it is implemented in Python after all?) 😉

@bryanwweber
Copy link
Member

bryanwweber commented Sep 26, 2019

Here are my thoughts with the "hidden" _TPX setter:

  1. The main benefit to me is that it sends a clear signal that TPX is not meant to be set for PureFluid phases. This might help relatively inexperienced users avoid a pitfall and clearly indicates the intent to future developers.
  2. The disadvantage is that any users who do try to set TPX anyways get an AttributeError instead of a possibly more useful ValueError. In addition, it adds another property that would be specific to PureFluid phases.
  3. Since it is specific to PureFluid, it doesn't generalize if, in the future, we decide to allow more complicated EOS (R-K or P-R comes to mind) to have their state set with the mixture quality. I'm not sure if this is something we can or will support, but on my mind.

I did some experimenting this morning trying to do weird things with the regular TPX setter and I also couldn't find any really weird results. One suggestion I have is to do a bit of type checking on the X part of the setting. For instance, I think it would help to improve the error message in the following situation, which I think would be common if people try to set mole fractions of a PureFluid:

In [6]: w.TX = 500, 0.5

In [7]: w.TPX
Out[7]: (500.0, 2634957.1389629007, 0.5)

In [8]: w.TPX = None, None, {'CH4': 1.0}
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-1ecf254b1d00> in <module>
----> 1 w.TPX = None, None, {'CH4': 1.0}

~/GitHub/cantera/interfaces/cython/cantera/thermo.pyx in cantera._cantera.PureFluid.TPX.__set__()
   1578             X = values[2] if values[2] is not None else self.X
   1579             if np.isclose(P, self.thermo.satPressure(T)):
-> 1580                 self.TX = T, X
   1581             elif np.isclose(X, 0.) or np.isclose(X, 1.):
   1582                 self.TP = T, P

~/GitHub/cantera/interfaces/cython/cantera/thermo.pyx in cantera._cantera.PureFluid.TX.__set__()
   1464             T = values[0] if values[0] is not None else self.T
   1465             X = values[1] if values[1] is not None else self.X
-> 1466             self.thermo.setState_Tsat(T, X)
   1467
   1468     property PX:

TypeError: must be real number, not dict

That TypeError is pretty clear, but I think it could be more helpful if TPX checked for a number and raised an error, rather than deferring to TX.

One other suggestion is to print the T, P, and X values in the ValueError message. It's already pretty long, and I don't immediately see a way to shorten it, but I think that would be good too.

@ischoegl
Copy link
Member Author

ischoegl commented Sep 26, 2019

That TypeError is pretty clear, but I think it could be more helpful if TPX checked for a number and raised an error, rather than deferring to TX.

One other suggestion is to print the T, P, and X values in the ValueError message. It's already pretty long, and I don't immediately see a way to shorten it, but I think that would be good too.

Both suggestions are reasonable and can be implemented with little effort. Will get to it later today (which also allows @speth to chime in).

PS: ad other EOS ... the dual-purpose X being used for mole fraction and mixture quality depending on context raises some 'interesting' questions for R-K, etc.. I do not have an opinion beyond this observation at this point (it certainly goes well beyond this PR - and I haven't looked into what is being done in other PR's that are currently implementing these).

@bryanwweber
Copy link
Member

bryanwweber commented Sep 26, 2019

To my understanding, the other EOS types are multi-species capable and X means mole fraction. Given this is the case, it may make sense to use Q for quality at some point in the future. Clearly out of scope for this PR, but just throwing it out there...

@ischoegl
Copy link
Member Author

Given this is the case, it may make sense to use Q for quality at some point in the future. Clearly out of scope for this PR, but just throwing it out there...

👍 that would have been my go-to solution also

@speth
Copy link
Member

speth commented Sep 26, 2019

All of the choices here are nonideal. I don't really like the idea of adding the _TPX property which will only be used in this one scenario. At the same time, that same logic basically applies to adding a setter for TPX.

I'm also starting to think that option 3 doesn't sound so bad, but I'm still OK with option 1 as implemented.

Given this is the case, it may make sense to use Q for quality at some point in the future. Clearly out of scope for this PR, but just throwing it out there...

This would also be consistent with other naming conventions that have been adopted in the Python module, e.g. D for density instead of R for Rho. I think there are additional, larger complexities for the multi-species two-phase phases that are currently unsolved, e.g. how to access the composition of each phase which currently can't be done.

@ischoegl
Copy link
Member Author

ischoegl commented Sep 26, 2019

🤷‍♂️ ... I can go either way. I have a mild preference for 1 (mostly PEP 20 related), but do not have a problem with 3. On the other hand, since the ValueError will appear for anyone not respecting thermodynamics, I feel the facetious urge to leave things in place 😁

Copy link
Member

@bryanwweber bryanwweber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 👍 with the properties as they are here (i.e., option 1), and I've suggested a few small changes here.

interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/composite.py Outdated Show resolved Hide resolved
interfaces/cython/cantera/thermo.pyx Outdated Show resolved Hide resolved
interfaces/cython/cantera/thermo.pyx Show resolved Hide resolved
interfaces/cython/cantera/test/test_purefluid.py Outdated Show resolved Hide resolved
Incorporate reviewer suggestions for error messages (user feedback) in
PureFluid.TPX setter and SolutionArray.restore_data
@speth speth merged commit fe7be79 into Cantera:master Oct 3, 2019
@ischoegl ischoegl deleted the add-read-to-solution-array branch December 16, 2019 15:56
@stijn76
Copy link
Contributor

stijn76 commented Jul 15, 2021

I'm very new to this forum (and to any development forum for that matter), so my apologies in case my remark is not formulated at the right place or in the right way. I was using the latest cantera version, and trying to recreate a SolutionArray objects from stored data, but got a ValueError. The problems does not seem to be solved in the composite.py source code files which I found here at github.

The minimal code to reproduce the problem is:

import cantera as ct

premix = ct.Solution('gri30.yaml')
states = ct.SolutionArray(premix)
test1 = ct.SolutionArray(premix)
test2 = ct.SolutionArray(premix)

T_in = 300
P_in = 101300
air2gas = 7.6
premix.X = 'N2:'+str(air2gas*0.79) + ', O2:'+str(air2gas*0.21) + ', CH4:'+str(1.0)
premix.TP = T_in , P_in
states.append(T=premix.T, P=premix.P, X=premix.X)
states.write_csv('test1.csv', 'X')

# Reading the csv file with only 1 data line, throws an IndexError
# test1.read_csv('test1.csv')

air2gas = 8.3
premix.X = 'N2:'+str(air2gas*0.79) + ', O2:'+str(air2gas*0.21) + ', CH4:'+str(1.0)
premix.TP = T_in , P_in
states.append(T=premix.T, P=premix.P, X=premix.X)
states.write_csv('test2.csv', 'X')

# Reading the csv file with more than 1 data line, throws a ValueError
test2.read_csv('test2.csv')

I believe the problem is situation in de definition of the "join" function in the "composite.py" file. More specifically, within that definition, valid species is declared as:

valid_species = {s[2:]: labels.index(s) for s in spc
                             if s in labels}

but should be declared as

valid_species = {s[:]: labels.index(s) for s in spc
                             if s in labels}

or else the ValueError will always be thrown because valid_species represents species_names without prefixes ('X_', or 'Y_') while all_species represents species_names with those prefixes.

if len(valid_species) != len(all_species):
                    incompatible = list(set(valid_species) ^ set(all_species))
                    raise ValueError('incompatible species information for '
                                    '{}'.format(incompatible))

Warm regards, Stijn

@stijn76
Copy link
Contributor

stijn76 commented Jul 15, 2021

...sorry, the large bold sentences in my previous post were meant to be comments in the python code...

@speth
Copy link
Member

speth commented Jul 15, 2021

I think your post from earlier today in the Cantera Users' Group is the right place to discuss this. For anyone else finding this, the post can be found at https://groups.google.com/g/cantera-users/c/AFa5u3F6JdQ.

@ischoegl
Copy link
Member Author

ischoegl commented Jul 15, 2021

@stijn76 ... I edited your post so it shows up properly (code blocks should be between triple back ticks).

There is a minor bug in your code, as you do not export the full state to csv (only concentrations). But even fixing that, the errors you pointed out persist, and I was able to confirm what you pointed out above.

Interestingly, things mostly work for h2o2.yaml (which is used for unit tests here), although the single line bug still persists. In other words, this works:

import cantera as ct

gas = ct.Solution('h2o2.yaml')  # if changed to gri30.yaml, a ValueError is thrown for read_csv
states2 = ct.SolutionArray(gas)
gas.X = 'H2:0.5, O2:0.4'
gas.TP = 300, 101300
states2.append(T=gas.T, P=gas.P, X=gas.X)
states2.append(T=gas.T, P=gas.P, X=gas.X) # if commented out, this will throw an IndexError for read_csv

states2.write_csv('test3.csv', ('T', 'P', 'X'))
b = ct.SolutionArray(gas)
b.read_csv('test3.csv')

PS: Please create a new issue/bug report, rather than attaching this to this old pull request.

@stijn76
Copy link
Contributor

stijn76 commented Jul 16, 2021

thanks for your help and advise! I've created a new bug report.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SolutionArray save / load
4 participants