Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Couldn't read phase node from XML file when loading NASA cti and xml files packaged with Cantera #910

Closed
mcgeochd opened this issue Aug 8, 2020 · 14 comments
Labels
Input Input parsing and conversion (for example, ck2yaml) question

Comments

@mcgeochd
Copy link

mcgeochd commented Aug 8, 2020

Problem description

Creating a Solution using any of the NASA cti or xml files, nasa, nasa_gas, nasa_condensed, raises:

File "interfaces\cython\cantera\base.pyx", line 29, in cantera._cantera._SolutionBase.__cinit__
File "interfaces\cython\cantera\base.pyx", line 61, in cantera._cantera._SolutionBase._init_cti_xml
ValueError: Couldn't read phase node from XML file

Steps to reproduce

import cantera as ct
ct.Solution('nasa_gas.xml')

Behavior

Running

import cantera as ct
ct.Solution('gri30.xml')

works as expected according to the Python tutorial (https://cantera.org/tutorials/python-tutorial.html). Similarly, using KOH, silane, or water also works, so I would expect to be able to do the same with any of the nasa files that come prepackaged with Cantera on install. I haven't altered any of the files in any way since installing Cantera.

System information

  • Cantera version: 2.4.0
  • OS: Windows 10
  • Python: 3.7.6
  • VS Code: 1.47.3
@bryanwweber
Copy link
Member

Hi and thank you for posting this! This is as-designed. There are no phase-type nodes in the NASA data files, they contain only species thermodynamic information. See this thread on the Google Group for ways to use the data: https://groups.google.com/forum/m/#!searchin/cantera-users/Nasa.xml/cantera-users/NxwemTuK_RI

@bryanwweber bryanwweber added Input Input parsing and conversion (for example, ck2yaml) question labels Aug 9, 2020
@ischoegl
Copy link
Member

ischoegl commented Aug 9, 2020

While I know that this is by design, I also agree that this is not an intuitive behavior. Thus, I am wondering whether it may not make sense to allow for a more flexible loading approach in future versions of the YAML format (CTI and XML are presumably phased out after 2.5 is released). Similarly, I believe it would make sense to allow for loading of Kinetics without phase or species definitions?

Regardless, as this is not a bug, it may make sense to close this issue here and create a feature request over at Cantera/enhancements?

@bryanwweber
Copy link
Member

@ischoegl I agree the behavior is different from some of the other files we distribute, but I don't see it as non-intuitive. This is simply another (less common) use case for input files in Cantera.

What do you mean by a more flexible loading approach? If you mean to include one or several phase definitions, I don't see the point of doing so. I don't think anyone would actually want all of the species in the file involved in their phase, since the types of molecules are so disparate. If we want to define several phases, then we have to get into the question of which species should be included in which phases, how many phases to include, what should their names be, etc. I think it is best to leave it flexible without a phase definition.

What would be the use of a Kinetics object without a ThermoPhase or species? I don't see the similarity to this issue...

Let's continue discussion here for now, rather than splitting the conversation. Feature requests are not forbidden in this repo, particularly if they come out of other issues.

@ischoegl
Copy link
Member

ischoegl commented Aug 9, 2020

@bryanwweber ...

What do you mean by a more flexible loading approach? [...] What would be the use of a Kinetics object without a ThermoPhase or species?

I think my comments here draw on observations from a range of prior discussions and are about the 'big picture'. It ultimately boils down to the current implementation where there is no clear separation of Phase, SpeciesThermo and Kinetics. Forgetting everything about the current implementation, it is conceivable to load a list of species without having to necessarily assign them to a phase; the same holds true for kinetics (see discussion on ct.Reaction.listFromFile in #821). The phase node would then only establish a context as well as link species and reactions.

This boils down to having 4 main components of a Solution object rather than the current three, i.e.

  • Thermo: holds list of Species ... loading does not require definition of phases
  • Kinetics: holds list of Reactions ... loading does not require knowledge of phases / species definitions
  • Transport: no change
  • Phase: picks required species and establishes context for reactions (i.e. initialization of phase happens last; no calculations - TPX getters/setters or reaction rates - are possible without having a phase defined)

An immediate result of this separation would be that YAML files like nasa.yaml could be loaded; likewise, it's conceivable to load a YAML file that only contains reactions (which is currently also not possible). From my perspective, this would be more intuitive to new users (i.e. any YAML file can be loaded as a Solution object), and hopefully clear up some complex interactions in the C++ code base.

Another result would be that handling of model switches would become more flexible/consistent. I.e. on-the-fly switches of thermo models would be straight-forward and could be done the same way as a switch from mixture-averaged to multi-component Transport. An example is Redlich-Kwong vs ideal-gas models in nDodecane_Reitz.yaml.

@bryanwweber
Copy link
Member

@ischoegl I will take some time to consider what you've got here more thoroughly, but I have a few specific questions/thoughts on first reading. I'm focusing on the end-user perspective, since that's where you end up in your second-to-last paragraph. I haven't thought about how this might affect/simplify Cantera's internals, but that is another worthwhile discussion to have.

... Forgetting everything about the current implementation, it is conceivable to load a list of species without having to necessarily assign them to a phase; the same holds true for kinetics (see discussion on ct.Reaction.listFromFile in #821). The phase node would then only establish a context as well as link species and reactions.

I guess this was the use case which was originally intended in the extract_submechanism.py example, i.e., building a Solution out of a bunch of individual parts. It seems like a fine idea, as we discussed on #821, although a little bit complicated by what one wants the user to be able to do with that list of reactions, other than use it as an input to the Solution constructor.

This boils down to having 4 main components of a Solution object rather than the current three, i.e.

* `Thermo`: holds list of `Species` ... loading does not require definition of phases

* `Kinetics`:  holds list of `Reactions` ... loading does not require knowledge of phases / species definitions

* `Transport`: no change

* `Phase`: picks required species and establishes context for reactions (i.e. initialization of phase happens last; no calculations - `TPX` getters/setters or reaction rates - are possible without having a phase defined)

I'm not sure how this would look different than the current situation from the perspective of an end user. Moreover, considering that Transport parameters are usually included on a per-species basis, why is Transport separate from Thermo, in-so-far as Thermo is described here as holding a list of Species?

An immediate result of this separation would be that YAML files like nasa.yaml could be loaded; likewise, it's conceivable to load a YAML file that only contains reactions (which is currently also not possible). From my perspective, this would be more intuitive to new users (i.e. any YAML file can be loaded as a Solution object), and hopefully clear up some complex interactions in the C++ code base.

Leaving aside the changes to Cantera internals, I don't think this would necessarily be simpler from a user perspective. In this scenario, there is only one type of object that the user creates, and they have to keep track of whether or not it is able to calculate reaction rates, or only species thermo information, or whether it can do the whole simulation. In addition, the name Solution is meant in the chemical sense (which you probably know), and implies that it is some sort of mixture of species present, which are able to interact in various ways (diffusion, reacting, etc.). Although this distinction isn't strictly maintained everywhere (see the PureFluid class), using that same object to do all sorts of different things may just change where the non-intuitive behavior happens instead of removing it.

Finally, nasa.yaml can currently be loaded (by Species.listFromFile()), just not by Solution(). As I said in the last paragraph, I don't think it is necessarily an improvement to the user interface to allow only species definitions to be loaded by calling Solution().

Another result would be that handling of model switches would become more flexible/consistent. I.e. on-the-fly switches of thermo models would be straight-forward and could be done the same way as a switch from mixture-averaged to multi-component Transport. An example is Redlich-Kwong vs ideal-gas models in nDodecane_Reitz.yaml.

This would indeed be convenient, but as I said, I haven't thought about how these changes will affect the internals, so I don't have a comment here 😄

@mcgeochd
Copy link
Author

mcgeochd commented Aug 9, 2020

Hi and thank you for posting this! This is as-designed. There are no phase-type nodes in the NASA data files, they contain only species thermodynamic information. See this thread on the Google Group for ways to use the data: https://groups.google.com/forum/m/#!searchin/cantera-users/Nasa.xml/cantera-users/NxwemTuK_RI

Hi, thanks for getting back to me on this. The link above redirects me to: https://groups.google.com/forum/m/#!overview

I also tried:

https://groups.google.com/forum/m/#!searchin/cantera-users/Nasa.xml/cantera-users
https://groups.google.com/forum/m/#!searchin/cantera-users/Nasa.xml
https://groups.google.com/forum/m/#!searchin/cantera-users

All of which also redirect me to: https://groups.google.com/forum/m/#!overview

@ischoegl
Copy link
Member

ischoegl commented Aug 9, 2020

I'm not sure how this would look different than the current situation from the perspective of an end user. Moreover, considering that Transport parameters are usually included on a per-species basis, why is Transport separate from Thermo, in-so-far as Thermo is described here as holding a list of Species?

I think most of the changes would be ‘under the hood’; there are only a handful of situations where the end user would see a difference (nasa.yaml is one of them). Regarding your comment, it is certainly conceivable to merge transport and thermo.

the name Solution is meant in the chemical sense (which you probably know), and implies that it is some sort of mixture of species present, which are able to interact in various ways (diffusion, reacting, etc.). Although this distinction isn't strictly maintained everywhere (see the PureFluid class), using that same object to do all sorts of different things may just change where the non-intuitive behavior happens instead of removing it.

I am aware of the intended meaning of Solution, although it’s not necessarily clear in general, as solution has a more common meaning that can create confusion, especially to non-native speakers.

Regarding purist vs pragmatic usage, I think there are arguments for and against either way. I believe using Solution as a catch-all object for YAML import, which raises proper exceptions would be easiest for end users. I don’t think that it can be assumed that people study the docs prior to using Cantera, and will start with examples. Almost all examples start with creating Solution objects from various input formats and it is non-intuitive that some of the existing input files can be imported while others cannot (see this very issue).

PS: my own interpretation of Solution is that it represents a Phase, although this clashes with the current nomenclature (and would open an entirely new dimension to this discussion).

@ischoegl
Copy link
Member

ischoegl commented Aug 9, 2020

@mcgeochd ... I believe the link is https://groups.google.com/forum/#!topic/cantera-users/NxwemTuK_RI @bryanwweber lists some examples for how to import species in the first response.

@mcgeochd
Copy link
Author

mcgeochd commented Aug 9, 2020

@mcgeochd ... I believe the link is https://groups.google.com/forum/#!topic/cantera-users/NxwemTuK_RI

Thank you, that works now

@bryanwweber
Copy link
Member

@ischoegl thanks, I had some trouble with the link from my phone!

@bryanwweber
Copy link
Member

@ischoegl

Now back at a computer, to respond to your comments 😄

it is non-intuitive that some of the existing input files can be imported while others cannot (see this very issue).

Indeed, in the current implementation this is somewhat confusing. Obviously this issue is not the first time this question has been asked. However, I think that using Solution() to load any input file will lead to a different, but equally as non-intuitive confusion. Namely, people will wonder why they can put gri30.yaml into a Reactor, but not nasa.yaml, since they loaded both with Solution()... So my concern is not that the change would be bad per-se, but that it would just be a different set of the same kind of questions, which is not really progress. Again, this is purely from a user-interface side, and many of the internal changes you suggest may have merit independent of how or whether they change the user interface.

my own interpretation of Solution is that it represents a Phase, although this clashes with the current nomenclature (and would open an entirely new dimension to this discussion).

Without wanting to descend too far down this rabbit hole (language is hard!), Phase (at least when I teach thermo 1) is a single substance + a single state of matter. Even if we relax the single substance part of the definition, we still have the PureFluid class which (although transparent to the user) derives from Solution at the implementation level and allows multiple mechanical phases to exist. This is why I had to go with the awkward phase_of_matter for the attribute, because just calling it phase was ambiguous ☹️

I'd be happy to discuss further, but I think we're sufficiently far afield of this issue (which seems to be resolved for the OP). Since you have somewhat more mature thoughts on the bigger picture here, I'd suggest we move the discussion to Enhancements to talk about the bigger picture.

@mcgeochd Did that post on the Google Group resolve your issue?

@ischoegl
Copy link
Member

@bryanwweber ... yup, I forgot about the PureFluid / phase_of_matter; whenever I teach thermo, I apparently use the same definitions as you do 😉 ... agreed on Enhancements.

@mcgeochd
Copy link
Author

@bryanwweber yep, the link solved this issue, thanks for the help and the insightful discussion @ischoegl

@bryanwweber
Copy link
Member

Thanks @mcgeochd!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Input Input parsing and conversion (for example, ck2yaml) question
Projects
None yet
Development

No branches or pull requests

3 participants