Some arrays need accompanying properties array #282

Open
gold2718 opened this issue Apr 10, 2020 · 28 comments
Labels: capgen, enhancement

Comments

@gold2718
Collaborator

In order to facilitate processing by unrelated suites, arrays of constituents (e.g., a Volume Mixing Ratio (VMR) array) must be accompanied by an array of constituent properties where each element of the properties array contains the properties of the corresponding constituent.
This issue is an extension of the Proposal for Communicating Constituent Properties which contains more information on specific properties but which has been dormant for about 1.5 years.
Some properties of the constituent properties array:

  • The standard name of the constituent properties array is related to the standard name of the associated constituent array. For instance, if the constituent array's standard name is volume_mixing_ratio_array, the standard name of the associated properties array would be properties_of_volume_mixing_ratio_array.
  • The properties for each constituent are defined in that constituent's metadata.
  • The properties for each constituent are collected into a Derived Data Type (DDT) which is created by the CCPP Framework and made available via an element of a public, protected array of objects of this DDT.
  • The elements of the DDT will be private with an "accessor" method for each property.

@AndrewJConley, @mattldawson, @cacraigucar, thoughts?

Some implementation thoughts:

  • I am thinking of implementing the DDT in its own module or perhaps in ccpp_kinds.F90 (since it is a kind).
  • A particular constituent properties array will probably be written either to its own module or to the suite CAP file. It will be initialized at the top of the suite's initialization subroutine (there are still compiler issues with initializing DDT arrays as module data). An advantage of having it in its own module is avoiding possible circular dependency issues, since the array will be defined in one suite but will also be used by a different suite and by the host model.
  • The accessor functions could be used directly but will also probably be accessible as host cap functions (which take a standard name plus an index) to ease use from host models.
  • Direct accessor use by physics schemes will be enabled by having the properties array as an input (via the standard name above).
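
A minimal Fortran sketch of the DDT and properties array described in the two lists above may help make the proposal concrete. Everything named here is an assumption made for illustration (the module name `constituent_props_sketch`, the type `constituent_props_t`, its components, the accessor names, and the molar-mass values); none of it is existing CCPP Framework code. It shows private components with one accessor per property, a `public, protected` array, and initialization inside a subroutine rather than as initialized module data.

```fortran
module constituent_props_sketch
   implicit none
   private

   ! Hypothetical DDT: components are private, one accessor per property
   type, public :: constituent_props_t
      private
      character(len=64) :: std_name = ''
      real              :: molar_mass_val = 0.0   ! g mol-1
      logical           :: advected = .false.
   contains
      procedure :: standard_name
      procedure :: molar_mass
      procedure :: is_advected
   end type constituent_props_t

   ! Readable by any module that uses this one, writable only in here
   type(constituent_props_t), allocatable, public, protected :: &
        properties_of_gas_species_array(:)

   public :: init_gas_species_properties

contains

   ! Stand-in for the top of a suite's initialization subroutine
   subroutine init_gas_species_properties()
      if (allocated(properties_of_gas_species_array)) return
      allocate(properties_of_gas_species_array(2))
      ! Illustrative entries only
      properties_of_gas_species_array(1) = &
           constituent_props_t('ozone_vmr', 47.998, .true.)
      properties_of_gas_species_array(2) = &
           constituent_props_t('co2_mmr', 44.009, .true.)
   end subroutine init_gas_species_properties

   function standard_name(this) result(name)
      class(constituent_props_t), intent(in) :: this
      character(len=:), allocatable :: name
      name = trim(this%std_name)
   end function standard_name

   function molar_mass(this) result(val)
      class(constituent_props_t), intent(in) :: this
      real :: val
      val = this%molar_mass_val
   end function molar_mass

   function is_advected(this) result(flag)
      class(constituent_props_t), intent(in) :: this
      logical :: flag
      flag = this%advected
   end function is_advected

end module constituent_props_sketch

program show_properties
   use constituent_props_sketch
   implicit none
   integer :: i

   call init_gas_species_properties()
   ! Outside the defining module, only the accessors can see the properties
   do i = 1, size(properties_of_gas_species_array)
      print '(a,1x,f7.3,1x,l1)', &
           properties_of_gas_species_array(i)%standard_name(), &
           properties_of_gas_species_array(i)%molar_mass(), &
           properties_of_gas_species_array(i)%is_advected()
   end do
end program show_properties
```

Because the array is `protected`, a scheme or host model that uses the module can read elements and call accessors but cannot assign to the array, which is one way to get the read-only behavior the proposal asks for.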
@gold2718 added the enhancement and capgen labels Apr 10, 2020
@mattldawson

Hi @gold2718 - looks great to me! I think this will help avoid the Sisyphean task of standardizing names of lumped organic species across chemical mechanisms, emissions schemes, photolysis modules, etc. and allow schemes to treat species based on their properties (as opposed to their names) to the extent this is possible.

Having these derived types in separate modules sounds like a reasonable approach.

One very minor thing: when it comes time to name these properties for chemical species could we use something along the lines of chemical_species_volume_mixing_ratio and chemical_species_properties or gas_volume_mixing_ratio and gas_properties? - only because these are properties of the species and not necessarily their volume mixing ratios.

@gold2718
Collaborator Author

Note that this issue is related to, and may subsume #276.

@gold2718 self-assigned this Apr 20, 2020
@gold2718
Collaborator Author

@mattldawson, Thinking about this a bit, I think we could have one array called chemical_species_array which could have some species which are VMR and some which are MMR (or something else such as number concentration). The distinction would be made via the standard name (e.g., ozone_vmr vs. ozone_mmr). The same would be true of a gas_species_array.
Another question is whether we want a water_species_array, which could be useful in computing wet-to-dry and dry-to-wet conversions (it would answer the question of how many water species to include).
Thoughts? @grantfirl, @climbfuji, @AndrewJConley, @cacraigucar?

@mattldawson

Hi @gold2718, So just to make sure I'm understanding, there would be something like:

  !> for explicitly named species...
  real :: ozone_vmr(n_col, n_cell)
  type(chem_prop_t) :: ozone_properties
  real :: no2_mmr(n_col, n_cell)
  type(chem_prop_t) :: no2_properties

  !> ... and for ambiguously named species...
  real :: gas_species_vmr(n_col, n_cell, n_species) ! (or mmr or #/cc or whatever is decided)
  type(chem_prop_t) :: gas_species_properties(n_species)

Is this sort of the idea?

@AndrewJConley

AndrewJConley commented Apr 21, 2020

For the gas-phase chemicals, the chemists work with the number density of all of the species. Number density has a lower dependence on the context of the host model. I.e., to what thing is the ratio constructed in an arbitrary host model? Wet ratio? Dry ratio? Wet ratio excluding condensed material?
Of course, this perspective is particular to the collection of chemical models being developed for NCAR/ACOM.

@AndrewJConley

As far as wet/dry/mass/volume conversions go, those are all scheme/model-specific decisions, since the denominator of any ratio depends on the scheme’s own assumptions.

@gold2718
Collaborator Author

@mattldawson, You can have an explicitly-named species as an element of an array.

[ q ]
  standard_name = gas_species_array
  type = real |   kind = kind_phys
  units = kg/kg moist or dry air depending on type
  dimensions = (horizontal_dimension, vertical_layer_dimension, number_of_chemical_species)
[ q(:,:,index_of_ozone_vmr) ]
  standard_name = ozone_vmr
  type = real |   kind = kind_phys
  units = ppmv # Or whatever units we want here
  dimensions = (horizontal_dimension, vertical_layer_dimension)
[ q(:,:,index_of_co2_mmr) ]
  standard_name = co2_mmr
  type = real |   kind = kind_phys
  units = kg kg-1
  dimensions = (horizontal_dimension, vertical_layer_dimension)

The issue here is what is the type of all the other elements of q which are not listed explicitly? We could require every species to be "named" even if a lot of the standard names are things like standard_name_101_vmr or standard_name_923423829_mmr.

@gold2718
Collaborator Author

@AndrewJConley, that's why I think the water_species_array is an interesting concept. Whoever controls that array controls the definition of 'wet' for the run.

@fvitt
Collaborator

fvitt commented Apr 23, 2020

I agree with @AndrewJConley. Things would be simpler if q is the number density of all the tracers rather than a ratio. If a scheme wants a mixing ratio it can derive it from the number densities, defined however they like.
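
To illustrate the derivation mentioned here, the following is a minimal sketch, assuming number densities in molecules cm-3 and a mole-fraction style mixing ratio. The module and function names are hypothetical, and the choice of denominator (total air, dry air, or something else) is deliberately left to the caller, since that is the scheme-specific decision discussed above.

```fortran
module number_density_sketch
   implicit none
contains

   ! Mixing ratio of one species relative to a caller-chosen reference gas.
   ! Both arguments are number densities in the same units (e.g. molecules cm-3).
   elemental function vmr_from_number_density(n_species, n_reference) result(vmr)
      real, intent(in) :: n_species
      real, intent(in) :: n_reference
      real :: vmr
      vmr = n_species / n_reference
   end function vmr_from_number_density

end module number_density_sketch

program show_vmr
   use number_density_sketch
   implicit none
   ! Illustrative values only, not data from any model
   real :: n_air_total = 2.5e19
   real :: n_water     = 2.5e17
   real :: n_ozone     = 1.0e12
   real :: n_air_dry

   ! A "dry" denominator that excludes water vapor
   n_air_dry = n_air_total - n_water

   print *, 'ozone ratio relative to moist air:', &
        vmr_from_number_density(n_ozone, n_air_total)
   print *, 'ozone ratio relative to dry air:  ', &
        vmr_from_number_density(n_ozone, n_air_dry)
end program show_vmr
```

The same number density yields a different ratio for each denominator, which is the context dependence that carrying number density in q would avoid.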

@fvitt
Collaborator

fvitt commented Apr 23, 2020

It seems to me like we need a collection of constituent objects. Each object could be queried for properties or attributes and state (concentration).

@gold2718
Collaborator Author

Suggestion from @climbfuji:
In addition to chemical_species_array, gas_species_array and water_species_array, we could define arrays such as advected_species_array and other special-purpose arrays which would be known to the framework. This would enable, for example, the host model to specify the advected_species_array in its 'host' metadata and the framework would provide or update that array on each call to a CCPP physics group.

@climbfuji
Collaborator

> Suggestion from @climbfuji:
> In addition to chemical_species_array, gas_species_array and water_species_array, we could define arrays such as advected_species_array and other special-purpose arrays which would be known to the framework. This would enable, for example, the host model to specify the advected_species_array in its 'host' metadata and the framework would provide or update that array on each call to a CCPP physics group.

I followed up with GFDL on this question, and they already have that capability in their dycore. That is, two separate arrays of tracers, q and qdiag (the latter not advected). This capability is just not used in the UFS. So all it takes is to make the connection of those arrays with the CCPP standard names.

@AndrewJConley

@fvitt Regarding constituent information. There is a proposed standard for chemistry mechanisms and chemical information at https://github.com/NCAR/MechanismToCode/tree/master/schemas. What do you think of the information and the way that information is organized? How do you think that would interact with the ccpp-framework?

@AndrewJConley

Regarding the question of "do we want to have a water_species_array"? I can see the need for that among the parameterizations that need to know that data. For example, see the work being done by Peter Lauritzen to clarify thermodynamics in the physics parameterizations of CESM. But for Chemistry, we need the number density of gas-phase species.

@fvitt
Collaborator

fvitt commented Apr 24, 2020

@AndrewJConley the chemical information is provided via JSON format. Will we be allowed to provide the chemical tracer information, such as the species names, at run time when the JSON is read or do we need to list the species names in the metadata for the ccpp-framework at build time? I suppose the JSON file could be read at build time. This is not clear to me.

@AndrewJConley

@fvitt @mattldawson The JSON mechanism specification includes the list of molecule names and the molecule properties (molecular weight, henry's law coefficients, activity coefficients, and is extensible for other properties). The mechanism is to be used run-time. At the moment it is used build-time, but the next version of the solvers will not need the preprocessor. The gas-phase solver needs none of the molecular properties, but the combined gas-aerosol solver will require some molecular properties. The solver requires number density (molecules/cc). Other processes (emissions, wet/dry dep, aqueous chemistry, radiative transfer, etc) require information about the molecules. Name-based choices in algorithms are much more risky than prooperty-based choices. The chemistry solver is being built under the assumption that the host model can provide the number-based composition of the atmosphere (N2, O2, Ar, H2O, O3, etc).

@gold2718
Collaborator Author

@fvitt

> It seems to me like we need a collection of constituent objects. Each object could be queried for properties or attributes and state (concentration).

I would love to see more detail about this proposal. By 'collection of objects', do you mean more than one constituent array? Something besides constituent arrays?
The reason for an array of DDTs carrying constituent properties is to be able to query by index as well as by standard name. Do you see a problem with that?
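
As a rough sketch of what "query by index as well as by standard name" could look like, the function below resolves a standard name to an index once, after which co-indexed data and properties arrays can be addressed by that index. The names `constituent_lookup_sketch` and `index_of_constituent` are hypothetical, and a plain character array stands in for the properties array to keep the example short.

```fortran
module constituent_lookup_sketch
   implicit none
contains

   ! Return the position of `wanted` in `std_names`, or -1 if it is absent
   pure function index_of_constituent(std_names, wanted) result(idx)
      character(len=*), intent(in) :: std_names(:)
      character(len=*), intent(in) :: wanted
      integer :: idx
      integer :: i

      idx = -1
      do i = 1, size(std_names)
         if (trim(std_names(i)) == trim(wanted)) then
            idx = i
            exit
         end if
      end do
   end function index_of_constituent

end module constituent_lookup_sketch

program show_lookup
   use constituent_lookup_sketch
   implicit none
   ! Stand-in for the standard names held by a properties array
   character(len=32) :: names(3)
   real :: q(4, 5, 3)   ! (horizontal, vertical, species), dummy data
   integer :: idx

   names = [character(len=32) :: 'ozone_vmr', 'co2_mmr', 'no2_mmr']
   q = 0.0

   ! Look up by standard name once, then use the index for the data array
   idx = index_of_constituent(names, 'co2_mmr')
   if (idx > 0) then
      print *, 'co2_mmr is species', idx, 'with', size(q(:, :, idx)), 'values'
   end if
end program show_lookup
```

A scheme that does not care about names can instead loop over all indices and select species by their properties.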

@gold2718
Collaborator Author

> The chemistry solver is being built under the assumption that the host model can provide the number-based composition of the atmosphere (N2, O2, Ar, H2O, O3, etc).

The MI in MICM is supposed to stand for model independent so this assumption seems ill founded. I think it is better to think of chemistry as specifying its needs and what it supplies.

@gold2718
Collaborator Author

> Name-based choices in algorithms are much more risky than prooperty[sic]-based choices

I agree that properties are important. However, I am not sure there are enough properties for an unrelated component to guess that the species in question is ozone. The name really helps!

Properties can be set at run time. If the owner (creator) of the array sets the protected = True metadata attribute, that will prevent any other module from changing any property.

@AndrewJConley

@gold2718

> The MI in MICM is supposed to stand for model independent so this assumption seems ill founded. I think it is better to think of chemistry as specifying its needs and what it supplies

This raises the question of where chemistry begins and ends. Is that a CCPP_framework decision or CESM decision?

@AndrewJConley

@gold2718 As we have discussed, the physics parameterizations have listed the "standard physical gas-phase species". But I'm still a bit confused. Where does the wet-removal data for Ozone come from? It is "chemistry" or the JSON data. It seems more flexible to be a run-time option than a hard-coded registry entry.

@AndrewJConley

@fvitt @gold2718 At the moment our proposed standard allows run-time representation of heterogeneous data for (for example) wet removal. Bases and acids have different types of data to initialize those methods. To throw out an idea, perhaps there could be a number-density-of-gas-phase-species array and a co-indexed properties-of-number-density-of-gas-phase-species array. The properties-of array would be tiny in size in comparison to the gas-phase composition array itself, but the properties-of array would support considerable flexibility and extensibility in terms of its schema (including the protection status listed above).

@gold2718
Collaborator Author

gold2718 commented May 4, 2020

> > The MI in MICM is supposed to stand for model independent so this assumption seems ill founded. I think it is better to think of chemistry as specifying its needs and what it supplies
>
> This raises the question of where chemistry begins and ends. Is that a CCPP_framework decision or CESM decision?

Neither has or needs full control here. Chemistry specifies its needs and what it supplies. The host model specifies what it supplies. Any overlap will be recognized by the framework (assuming the standard name is documented in chemistry) and the framework will take care of proper data handling.

@AndrewJConley

> > > The MI in MICM is supposed to stand for model independent so this assumption seems ill founded. I think it is better to think of chemistry as specifying its needs and what it supplies
> >
> > This raises the question of where chemistry begins and ends. Is that a CCPP_framework decision or CESM decision?
>
> Neither has or needs full control here. Chemistry specifies its needs and what it supplies. The host model specifies what it supplies. Any overlap will be recognized by the framework (assuming the standard name is documented in chemistry) and the framework will take care of proper data handling.

I'm not asking about the interface between the host and the scheme, but instead whether the radiative transfer for photolysis is part of chemistry or some other suite.

@gold2718
Collaborator Author

gold2718 commented May 5, 2020

@AndrewJConley

> But I'm still a bit confused. Where does the wet-removal data for Ozone come from? It is "chemistry" or the JSON data.

First, your term, "chemistry", is vague; I'm not sure what you mean.
Second, no one (outside of the MICM developers and maybe users) cares what your input data format is. Please do not focus on input data formats. Let's focus on three areas: code (e.g., chemistry code), metadata, and run-time data.
Third, where does the wet-removal data for ozone come from now? It is not in master_gas_wetdep_list.xml.

In general, I would expect wet-removal data to get into the model in a fashion similar to how it does in CAM today -- that is, as run-time information (see my answer to an earlier question by @fvitt below).

@gold2718
Collaborator Author

gold2718 commented May 5, 2020

@fvitt:

> Will we be allowed to provide the chemical tracer information, such as the species names, at run time when the JSON is read or do we need to list the species names in the metadata for the ccpp-framework at build time? I suppose the JSON file could be read at build time.

I think that both are possible; the choice is going to depend on what the chemistry design ends up looking like. Recent discussions seem to be leaning in the direction of fairly little (if any) code generation with mostly run-time configuration of processes, species, and reactions. In that scenario, it makes sense (to me anyway) that the configuration data would be passed to the chemistry suite at init time (possibly in the form of a file path name). The chemistry init routine would then populate the species property array or arrays with data now contained in namelists such as wetdep_inparm.
Helpful?

@gold2718
Collaborator Author

gold2718 commented May 5, 2020

> I'm not asking about the interface between the host and the scheme, but instead whether the radiative transfer for photolysis is part of chemistry or some other suite.

This sounds like an issue for the SIMA steering committee, not the CCPP framework. Am I missing something?

@gold2718
Collaborator Author

gold2718 commented May 5, 2020

> To throw out an idea, perhaps there could be a number-density-of-gas-phase-species array and a co-indexed properties-of-number-density-of-gas-phase-species array.

I am not sure what is new about this thought. To me, it looks like exactly what is being proposed in this issue. What am I missing?
