Align FieldArray with Python xarray and netCDF data model #48

Closed
sjdaines opened this issue Apr 28, 2023 · 2 comments · Fixed by #107

@sjdaines

PALEOmodel.FieldArray is intended to work in the same way as Python's xarray, which itself uses the data model from the netCDF CF conventions. Currently there are some key differences:

  • Dimensions vs Coordinates. This is the major difference: these do not currently follow the xarray/CF data model. We do want to separate 'dimension' and 'coordinate' clearly, in order to allow different coordinate systems (pressure vs height, for example). However, it's not clear that a single 'dimension coordinate' (xarray) / 'coordinate variable' (CF) with the same name as a dimension is what we want, as that makes it more complicated to connect, e.g., cell centre and cell face coordinates. See the xarray discussion "Terminology for the various coordinates" (pydata/xarray#1295). It might be clearer to attach a list of 'preferred' coordinates to named dimensions, instead of using the xarray/CF naming convention.
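
The 'preferred coordinates' idea can be sketched as follows. This is a minimal illustration in Python, not the PALEOboxes or PALEOmodel implementation (which is Julia): the `Coordinate` class, the `preferred_coordinates` helper, and the coordinate names `zmid`, `zlower`, `zupper` and `pmid` are all hypothetical. Only the notion of a named dimension carrying a list of coordinate names comes from the discussion above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class NamedDimension:
    """A named dimension plus the names of its 'preferred' coordinates (illustrative)."""
    name: str
    size: int
    coords: List[str] = field(default_factory=list)


@dataclass
class Coordinate:
    """Coordinate values defined along one or more named dimensions (hypothetical)."""
    name: str
    dims: Tuple[str, ...]
    values: List[float]


def preferred_coordinates(dim: NamedDimension,
                          coordinates: Dict[str, Coordinate]) -> List[Coordinate]:
    """Look up the coordinates that `dim` lists as preferred, keeping their order."""
    return [coordinates[n] for n in dim.coords if n in coordinates]


# Several coordinates share the dimension "z"; none of them has to be named "z".
coordinates = {
    "zmid": Coordinate("zmid", ("z",), [-5.0, -15.0, -25.0]),
    "zlower": Coordinate("zlower", ("z",), [-10.0, -20.0, -30.0]),
    "zupper": Coordinate("zupper", ("z",), [0.0, -10.0, -20.0]),
    "pmid": Coordinate("pmid", ("z",), [1.5e5, 2.5e5, 3.5e5]),  # alternative system
}

# The dimension names its cell-centre and cell-face coordinates explicitly.
z = NamedDimension("z", 3, coords=["zmid", "zlower", "zupper"])
names = [c.name for c in preferred_coordinates(z, coordinates)]
```

In this sketch the cell-centre and cell-face coordinates are linked through the dimension's `coords` list rather than through a name match, while `pmid` remains available as an alternative coordinate on the same dimension.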
sjdaines added a commit to PALEOtoolkit/PALEOboxes.jl that referenced this issue May 4, 2023
See PALEOtoolkit/PALEOmodel.jl#48

Initial test updating NamedDimension so that `coords` is just
a vector of names of preferred coordinates.

NB: shows that there are no tests for get_region in this package.
@sjdaines

sjdaines commented May 4, 2023

This will need changes in both PALEOboxes and PALEOmodel, to update the abstractions used for grids, dimensions and coordinates, and to define more clearly the data model used to store model output internally (PALEOmodel.OutputWriters.OutputMemory) and on disk (currently just a serialisation of the internal format using JLD2).

Currently the internal (and hence serialized external) format for model output uses the NamedDimension and AbstractGrid implementation. So the first priority has to be to implement a netCDF data format for storing model output on disk, so that the internal implementation can be updated without affecting existing work-in-progress that relies on saved output. Then the data model for grids/dimensions/coordinates and the representation of model output can be updated.

One design choice that should then be reconsidered is how to represent grid information internally: via an abstract interface (as at present), or via an explicit lower-level data model (as netCDF does). Currently PALEOboxes grids implement an interface, and don't expose dimensions etc. in a low-level, generic way. Instead there are three API calls:

  • internal_size used to define dimensions for arrays in CellSpace
  • create_default_cellrange used to create CellRange with appropriate indices
  • get_region used to take slices and return appropriate data and coordinates in an xarray-like container for output analysis
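
The contrast between the two designs can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the PALEOboxes API (which is Julia): the method signatures, the `ColumnGrid` class, and the `explicit_data_model` helper are hypothetical, and only the three call names are taken from the list above.

```python
from abc import ABC, abstractmethod


class AbstractGrid(ABC):
    """Interface style: callers never see dimensions directly (signatures assumed)."""

    @abstractmethod
    def internal_size(self):
        """Shape of the arrays that hold one value per cell."""

    @abstractmethod
    def create_default_cellrange(self):
        """Indices covering every cell in the grid."""

    @abstractmethod
    def get_region(self, values, cell=None):
        """Slice `values`, returning the data and the matching coordinates."""


class ColumnGrid(AbstractGrid):
    """Hypothetical 1-D column grid with one cell-centre coordinate."""

    def __init__(self, zmid):
        self.zmid = list(zmid)

    def internal_size(self):
        return (len(self.zmid),)

    def create_default_cellrange(self):
        return range(len(self.zmid))

    def get_region(self, values, cell=None):
        if cell is None:  # no selection: return the whole column
            return list(values), {"zmid": list(self.zmid)}
        return values[cell], {"zmid": self.zmid[cell]}


def explicit_data_model(grid):
    """Data-model style: the same grid as netCDF-like dimensions and variables."""
    return {
        "dimensions": {"z": len(grid.zmid)},
        "variables": {"zmid": {"dims": ("z",), "data": list(grid.zmid)}},
    }


grid = ColumnGrid([-5.0, -15.0, -25.0])
data, coords = grid.get_region([10.0, 20.0, 30.0], cell=1)
model = explicit_data_model(grid)
```

With the interface, generic code can only reach grid information through the three calls; with the explicit data model, dimensions and coordinate variables are ordinary data that any tool can inspect or write to disk.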

sjdaines added a commit to sjdaines/PALEOmodel.jl that referenced this issue May 5, 2023
See PALEOtoolkit#48
PALEOtoolkit/PALEOboxes.jl#85

- Generalize FieldArray
- move get_region and coordinate manipulation code from PALEOboxes.jl
- rename FixedCoord -> RecordCoord as now only used by
 FieldRecord (and should be used by OutputWriter)

TODO this currently allows a model (only the black sea config tested)
to run and store output, but the FieldRecord etc (and surely FieldArray
and plotting) does not work.
Stop at this point as this illustrates a review of the data model
for internal data storage (at least) is needed.
Which requires implementing a standardised stable output format on disk
first.
sjdaines referenced this issue Dec 31, 2024
Simplify and tidy up FieldRecord and netcdf output
@sjdaines

Implemented by #107
