Implementation of salem-style x, y, and z coordinates #14
Conversation
Good work on the PR, I think this is a nice starting point. Did you get the dask integration working yet? I tried it and somehow the calculation, especially of the …
Some minor comments :)

xwrf/diagnostics.py (outdated):

```python
    This operation should be called before destaggering or any other cleaning operation.
    """
    # Potential temperature
    if _check_if_dask(dataset['T']):
```
Should this only work if `dask` is present? I guess we should either:

- only allow `xwrf` usage with `dask` -> check somewhere earlier for dask (like in `__init__` or `open_dataset`)
- also allow `xwrf` usage without `dask` -> remove these checks
This is really just a placeholder to capture the intent of needing a Dask check of some kind before diagnostics are computed. It would be a good conversation to have as to what level of Dask-based operations are required vs. optional, and if optional, how those choices are specified.
Thanks! And no, Dask isn't working with this yet since I'm not sure how to get Dask-backed arrays through the wrapped netcdf4 backend yet. Once I (or someone else) can figure out how best to route the …
I thought so. I'm also not really sure on how to do this and not getting a huge amount of info from the …
```python
In [5]: ds = xr.open_dataset(path, engine="xwrf", chunks={"south_north": 50})

In [6]: ds
Out[6]:
<xarray.Dataset>
Dimensions:  (Time: 1, south_north: 546, west_east: 480)
Coordinates:
    XLONG    (south_north, west_east) float32 dask.array<chunksize=(50, 480), meta=np.ndarray>
    XLAT     (south_north, west_east) float32 dask.array<chunksize=(50, 480), meta=np.ndarray>
Dimensions without coordinates: Time, south_north, west_east
Data variables:
    Q2       (Time, south_north, west_east) float32 dask.array<chunksize=(1, 50, 480), meta=np.ndarray>
    PSFC     (Time, south_north, west_east) float32 dask.array<chunksize=(1, 50, 480), meta=np.ndarray>
```

P.S.: The existing backend plugin isn't robust enough (things such as time decoding aren't handled properly when using dask)... I plan to look into this at some point. Also, notice how …
Alright! So if I'm interpreting what you're saying correctly along with https://github.com/pydata/xarray/blob/main/xarray/backends/api.py#L335-L512, …
Wouldn't creating custom backend arrays be overkill? Assuming we want to support reading files via the Python-netCDF4 library, we might be able to write a custom data store that borrows from xarray's …

I think there's value in keeping the backend plugin simple (e.g. performing simple tasks such as decoding coordinates, fixing attributes/metadata, etc.) and everything else outside the backend. Deriving quantities doesn't seem simple enough to warrant having this functionality during the data loading. Some of the benefits of deriving quantities outside the backend are that this approach: (1) doesn't obfuscate what's going on, …
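As a concrete (hypothetical) illustration of "deriving quantities outside the backend": a standalone helper that works on any already-opened Dataset, regardless of which backend produced it. The function name and API are assumptions for illustration; only the +300 K offset is the standard WRF convention for the perturbation potential temperature variable `T`:

```python
import numpy as np
import xarray as xr


def potential_temperature(ds: xr.Dataset) -> xr.DataArray:
    # WRF's "T" variable stores perturbation potential temperature;
    # adding the 300 K base state recovers full theta. This helper is
    # an illustrative sketch, not xwrf's actual API.
    theta = ds["T"] + 300.0
    return theta.rename("theta").assign_attrs(units="K")


# Works the same whether ds came from the netCDF4 backend, Zarr, etc.
ds = xr.Dataset({"T": xr.DataArray(np.array([0.0, 5.0]), dims="bottom_top")})
print(potential_temperature(ds).values)  # [300. 305.]
```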
Also, wouldn't it be beneficial for deriving quantities to be backend agnostic? I'm imagining cases in which the data have been post-processed and saved in a different format (e.g. Zarr) and you still want to be able to use the same code for deriving quantities on the fly.
This sounds like it factors directly into the "keep the solutions as general as possible (so that maybe also MPAS can profit from it)" discussion. However, I feel that we have to think about the user perspective too. I don't have any set opinions on this, and we should definitely discuss it, maybe in a larger group too. Here are some thoughts so far: I think the reason users like …

@andersy005, do you already have some other ideas on how one could handle this elegantly? Also, should we move this discussion to a separate issue maybe?
@andersy005 @lpilz I think this API question (which xwrf features go in the backend and which, if any, go elsewhere) is best discussed in its own issue. I'll refine the scope of this PR to just the coordinate operations (both because of that conversation, and because the approach used here won't really work anyway given how the backends and Dask chunking interact).
Something is still fishy with the CRS for me. If I transform like this: … and then use …
Suggestion to add XGCM-attributes to vertical coordinates
I've pushed some updates to remove the Berlin-specific conditions, fix the TRUELAT2 missing case, and to add CF/COMODO attrs to the vertical coords!
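For illustration, attaching such attributes to a staggered vertical coordinate might look like the following sketch (the coordinate name and values here are hypothetical, not taken from this PR; `c_grid_axis_shift` is the COMODO attribute that lets xgcm identify staggered axis positions):

```python
import numpy as np
import xarray as xr

# Hypothetical staggered vertical coordinate
ds = xr.Dataset(coords={"bottom_top_stag": np.linspace(0.0, 1.0, 5)})

# COMODO-style attrs: "axis" marks this as the vertical axis, and
# c_grid_axis_shift=-0.5 says the coordinate is staggered half a
# grid cell relative to the cell centers.
ds["bottom_top_stag"].attrs.update({"axis": "Z", "c_grid_axis_shift": -0.5})
print(ds["bottom_top_stag"].attrs["c_grid_axis_shift"])  # -0.5
```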
I unfortunately don't quite have the right intuition to tell if that's a datum shift error as described in https://fabienmaussion.info/2018/01/06/wrf-projection/ or if something else is going on. What dataset are you using? Also, @fmaussion do you have any insight on this?
I attached the dataset in question (had to rename it to …). I don't think this is the problem described in the blog post, because we do explicitly add the spherical datum (…).

Also, here's the code:

```python
import xarray as xr
from pyproj import Transformer, CRS

import xwrf

ds = xr.open_dataset("wrfout_d03_2018-10-18_00:00:00_only_1time.nc", chunks='auto', engine=xwrf.WRFBackendEntrypoint)
trf = Transformer.from_crs(CRS.from_epsg(4326), CRS.from_cf(ds['wrf_projection'].attrs))
x, y = trf.transform(8.553972274305062, 49.509090717658204)
var_in_question = ds.T.isel(bottom_top=0).interp(west_east=x, south_north=y)
print(var_in_question)
```
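One possible culprit worth ruling out in snippets like the one above: by default a pyproj `Transformer` consumes coordinates in each CRS's native axis order, and EPSG:4326 is defined as (lat, lon), so passing (lon, lat) silently transforms the wrong point. Passing `always_xy=True` forces (x, y)/(lon, lat) order throughout. The Lambert conformal CRS below is purely illustrative (it is not the dataset's actual `wrf_projection`):

```python
from pyproj import CRS, Transformer

# Illustrative spherical Lambert conformal CRS whose origin is placed
# at the point being transformed (NOT the real projection from the
# dataset discussed above).
crs = CRS(proj="lcc", lat_1=49.5, lat_2=49.5,
          lat_0=49.509090717658204, lon_0=8.553972274305062,
          a=6370000, b=6370000)

# always_xy=True -> inputs are (lon, lat), outputs are (x, y)
trf = Transformer.from_crs(CRS.from_epsg(4326), crs, always_xy=True)
x, y = trf.transform(8.553972274305062, 49.509090717658204)
# The projection origin coincides with the input point, so (x, y) ~ (0, 0)
```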
Can Salem read and plot this properly?
This looks OK in salem to me: https://gist.github.com/fmaussion/97beef70976231a58c914819966643f5 With a great plot, haha!
@jthielen Would you mind merging …
Maybe we should add some tests of …
This PR has been reworked to be based on #44 (so for now it should not be merged until I can rebase on …).
Now that #44 has been merged, this has been updated to just include the relevant commit!
Looks good to me! 👍🏽

The code coverage for `postprocess.py` is relatively low, but we can address this in a separate PR.
Also, should we add a test for xgcm interoperability to ensure xgcm is able to construct the …
Unfortunately, with the new changes …
I tried adding some …
@lpilz: This sounds like a good test to add.
@lpilz Thanks for the heads up on that! Unidata/MetPy#2347 has been created upstream in MetPy to resolve it. I've added a test against xgcm, and I can definitely do likewise against MetPy (and cf_xarray?) next for coordinate identification (it'll probably have to be version bounded given Unidata/MetPy#2347, but oh well).
Thanks for always fixing these so quickly. Awesome work @jthielen! While I in principle agree that we should test this against as many utilities that might be used in conjunction with xwrf as possible, maybe this can be done in future PRs. We could e.g. do this in the documentation phase when writing tutorials and such... I think this PR is quite nice already and doesn't have to be the MOAPRs ;)
Looks great @jthielen! Thanks for all the great work!
Merge whenever ready.
With 3 approvals, I went ahead and merged it after resolving the merge conflicts with main. #49 has been created to keep track of the testing-related concerns brought up here!
Change Summary
As alluded to in #2, including dimension coordinates in the grid mapping/projection space is a key feature for integrating with other tools in the ecosystem like `metpy` and `xgcm`. In this (draft) PR, I've combined code ported from `salem` with some of my own one-off scripts and what already exists in `xwrf` to meet this goal. In particular, this introduces a `pyproj` dependency (for CRS handling and transforming the domain center point from lon/lat to easting/northing). Matching the assumptions already present in `xwrf` and `salem`, this implementation assumes we do not have a moving domain (which simplifies things greatly). Also, this implements the `c_grid_axis_shift` attr as-needed, so `xgcm` should be able to interpret our coords automatically, eliminating the need for direct handling (like #5) in `xwrf`.

Also, because it existed in salem and my scripts alongside the dimension coordinate handling, I included my envisioned diagnostic field calculations. These are deliberately limited to only those four fields that require WRF-specific handling: … (one caveat: this uses a value of `g` (9.81 m s**-2) that may not match the value used elsewhere). Unless I'm missing something, any other diagnostics should be derivable using these or other existing fields in a non-WRF-specific way (and so, fit outside of `xwrf`).

If the netcdf4 backend already handles Dask chunks, then this should "just work" as currently written. However, I'm not sure how this should behave with respect to lazy-loading when `chunks` are not specified, so that is definitely a discussion to have in relation to #10.

Right now, no tests are included, as this is just a draft implementation to get the conversation started on how we want to approach these features. So, please do share your thoughts and ask questions!

Related issue number
Checklist