Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifies coord attrs lookup for non-coord levels on GRIB2 files #204

Merged
merged 3 commits into from
Jul 28, 2022

Conversation

darothen
Copy link
Contributor

I ran across an interesting edge case while trying to read in some GRIB files from the NOAA HRRR model using @martindurant's awesome improvements from #198. The edge case occurs when you supply a typeOfLevel filter, but the level is not actually a coordinate in the dataset. The most common use case for this is selecting the surface fields (including certain thermodynamic fields like skin temperature, or winds). In this case, the attribute lookup into the decoded COORD_ATTRS on the dataset hat cfgrib produces will fail, but the scan_grib will keep on running through, producing empty references that need to be dealt with.

The changelist here is a very, very naive way of fixing this - simply grabbing the basic attributes that have already been inspected. I'm sure there are better ways. But this PR is primarily to exposure to the community and get feedback for a better solution.

To Reproduce

  • Grab a recent HRRR file from NOMADS (e.g. here, subbing in the most recent YYYYMMDD for the date)
  • Run the following code snippet:
    from kerchunk.grib2 import scan_grib
    hrrr_fn = "/path/to/hrrr/grib"
    grib_references = scan_grib(
        hrrr_fn,
        common=['time', 'latitude', 'longitude', 'valid_time']
        storage_options={}, 
        inline_threshold=0,
        filter={'typeOfLevel': 'surface', 'shortName': 't'},  # for selecting the single surface temp message 
    )

kerchunk/grib2.py Outdated Show resolved Hide resolved
@martindurant martindurant mentioned this pull request Jul 28, 2022
@emfdavid
Copy link
Contributor

emfdavid commented Jul 28, 2022

I verified this fixes the issue with surface data from HRRR.
I did notice that when running with this commit ad2dedc installed, I get the above warning which should be fine as debug, but more concerning, I get the debug output from scan_grib too.
I think the logging setup from #198 is removed in the main branch of kerchunk/grib2.py, but that would be extremely noisy!

@martindurant
Copy link
Member

That's right, on that branch the debugs are still in, but they are gone on main.

@darothen
Copy link
Contributor Author

Thanks for the feedback @martindurant and @emfdavid. I’ll clean this up tonight.

Apologies @martindurant - if I realized you wanted to cut a new release with the GRIB re-factor I would’ve expedited this :(

@martindurant
Copy link
Member

No problem, @darothen , releases are cheap

@emfdavid
Copy link
Contributor

@darothen Thank you for solving this issue!

@martindurant
Copy link
Member

Will merge when green. We can decide later if we want to make these attributes of the top level dataset rather than coordinates.

@martindurant martindurant merged commit 00a75e3 into fsspec:main Jul 28, 2022
@darothen darothen deleted the fix_noncoord_level branch July 30, 2022 14:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants