Missing data for the WFDE5 cmorizer #2215

Closed
remi-kazeroni opened this issue Jun 29, 2021 · 2 comments · Fixed by #2232

Comments

@remi-kazeroni
Contributor

Describe the bug

The recently merged cmorizer for the WFDE5 data (#1991) works fine, but the test recipe recipe_check_obs.yml crashes when run on the cmorized data. It returns the following error:

esmvalcore.cmor.check.CMORCheckError: There were errors in variable pr:
time: Frequency day does not match input data

It seems to me that the raw data for the pr variable is incomplete in the first year, because recipe_check_obs.yml runs fine with start_year: 1980 instead of 1979. The suspicious files seem to be:

  • mip: day
  • variable: pr
  • filenames: Rainf_WFDE5_CRU+GPCC_197901_v1.1.nc, Rainf_WFDE5_CRU_197901_v1.1.nc, Snowf_WFDE5_CRU+GPCC_197901_v1.1.nc, Snowf_WFDE5_CRU_197901_v1.1.nc.
    For example, ncdump -h Rainf_WFDE5_CRU_197901_v1.1.nc gives:
netcdf Rainf_WFDE5_CRU_197901_v1.1 {
dimensions:
        time = 737 ;
        lat = 360 ;
        lon = 720 ;
variables:
        double time(time) ;
                time:_FillValue = NaN ;
                time:standard_name = "time" ;
                time:long_name = "Time" ;
                time:axis = "T" ;
                time:units = "hours since 1900-01-01" ;
                time:calendar = "proleptic_gregorian" ;
        double lat(lat) ;
                lat:_FillValue = NaN ;
                lat:long_name = "Latitude" ;
                lat:units = "degrees_north" ;
                lat:standard_name = "latitude" ;
                lat:axis = "Y" ;
        double lon(lon) ;
                lon:_FillValue = NaN ;
                lon:long_name = "Longitude" ;
                lon:units = "degrees_east" ;
                lon:standard_name = "longitude" ;
                lon:axis = "X" ;
        float Rainf(time, lat, lon) ;
                Rainf:_FillValue = 1.e+20f ;
                Rainf:units = "kg m-2 s-1" ;
                Rainf:long_name = "Rainfall Flux" ;
                Rainf:standard_name = "rainfall_flux" ;

where time = 737 should be 744 (31 days × 24 hourly steps for January).
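As a quick cross-check of that number, here is a minimal sketch that computes how many hourly steps a complete month should contain and compares it with the `time` dimension from the ncdump header. The helper name `expected_hourly_steps` is made up for illustration, and the hourly resolution of the raw files is taken from the discussion in this thread.

```python
import calendar


def expected_hourly_steps(year: int, month: int) -> int:
    """Number of hourly time steps in a complete calendar month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return days_in_month * 24


expected = expected_hourly_steps(1979, 1)
found = 737  # size of the time dimension reported by ncdump -h

print(f"expected {expected}, found {found}, missing {expected - found}")
```

The same helper could be looped over the other monthly files to see whether any of them is short as well.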

I think we should just remove the year 1979 from the recipe for the pr variable, for all dataset versions, because the data is incomplete. What do you think @mwjury @valeriupredoi?

Please attach
Minimal example:
main_log_debug.txt
recipe_check_WFDE5_pr_day_1.yml.txt

@valeriupredoi
Contributor

We have two options here: either we forget about 1979 completely, as you propose, or we add a mask to the 1979 file that masks the missing data. I don't know to what extent the data really needs to start in 1979. I say we forget about it, and if users complain a lot, then we produce a 1979 data file with a mask in it?
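For reference, the masking option could look roughly like the sketch below. It is not what the cmorizer does: `pad_with_mask` is a hypothetical helper, the array is a tiny stand-in for the real 360 × 720 grid, and it assumes the gap sits at the start of the month.

```python
import numpy as np


def pad_with_mask(data, n_expected, n_missing_at_start):
    """Prepend fully masked time steps so axis 0 has n_expected entries.

    Hypothetical helper: assumes the missing steps are all at the start.
    """
    data = np.asarray(data)
    if data.shape[0] + n_missing_at_start != n_expected:
        raise ValueError("padding does not add up to the expected length")
    pad_shape = (n_missing_at_start,) + data.shape[1:]
    padding = np.ma.masked_all(pad_shape, dtype=data.dtype)
    return np.ma.concatenate([padding, np.ma.asarray(data)], axis=0)


# Stand-in for the January 1979 field: 737 time steps on a 2 x 3 grid.
raw = np.ones((737, 2, 3), dtype=np.float32)
full = pad_with_mask(raw, n_expected=744, n_missing_at_start=7)

print(full.shape, np.ma.count_masked(full))
```

In a real file the time coordinate would have to be extended to match, which this sketch leaves out.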

@mwjury
Contributor

mwjury commented Jul 14, 2021

Good catch @remi-kazeroni!
It concerns the first 7 timesteps of the raw 1979 PR hourly files. I'd assume that this has a negligible impact on the daily/monthly aggregations, but nevertheless, better safe than sorry.
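To illustrate which days are affected, the sketch below rebuilds an hourly time axis with the first 7 steps absent and counts the steps per calendar day. It is an assumption-based reconstruction rather than a read of the actual file: the axis is assumed to be strictly hourly and to start at 07:00 on 1 January 1979.

```python
from collections import Counter
from datetime import datetime, timedelta

# Assumed: the first 7 hourly steps are absent, so the axis starts at 07:00.
first_step = datetime(1979, 1, 1, 7)
n_steps = 737  # size of the time dimension reported by ncdump -h

time_axis = [first_step + timedelta(hours=i) for i in range(n_steps)]

# Count hourly steps per calendar day and keep the incomplete days.
steps_per_day = Counter(t.date() for t in time_axis)
short_days = {day: n for day, n in steps_per_day.items() if n != 24}

print("last step:", time_axis[-1])
print("incomplete days:", short_days)
```

Under these assumptions only 1 January 1979 is short (17 of 24 hourly steps), so a daily aggregate for that day would be built from a partial day.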
