Question about day of year climatology statistics #302
-
Question
I want to store climatological variables in a netCDF file that are calculated on a day-of-year basis for two different reference periods (1981-2010 and 1991-2020). All of the years in the data have been converted to leap years, and February 29th values have been interpolated where they are missing. I then take the subset of the data that falls within the given reference period, group it by day of year, and calculate the climatological variables. One example of such a variable is the median. The result is 366 values, one for each day of year, per reference period.
I've been looking at section 7.4, "Climatological Statistics", in the CF conventions to see whether the cell_methods and climatology_bounds defined there can be used to store this data, but I'm not sure my data fits what's specified there. From what I understand, the standard only permits the following cell_methods:
Do any of these methods cover my use case? I don't think any of them cover statistics that are calculated on a day-of-year basis. cc @TomLav
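To make the procedure in the question concrete, here is a minimal sketch of the processing as I read it: fill in 29 February for non-leap years, then take the median over years for each day of year. The function names, the dictionary layout keyed by (year, month, day), and the synthetic data are all assumptions made for illustration, as is the use of the midpoint of 28 February and 1 March as the interpolated value. It is not the poster's actual code.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import median


def synthetic_series(first_year, last_year):
    """Toy daily data (hypothetical): each day's value is simply its year."""
    series = {}
    d = date(first_year, 1, 1)
    while d.year <= last_year:
        series[(d.year, d.month, d.day)] = float(d.year)
        d += timedelta(days=1)
    return series


def fill_feb29(series):
    """Return a copy in which every year has a 29 Feb value.

    Missing values are interpolated linearly between 28 Feb and 1 Mar,
    which for the midpoint is just their average (an assumption).
    """
    filled = dict(series)
    for year in {y for (y, _, _) in series}:
        if (year, 2, 29) not in filled:
            filled[(year, 2, 29)] = 0.5 * (series[(year, 2, 28)] + series[(year, 3, 1)])
    return filled


def day_of_year_median(series, first_year, last_year):
    """Median over the reference period for each of the 366 days of year."""
    groups = defaultdict(list)
    for (year, month, day), value in series.items():
        if first_year <= year <= last_year:
            groups[(month, day)].append(value)
    # Enumerate the 366 (month, day) pairs in order using a real leap year.
    start = date(2000, 1, 1)
    days = [start + timedelta(days=i) for i in range(366)]
    return [median(groups[(d.month, d.day)]) for d in days]


series = fill_feb29(synthetic_series(1981, 2010))
climatology = day_of_year_median(series, 1981, 2010)
print(len(climatology))  # one value per day of year
```

The same two functions can be reused for the 1991-2020 period by changing the year arguments.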
-
Thanks, @huaracheguarache, much appreciated! Here is the promised response:
What is the characteristic of the daily values? If the daily data were means, then I think what you need would be
time: mean within years time: median over years
The daily (i.e. "within years") aspect is encoded in the time coordinates and, crucially, their bounds. Each pair of coordinate bounds spans from the lower bound of the day in the first year to the upper bound of the same day in the last year. There is no flexibility on this. The time of day of the lower and upper bounds doesn't have to be midnight; they're whatever is defined in the original daily data.
The corresponding coordinate value is (anywhere) inside the day from the first year. It could equally have been inside the same day from any of the other years.
I hope that helps,
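The bounds rule described above can be sketched numerically. The following is only an illustration under stated assumptions: the time axis is taken to use the CF `all_leap` (366-day) calendar with units of days since 1 January of the first year, the daily cells are assumed to run from midnight to midnight, and `climatology_bounds` is a hypothetical helper name. The coordinate value is placed at the middle of the day in the first year, although anywhere inside that day would do.

```python
def climatology_bounds(first_year, last_year, days_per_year=366):
    """Build day-of-year climatological coordinates and bounds.

    Values are in days since <first_year>-01-01 on an all-leap calendar
    (assumption). Each cell runs from the start of the day in the first
    year to the end of the same day in the last year.
    """
    span = (last_year - first_year) * days_per_year
    coords, bounds = [], []
    for doy in range(days_per_year):
        lower = float(doy)             # start of this day in the first year
        upper = float(span + doy + 1)  # end of the same day in the last year
        coords.append(lower + 0.5)     # any point inside the first-year day
        bounds.append((lower, upper))
    return coords, bounds


time_coords, time_bounds = climatology_bounds(1981, 2010)
print(time_bounds[0], time_bounds[-1])
```

In a file, these arrays would be written to the time coordinate and to the bounds variable named by its `climatology` attribute, with the cell_methods string from the reply on the data variable.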
-
Dear Michael @huaracheguarache
Yes, if you have more than one climatology you can simply concatenate their climatological time coordinates, as you have suggested. It doesn't matter that they overlap. CF permits cells to overlap in general in any kind of coordinate.
Best wishes
Jonathan
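As a rough illustration of concatenating the two reference periods on one time axis, the sketch below builds the climatological bounds for 1981-2010 and 1991-2020 and joins them. It assumes a shared reference date of 1 January 1981 on a CF `all_leap` (366-day) calendar and midnight-to-midnight daily cells; `period_bounds` and `overlaps` are hypothetical helpers, not part of any library.

```python
DAYS_PER_YEAR = 366  # all-leap calendar (assumption)
EPOCH_YEAR = 1981    # shared reference year for the time units (assumption)


def period_bounds(first_year, last_year):
    """Climatological bounds, in days since EPOCH_YEAR-01-01, for one period."""
    offset = (first_year - EPOCH_YEAR) * DAYS_PER_YEAR
    span = (last_year - first_year) * DAYS_PER_YEAR
    return [
        (float(offset + doy), float(offset + span + doy + 1))
        for doy in range(DAYS_PER_YEAR)
    ]


def overlaps(a, b):
    """True if two (lower, upper) cells share any interval of time."""
    return a[0] < b[1] and b[0] < a[1]


# Concatenate the two climatologies along the same time dimension.
all_bounds = period_bounds(1981, 2010) + period_bounds(1991, 2020)

print(len(all_bounds))
print(overlaps(all_bounds[0], all_bounds[DAYS_PER_YEAR]))
```

The cells for 1 January in the two periods overlap, which is the situation the reply says CF permits.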
-
Ok, great!
Actually, our use case is per day of year, not day of month. This follows from how we have decided to deal with leap years in our data: we convert all years to leap years and, for the non-leap years, interpolate between February 28th and March 1st to get the value for February 29th. After that we calculate the climatology on a day-of-year basis to produce a plot of sea ice data: https://ice.metsis-api.met.no/daily
-
Thanks for the explanation, @huaracheguarache. Yes, I understand that you interpolate. Having done that, you express the day within the year as DDMMM, e.g. 28 Feb. That is what I meant by "day of month". Sorry for not being clear. It seems that the existing convention is suitable for your need, isn't it?
Jonathan