ERA5 Solar Position Time Shift Broken for Certain Time Spans #256
Comments
The culprit is this attempt to infer the frequency of the time index: atlite/atlite/datasets/era5.py Lines 157 to 174 in a0bd4b0
An ERA5 CDS request spanning June 30 and July 1st looks like this:
{
    'product': 'reanalysis-era5-single-levels',
    'year': '2022',
    'month': [6, 7],
    'day': [30, 1],
    'time': ['00:00', '01:00', '02:00', '03:00', '04:00', '05:00', '06:00', '07:00', '08:00', '09:00', '10:00', '11:00', '12:00', '13:00', '14:00', '15:00', '16:00', '17:00', '18:00', '19:00', '20:00', '21:00', '22:00', '23:00']
}
which returns results covering June 1st, June 30th, July 1st, and July 30th:
Time index contents
From print(ds.time) Output:
Because the time index is not continuous, |
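The four dates in the result are what you would get if the request were expanded into every combination of the listed months and days. A minimal standard-library sketch of that expansion follows; `cds_timestamps` is a hypothetical helper written for this illustration, not part of atlite or the CDS API. It shows that the resulting index mixes one-hour steps with multi-week gaps, so no single frequency describes it.

```python
from datetime import datetime, timedelta
from itertools import product


def cds_timestamps(year, months, days, hours):
    """Return sorted timestamps for every (month, day, hour) combination.

    Hypothetical helper: it mimics the month x day expansion that the
    returned data suggests, it does not call the CDS API.
    """
    return sorted(
        datetime(year, month, day, hour)
        for month, day, hour in product(months, days, hours)
    )


stamps = cds_timestamps(2022, months=[6, 7], days=[30, 1], hours=range(24))

# Distinct calendar dates covered by the expanded request
dates = sorted({stamp.date() for stamp in stamps})

# Distinct step sizes between consecutive timestamps
gaps = sorted({later - earlier for earlier, later in zip(stamps, stamps[1:])})

print(dates)
print(gaps)
```

More than one distinct step size in `gaps` means the index is not regular, which is the situation in which frequency inference has nothing sensible to return.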
I think the fix should be as simple as just always setting the time shift to 30 minutes (proposed diff). The atlite/atlite/datasets/era5.py Lines 352 to 359 in a0bd4b0
To my understanding, this product's resolution is always hourly (see docs), so there's no reason to attempt to infer a different frequency. What do you think, @euronion @FabianHofmann?
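The proposal can be sketched with plain `datetime` objects. This is only an illustration, not the linked diff: `ERA5_TIME_SHIFT` and `interval_midpoints` are hypothetical names, and the sign of the shift assumes each value is labelled with the end of a one-hour window (the thread itself only states that the product is hourly).

```python
from datetime import datetime, timedelta

# Hypothetical constant: a fixed shift of half an hour, independent of how
# densely the hours were sampled in the request.
ERA5_TIME_SHIFT = timedelta(minutes=-30)


def interval_midpoints(timestamps, shift=ERA5_TIME_SHIFT):
    """Shift each timestamp by a constant offset instead of an inferred one."""
    return [timestamp + shift for timestamp in timestamps]


# Hours requested every 3 hours: the shift stays at 30 minutes, it does
# not grow to half of the 3-hour sampling interval.
sampled_every_3h = [datetime(2013, 1, 1, hour) for hour in range(0, 24, 3)]
midpoints = interval_midpoints(sampled_every_3h)
print(midpoints[:3])
```

Because the offset is a constant, gaps or coarse sampling in the time index cannot change it.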
Comparing ERA5 Values Requested at Different Time Samplings
Here's a comparison of values received from ERA5 at 1h, 2h, 3h, 4h, and 6h sampling:
Code to Generate the Above Table
import functools

import xarray
from dask.utils import SerializableLock

import atlite.datasets.era5 as era5

# Create lists of hours like [00:00, 01:00, 02:00, ...], sampled
# every hour, every 2 hours, etc.
time_sampling = {}
for rate in [1, 2, 3, 4, 6]:
    time_sampling[rate] = [f"{hour:02}:00" for hour in range(0, 24, rate)]

# Request parameters shared by all samplings (a single day, a small area)
retrieval_params = {
    'product': 'reanalysis-era5-single-levels',
    'area': [57.0, -0.5, 56.0, 0.5],
    'chunks': {'time': 100},
    'grid': [0.25, 0.25],
    'tmpdir': '/tmp',
    'lock': SerializableLock(),
    'year': '2013',
    'month': [1],
    'day': [1]
}

# One parameter set per sampling rate, differing only in the 'time' list
param_sets = {rate: {**retrieval_params, 'time': times} for rate, times in time_sampling.items()}


def retrieve_data_for_single_raster(params: dict) -> "pandas.DataFrame":
    variable = [
        "surface_net_solar_radiation",
        "surface_solar_radiation_downwards",
        "toa_incident_solar_radiation",
        "total_sky_direct_solar_radiation_at_surface",
    ]
    ds = era5.retrieve_data(variable=variable, **params)
    # Keep a single grid cell and return it as a dataframe indexed by time
    return ds.sel(latitude=56, longitude=0).load().to_dataframe()[["ssr", "ssrd", "tisr", "fdir"]]


# Retrieve ERA5 data for each different time sampling
raw_ds = {rate: retrieve_data_for_single_raster(params) for rate, params in param_sets.items()}


def join_dfs(left_sampling_and_df, right_sampling_and_df):
    left_sampling, left_df = left_sampling_and_df
    right_sampling, right_df = right_sampling_and_df
    # Columns from the right-hand frame get a suffix such as "_2h"
    suffix = f"_{right_sampling}h"
    return left_sampling, left_df.join(right_df, on='time', how='left', rsuffix=suffix)


# Merge all data into a single dataframe to show differences in values
# for each hour next to each other
_, merged = functools.reduce(join_dfs, raw_ds.items())
samplings_compared = (
    merged.sort_index(axis='columns')
    .query('time.dt.hour > 7 and time.dt.hour < 18')
    .astype(float)
    .round(decimals=2)
)

# Remove date part
samplings_compared.index = samplings_compared.index.time
samplings_compared.to_csv('/tmp/samplings_compared.csv')
While this does show that the time shift is not proportional to the sampling, there's an additional weird twist to it. The values for 2h, 3h, 4h, and 6h sampling seem to be equal, but the values for 1h sampling are slightly different (by less than 1% at noon). It would be interesting to find out why that is, but as far as this issue is concerned, I think it's still more appropriate to always shift the time by 30 minutes than by half of the sampling interval.
@zoltanmaric I am impressed by how fast you are catching up with the atlite code! Definitely makes sense!
Interesting, but I would not bother with the value differences for different time resolutions, since they relate to ERA5 internals only.
Description
When building an ERA5 cutout for a period that doesn't cover entire months (e.g. for time=slice("2022-06-30 00:00", "2022-07-01 23:00")), the solar position is shifted by half a day instead of half an hour, effectively inverting the solar position (maximum at local midnight, minimum at local noon).
Expected Behavior
Solar position should be shifted by 30 minutes.
Actual Behavior
Solar position is shifted by 12 hours.
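To see why a 12-hour shift inverts the daily curve, here is a rough sketch using the textbook solar elevation formula, sin(alt) = sin(lat)·sin(dec) + cos(lat)·cos(dec)·cos(hour angle). It is not atlite's SolarPosition code. The latitude of 56° matches the grid cell used elsewhere in this thread, the declination of 23.2° is an approximate value assumed for the end of June, and `solar_elevation_deg` is a hypothetical helper working in local solar time.

```python
import math


def solar_elevation_deg(solar_hour, latitude_deg=56.0, declination_deg=23.2):
    """Solar elevation in degrees for a given local solar time in hours.

    Assumed values: latitude 56 degrees and an approximate end-of-June
    declination. The hour angle advances by 15 degrees per hour and is
    zero at solar noon.
    """
    hour_angle = math.radians(15.0 * (solar_hour - 12.0))
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    sin_alt = (
        math.sin(lat) * math.sin(dec)
        + math.cos(lat) * math.cos(dec) * math.cos(hour_angle)
    )
    return math.degrees(math.asin(sin_alt))


hours = range(24)

# Elevation assigned to each hourly label under the two shifts
shifted_30min = {hour: solar_elevation_deg(hour - 0.5) for hour in hours}
shifted_12h = {hour: solar_elevation_deg(hour - 12.0) for hour in hours}

peak_30min = max(shifted_30min, key=shifted_30min.get)
peak_12h = max(shifted_12h, key=shifted_12h.get)
print(peak_30min, peak_12h)
```

A 12-hour shift moves the hour angle by 180°, which flips the sign of the cosine term, so the highest elevation is assigned to the midnight label and the lowest to noon.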
Example
Your Environment
atlite at revision 230aa8a
Related to #158 and #199