Some timeseries files have sub-daily timestamps that are dropped rather than used; others keep the time but then filter out every measurement that wasn't taken at noon. In the former case, the earliest measurement per day is used (often a cold bias); in the latter, a single time is used, but grab samples taken at any other time are excluded.
I'd be in favor of treating them consistently, but I'd also favor deferring the downsampling to a much later stage instead of handling it on a per-file basis. The reason is that we could choose to evaluate models on an hourly basis (and would therefore want hourly obs if we have them), or we could drop samples from certain times of day, or we could interpolate to a single time point during the day. We'd want that logic to live in one function and probably be applied closer to the end of the chain.
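To make the "one function" idea concrete, here is a minimal sketch in stdlib Python. It is not code from this repo: the name `downsample_daily`, the `method` options, and the `(datetime, temperature)` input shape are all assumptions for illustration. The point is only that "first of day", "closest to noon", and "daily mean" become interchangeable choices applied late, rather than decisions baked into each file parser.

```python
from collections import defaultdict
from datetime import datetime, time
from statistics import mean


def downsample_daily(obs, method="mean", target=time(12, 0)):
    """Collapse (datetime, temperature) pairs to one value per calendar date.

    Hypothetical helper; the method names are illustrative:
      "first"   -> earliest sample of the day (what dropping the time amounts to)
      "nearest" -> sample closest to `target` time of day
      "mean"    -> average of all samples taken that day
    """
    by_day = defaultdict(list)
    for stamp, temp in obs:
        by_day[stamp.date()].append((stamp, temp))

    def seconds_from_target(stamp):
        # Absolute distance between the sample and the target time on the same date.
        return abs((stamp - datetime.combine(stamp.date(), target)).total_seconds())

    out = {}
    for day, samples in sorted(by_day.items()):
        samples.sort()  # chronological order within the day
        if method == "first":
            out[day] = samples[0][1]
        elif method == "nearest":
            out[day] = min(samples, key=lambda s: seconds_from_target(s[0]))[1]
        elif method == "mean":
            out[day] = mean(t for _, t in samples)
        else:
            raise ValueError(f"unknown method: {method!r}")
    return out


# Made-up values for one day: a cool morning reading and two later ones.
obs = [
    (datetime(2020, 7, 1, 6, 0), 18.0),
    (datetime(2020, 7, 1, 11, 30), 20.5),
    (datetime(2020, 7, 1, 18, 0), 22.0),
]
first_of_day = downsample_daily(obs, "first")
nearest_noon = downsample_daily(obs, "nearest")
daily_mean = downsample_daily(obs, "mean")
```

With these made-up values, `"first"` keeps the 06:00 reading (the cold-bias case), while `"nearest"` keeps the 11:30 sample instead of discarding it for not being exactly at noon.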
My intention was to treat everything at the lowest common sampling frequency, which (I think) is generally the biweekly hand measurements. If the sample time is ignored, that was probably an oversight on my part (definitely for the giant MPCA tsv).
I agree on dealing with sample frequency all at once. A universal method would need to account for single daily measurements and for several different frequencies of automated collection: there are many files with measurements every 6 hours, and some hourly or every 15 minutes.
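One way a universal method could tell these cases apart is to infer each series' sampling interval from its timestamps before deciding how to downsample. The sketch below is an assumption, not existing code: `infer_interval` and `is_subdaily` are hypothetical names, and using the median gap is just one reasonable heuristic.

```python
from datetime import datetime, timedelta
from statistics import median


def infer_interval(stamps):
    """Return the median gap between consecutive timestamps, or None if fewer than two."""
    ordered = sorted(stamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return timedelta(seconds=median(g.total_seconds() for g in gaps))


def is_subdaily(stamps):
    """True when the typical gap is shorter than one day (e.g. a logger, not hand samples)."""
    interval = infer_interval(stamps)
    return interval is not None and interval < timedelta(days=1)


# Made-up examples: a 6-hourly logger and a pair of biweekly hand measurements.
logger = [datetime(2020, 7, 1, h) for h in (0, 6, 12, 18)]
hand = [datetime(2020, 7, 1, 10), datetime(2020, 7, 15, 11)]
logger_interval = infer_interval(logger)
```

The median is used rather than the mean so that an occasional long gap (a logger pulled for maintenance, say) doesn't change the inferred frequency.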