Support nanoseconds in num2date / date2num #346

Open
ChrisBarker-NOAA opened this issue Aug 28, 2024 · 1 comment

Comments

@ChrisBarker-NOAA

I notice that cftime does not support nanoseconds in date2num and num2date:

In [10]: cftime.date2num(datetime.now(), "microseconds since 2024-08-28T12:00:00")
Out[10]: np.int64(2835169629)

In [11]: cftime.date2num(datetime.now(), "nanoseconds since 2024-08-28T12:00:00")
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[11], line 1
----> 1 cftime.date2num(datetime.now(), "nanoseconds since 2024-08-28T12:00:00")

File src/cftime/_cftime.pyx:252, in cftime._cftime.date2num()

File src/cftime/_cftime.pyx:105, in cftime._cftime._dateparse()

ValueError: In general, units must be one of 'microseconds', 'milliseconds', 'seconds', 'minutes', 'hours', or 'days' (or select abbreviated versions of these).  For the '360_day' calendar, 'months' can also be used, or for the 'noleap' calendar 'common_years' can also be used. Got 'nanoseconds' instead, which are not recognized.

I don't actually need this, and maybe no one who uses cftime does, but it comes up because the Python xarray library uses nanoseconds to store datetimes, and therefore uses nanoseconds by default when encoding times CF-style, which then breaks code that needs to read those files.

Granted, this REALLY is an issue with xarray, and I'm trying to get that changed -- but maybe we could make the change here as well.

The Challenge:

The reason nanoseconds aren't supported is that the datetime object(s) themselves only support microseconds.
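For reference, here is a quick check of that limit using only the standard library. `datetime.resolution` reports the smallest difference two `datetime` objects can have, and it is one microsecond, so any nanosecond remainder has nowhere to go. The epoch and offset below are illustrative values, not data from a real file.

```python
from datetime import datetime, timedelta

# Smallest possible difference between two distinct datetime objects.
assert datetime.resolution == timedelta(microseconds=1)

epoch = datetime(2024, 8, 28, 12, 0, 0)
offset_ns = 2_835_169_629_123  # illustrative nanosecond count since epoch

# Split into whole microseconds and the nanoseconds a datetime cannot hold.
whole_us, leftover_ns = divmod(offset_ns, 1000)

# Only the whole-microsecond part survives the conversion.
stamp = epoch + timedelta(microseconds=whole_us)
```

Here `leftover_ns` is the part that would be lost on conversion to a `datetime`.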

So how to deal with that?

  1. date2num could still work -- though that's not actually useful.

  2. num2date could truncate to microseconds -- in practice that would not change anything, as no one is currently successfully using nanoseconds with cftime, so any existing data would be converted losslessly anyway. However, silently losing precision is a "bad thing"™️, so the options are:
    a) Truncate, but issue a warning when using nanoseconds -- but does anyone ever notice warnings?

    b) Raise an error (or warning) only if the conversion is lossy -- i.e. the data actually have sub-microsecond precision. I like this, except it would involve an extra check in the code -- enough to meaningfully affect performance?

    c) A flag (False by default): truncate_to_microseconds

    • If the flag is False, then you'd get an error when trying to read nanoseconds, and the error message would suggest setting the flag.
    • If the flag is True, then nanoseconds would be silently truncated.
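To make the trade-off concrete, here is a minimal sketch of how options (b) and (c) might combine, written against the standard library rather than cftime itself. The function `ns_to_datetime` and its `truncate` parameter are hypothetical names for illustration, not part of cftime's API, and this variant only raises when precision would actually be lost.

```python
from datetime import datetime, timedelta


def ns_to_datetime(ns, epoch, truncate=False):
    """Convert an integer nanosecond offset from `epoch` to a datetime.

    Hypothetical helper, not cftime API. `truncate` plays the role of
    the opt-in flag proposed in option (c).
    """
    # divmod floors, so a negative offset rounds toward the earlier time.
    whole_us, leftover_ns = divmod(int(ns), 1000)
    if leftover_ns and not truncate:
        # Option (b): complain only when the conversion is lossy.
        raise ValueError(
            f"{ns} ns has sub-microsecond precision that datetime cannot "
            "represent; pass truncate=True to drop it"
        )
    return epoch + timedelta(microseconds=whole_us)


epoch = datetime(2024, 8, 28, 12, 0, 0)
exact = ns_to_datetime(2_835_169_629_000, epoch)      # lossless, no error
dropped = ns_to_datetime(1_999, epoch, truncate=True)  # keeps 1 microsecond
```

The extra cost is one remainder check per value; whether that matters for performance on large arrays would need measuring in the real implementation.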

If there is support for this, I'd be willing to write a PR.

@jswhit
Collaborator

jswhit commented Aug 30, 2024

I'd rather wait and see how pydata/xarray#9154 turns out
