Use np.asarray to convert a 1-D array to datetime type in array_to_datetime #2481

Merged: 8 commits, Apr 6, 2023
7 changes: 7 additions & 0 deletions examples/tutorials/advanced/date_time_charts.py
@@ -124,6 +124,13 @@
)
fig.show()

+ ###############################################################################
+ #
+ # PyGMT doesn't recognize non-ISO datetime strings like "Jun 05, 2018". If your
+ # data contain non-ISO datetime strings, you can convert them to a recognized
+ # format using :func:`pandas.to_datetime` and then pass them to PyGMT.
+ #

###############################################################################
# Mixing and matching Python ``datetime`` and ISO dates
# -----------------------------------------------------
52 changes: 25 additions & 27 deletions pygmt/clib/conversion.py
@@ -4,7 +4,6 @@
import warnings

import numpy as np
- import pandas as pd
from pygmt.exceptions import GMTInvalidInput


@@ -252,10 +251,9 @@ def kwargs_to_ctypes_array(argument, kwargs, dtype):

def array_to_datetime(array):
"""
- Convert an 1-D datetime array from various types into pandas.DatetimeIndex
- (i.e., numpy.datetime64).
+ Convert a 1-D datetime array from various types into numpy.datetime64.

- If the input array is not in legal datetime formats, raise a "ParseError"
+ If the input array is not in legal datetime formats, raise a ValueError
exception.

Parameters
@@ -272,58 +270,58 @@ def array_to_datetime(array):

Returns
-------
- array : 1-D datetime array in pandas.DatetimeIndex (i.e., numpy.datetime64)
+ array : 1-D datetime array in numpy.datetime64

+ Raises
+ ------
+ ValueError
+ If the datetime string is invalid.
+
Examples
--------
>>> import datetime
>>> # numpy.datetime64 array
>>> x = np.array(
... ["2010-06-01", "2011-06-01T12", "2012-01-01T12:34:56"],
- ... dtype="datetime64",
+ ... dtype="datetime64[ns]",
... )
>>> array_to_datetime(x)
- DatetimeIndex(['2010-06-01 00:00:00', '2011-06-01 12:00:00',
- '2012-01-01 12:34:56'],
- dtype='datetime64[ns]', freq=None)
+ array(['2010-06-01T00:00:00.000000000', '2011-06-01T12:00:00.000000000',
+ '2012-01-01T12:34:56.000000000'], dtype='datetime64[ns]')

>>> # pandas.DatetimeIndex array
+ >>> import pandas as pd
>>> x = pd.date_range("2013", freq="YS", periods=3)
- >>> array_to_datetime(x) # doctest: +NORMALIZE_WHITESPACE
- DatetimeIndex(['2013-01-01', '2014-01-01', '2015-01-01'],
- dtype='datetime64[ns]', freq='AS-JAN')
+ >>> array_to_datetime(x)
+ array(['2013-01-01T00:00:00.000000000', '2014-01-01T00:00:00.000000000',
+ '2015-01-01T00:00:00.000000000'], dtype='datetime64[ns]')

>>> # Python's built-in date and datetime
>>> x = [datetime.date(2018, 1, 1), datetime.datetime(2019, 1, 1)]
- >>> array_to_datetime(x) # doctest: +NORMALIZE_WHITESPACE
- DatetimeIndex(['2018-01-01', '2019-01-01'],
- dtype='datetime64[ns]', freq=None)
+ >>> array_to_datetime(x)
+ array(['2018-01-01T00:00:00.000000', '2019-01-01T00:00:00.000000'],
+ dtype='datetime64[us]')

>>> # Raw datetime strings in various formats
>>> x = [
... "2018",
... "2018-02",
... "2018-03-01",
... "2018-04-01T01:02:03",
- ... "5/1/2018",
- ... "Jun 05, 2018",
- ... "2018/07/02",
Comment on lines -308 to -310
Member Author

These three types of datetime input are no longer supported. In fact, they were never supported, because the _check_dtype_and_dim function does not recognize them as valid datetime strings.

... ]
>>> array_to_datetime(x)
- DatetimeIndex(['2018-01-01 00:00:00', '2018-02-01 00:00:00',
- '2018-03-01 00:00:00', '2018-04-01 01:02:03',
- '2018-05-01 00:00:00', '2018-06-05 00:00:00',
- '2018-07-02 00:00:00'],
- dtype='datetime64[ns]', freq=None)
+ array(['2018-01-01T00:00:00', '2018-02-01T00:00:00',
+ '2018-03-01T00:00:00', '2018-04-01T01:02:03'],
+ dtype='datetime64[s]')

>>> # Mixed datetime types
>>> x = [
... "2018-01-01",
... np.datetime64("2018-01-01"),
... datetime.datetime(2018, 1, 1),
... ]
- >>> array_to_datetime(x) # doctest: +NORMALIZE_WHITESPACE
- DatetimeIndex(['2018-01-01', '2018-01-01', '2018-01-01'],
- dtype='datetime64[ns]', freq=None)
+ >>> array_to_datetime(x)
+ array(['2018-01-01T00:00:00.000000', '2018-01-01T00:00:00.000000',
+ '2018-01-01T00:00:00.000000'], dtype='datetime64[us]')
"""
- return pd.to_datetime(array)
+ return np.asarray(array, dtype=np.datetime64)
Member

I feel like there was a reason we used pd.to_datetime instead of np.asarray in #464, maybe to support funny dates like 'Jun 05, 2018', but since the tests pass and you mentioned in https://github.com/GenericMappingTools/pygmt/pull/2481/files#r1158514955 that it's not actually supported by _check_dtype_and_dim, then it should be fine 🤞
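
As a quick check of the behaviour discussed in this thread, the following sketch (not part of the PR) runs the same `np.asarray(..., dtype=np.datetime64)` call on the ISO strings kept in the docstring and on the three inputs removed from it. The names `iso`, `non_iso` and `rejected` are illustrative only.

```python
import numpy as np

# ISO 8601 strings of differing precision, as kept in the updated docstring.
iso = ["2018", "2018-02", "2018-03-01", "2018-04-01T01:02:03"]
converted = np.asarray(iso, dtype=np.datetime64)
print(converted.dtype)  # NumPy picks a unit fine enough for every element

# The three inputs dropped from the docstring; NumPy is expected to refuse
# each of them with a ValueError.
non_iso = ["5/1/2018", "Jun 05, 2018", "2018/07/02"]
rejected = []
for text in non_iso:
    try:
        np.asarray([text], dtype=np.datetime64)
    except ValueError:
        rejected.append(text)
print(rejected)
```

If all three strings end up in `rejected`, the conversion in `array_to_datetime` is no more permissive than the check that already ran in `_check_dtype_and_dim`.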

Member Author

> I feel like there was a reason we used pd.to_datetime instead of np.asarray in #464, maybe to support funny dates like 'Jun 05, 2018',

In the Plotting datetime charts tutorial, we should probably add a paragraph or a subsection explaining that non-ISO datetimes like 'Jun 05, 2018' must be converted with pd.to_datetime before being passed to PyGMT.

Member

Under https://www.pygmt.org/v0.9.0/tutorials/advanced/date_time_charts.html#generating-an-automatic-region, pd.to_datetime is used to convert dates like "20220712" into a recognizable format. Maybe add a few sentences somewhere there?
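
Both suggestions in this thread come down to the same pre-processing step. A minimal sketch of what such a tutorial snippet could look like is given below; it is not taken from the PR, the sample dates and variable names are made up, and the explicit `format` strings are one possible choice (the `%b` month abbreviation assumes an English locale).

```python
import pandas as pd

# Month-name dates such as "Jun 05, 2018" are parsed with an explicit format.
# Assumption: English month abbreviations, since %b is locale-dependent.
month_name = pd.to_datetime(["Jun 05, 2018", "Jul 02, 2018"], format="%b %d, %Y")

# Compact dates such as "20220712" can be handled the same way.
compact = pd.to_datetime(["20220712", "20220801"], format="%Y%m%d")

# Both results are datetime64-backed and can be handed over as NumPy arrays.
x = month_name.to_numpy()
print(x.dtype.kind)  # "M" marks a NumPy datetime64 dtype
print(compact.strftime("%Y-%m-%d").tolist())
```

The converted values, rather than the raw strings, are what would then be passed to PyGMT.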

2 changes: 1 addition & 1 deletion pygmt/clib/session.py
@@ -745,7 +745,7 @@ def _check_dtype_and_dim(self, array, ndim):
if array.dtype.type not in DTYPES:
try:
# Try to convert any unknown numpy data types to np.datetime64
- array = np.asarray(array, dtype=np.datetime64)
+ array = array_to_datetime(array)
except ValueError as e:
raise GMTInvalidInput(
f"Unsupported numpy data type '{array.dtype.type}'."
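
The pattern in this hunk (try the datetime conversion, re-raise a `ValueError` as a library-specific error) can be sketched in isolation as follows. This is a simplified stand-in, not the PyGMT implementation: `InvalidInputError`, `KNOWN_TYPES` and `check_dtype` are hypothetical names, and the real `DTYPES` lookup in `session.py` contains different entries.

```python
import datetime

import numpy as np


class InvalidInputError(Exception):
    """Hypothetical stand-in for pygmt.exceptions.GMTInvalidInput."""


# Hypothetical, abbreviated stand-in for the DTYPES lookup in session.py.
KNOWN_TYPES = {np.int64, np.float64, np.datetime64}


def array_to_datetime(array):
    """Same one-line conversion as the merged version in conversion.py."""
    return np.asarray(array, dtype=np.datetime64)


def check_dtype(array):
    """Return the array as-is if its dtype is known, else try datetime64."""
    array = np.asarray(array)
    if array.dtype.type not in KNOWN_TYPES:
        try:
            array = array_to_datetime(array)
        except ValueError as err:
            raise InvalidInputError(
                f"Unsupported numpy data type '{array.dtype.type}'."
            ) from err
    return array


# Python datetime objects arrive as an object array and get converted.
dates = check_dtype([datetime.datetime(2018, 1, 1), datetime.datetime(2019, 1, 1)])
print(dates.dtype.kind)
```

Routing the conversion through `array_to_datetime` keeps a single place where the datetime parsing rules are defined, which appears to be the intent of the one-line change above.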