Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-70647: Deprecate strptime day of month parsing without a year present to avoid leap-year bugs #117107

Merged
merged 9 commits into from
Apr 3, 2024
33 changes: 33 additions & 0 deletions Doc/library/datetime.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1079,6 +1079,24 @@ Other constructors, all class methods:
time tuple. See also :ref:`strftime-strptime-behavior` and
:meth:`datetime.fromisoformat`.

.. versionchanged:: 3.13

If *format* specifies a day of month without a year a
:exc:`DeprecationWarning` is now emitted. This is to avoid a quadrennial
leap year bug in code seeking to parse only a month and day as the
default year used in absence of one in the format is not a leap year.
Such *format* values may raise an error as of Python 3.15. The
workaround is to always include a year in your *format*. If parsing
*date_string* values that do not have a year, explicitly add a year that
is a leap year before parsing:

.. doctest::

>>> from datetime import datetime
>>> date_string = "02/29"
>>> when = datetime.strptime(f"{date_string};1984", "%m/%d;%Y") # Avoids leap year bug.
>>> when.strftime("%B %d") # doctest: +SKIP
pganssle marked this conversation as resolved.
Show resolved Hide resolved
'February 29'


Class attributes:
Expand Down Expand Up @@ -2657,6 +2675,21 @@ Notes:
for formats ``%d``, ``%m``, ``%H``, ``%I``, ``%M``, ``%S``, ``%j``, ``%U``,
``%W``, and ``%V``. Format ``%y`` does require a leading zero.

(10)
When parsing a month and day using :meth:`~.datetime.strptime`, always
include a year in the format. If the value you need to parse lacks a year,
append an explicit dummy leap year. Otherwise your code will raise an
exception when it encounters leap day because the default year used by the
parser is not a leap year. Users run into this bug every four years...

>>> strptime(f"{month_day};1984", "%m/%d;%Y") # No leap year bug.
encukou marked this conversation as resolved.
Show resolved Hide resolved

.. deprecated-removed:: 3.13 3.15
:meth:`~.datetime.strptime` calls using a format string containing
a day of month without a year now emit a
:exc:`DeprecationWarning`. In 3.15 or later we may change this into
an error or change the default year to a leap year. See :gh:`70647`.

.. rubric:: Footnotes

.. [#] If, that is, we ignore the effects of Relativity
Expand Down
20 changes: 19 additions & 1 deletion Lib/_strptime.py
Original file line number Diff line number Diff line change
Expand Up @@ -250,12 +250,30 @@ def pattern(self, format):
format = regex_chars.sub(r"\\\1", format)
whitespace_replacement = re_compile(r'\s+')
format = whitespace_replacement.sub(r'\\s+', format)
year_in_format = False
day_of_month_in_format = False
while '%' in format:
directive_index = format.index('%')+1
format_char = format[directive_index]
processed_format = "%s%s%s" % (processed_format,
format[:directive_index-1],
self[format[directive_index]])
self[format_char])
format = format[directive_index+1:]
match format_char:
case 'Y' | 'y' | 'G':
year_in_format = True
case 'd':
day_of_month_in_format = True
if day_of_month_in_format and not year_in_format:
Copy link
Member Author

@gpshead gpshead Mar 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking this warns when a %d day-of-month is being parsed when there is no indicator of what the Month is (in which case the default of January presumably allows the day to go up to 31) but i presume such formats are unusual and thus rare enough to elide extra logic for and warn on those anyways. Being more pedantic would require one more bool in the conditional and the number of strptime format characters representing some form of month is large/ugly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would be mildly in favor of deprecating all these ambiguous kinds of formats anyway, though I do think "add a fixed year to the string to be parsed" is an uglier hack than adding some mechanism for explicitly specifying defaults (e.g. accepting a default_datetime: datetime | None parameter, or default_kwargs={...}).

Which I suppose is my way of saying that I generally agree with this — I suspect that any format that has a year and a day-of-month without an actual month is way more likely to be due to an error than not. Preferably we'd give people a more accurate warning in that case, but for now any kind of warning to call attention to the situation should help.

import warnings
warnings.warn("""\
Parsing dates involving a day of month without a year specified is ambiguious
and fails to parse leap day. The default behavior will change in Python 3.15
to either always raise an exception or to use a different default year (TBD).
To avoid trouble, add a specific year to the input & format.
See https://github.com/python/cpython/issues/70647.""",
DeprecationWarning,
stacklevel=5)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 5 looks fragile, but I don't see a good way around it, short of teaching warnings new tricks.
Hopefully this code won't get refactored in the next few years.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wow, nice job with the time machine :)

Suggested change
stacklevel=5)
skip_file_prefixes=(os.path.dirname(__file__),))

Looks like this would be the first use of skip_file_prefixes in the stdlib.

return "%s%s" % (processed_format, format)

def compile(self, format):
Expand Down
13 changes: 13 additions & 0 deletions Lib/test/datetimetester.py
Original file line number Diff line number Diff line change
Expand Up @@ -2793,6 +2793,19 @@ def test_strptime_single_digit(self):
newdate = strptime(string, format)
self.assertEqual(newdate, target, msg=reason)

def test_strptime_leap_year(self):
# GH-70647: warns if parsing a format with a day and no year.
with self.assertRaises(ValueError):
# The existing behavior that GH-70647 seeks to change.
self.theclass.strptime('02-29', '%m-%d')
with self.assertWarnsRegex(DeprecationWarning,
r'.*day of month without a year.*'):
self.theclass.strptime('03-14.159265', '%m-%d.%f')
with self._assertNotWarns(DeprecationWarning):
self.theclass.strptime('20-03-14.159265', '%y-%m-%d.%f')
with self._assertNotWarns(DeprecationWarning):
self.theclass.strptime('02-29,2024', '%m-%d,%Y')

def test_more_timetuple(self):
# This tests fields beyond those tested by the TestDate.test_timetuple.
t = self.theclass(2004, 12, 31, 6, 22, 33)
Expand Down
8 changes: 8 additions & 0 deletions Lib/test/test_time.py
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,8 @@ def test_strptime(self):
'j', 'm', 'M', 'p', 'S',
'U', 'w', 'W', 'x', 'X', 'y', 'Y', 'Z', '%'):
format = '%' + directive
if directive == 'd':
format += ',%Y' # Avoid GH-70647.
strf_output = time.strftime(format, tt)
try:
time.strptime(strf_output, format)
Expand All @@ -299,6 +301,12 @@ def test_strptime_exception_context(self):
time.strptime('19', '%Y %')
self.assertIs(e.exception.__suppress_context__, True)

def test_strptime_leap_year(self):
# GH-70647: warns if parsing a format with a day and no year.
with self.assertWarnsRegex(DeprecationWarning,
r'.*day of month without a year.*'):
time.strptime('02-07 18:28', '%m-%d %H:%M')

def test_asctime(self):
time.asctime(time.gmtime(self.t))

Expand Down
10 changes: 10 additions & 0 deletions Lib/test/test_unittest/test_assertions.py
Original file line number Diff line number Diff line change
Expand Up @@ -386,6 +386,16 @@ def testAssertWarns(self):
'^UserWarning not triggered$',
'^UserWarning not triggered : oops$'])

def test_assertNotWarns(self):
def warn_future():
warnings.warn('xyz', FutureWarning, stacklevel=2)
self.assertMessagesCM('_assertNotWarns', (FutureWarning,),
warn_future,
['^FutureWarning triggered$',
'^oops$',
'^FutureWarning triggered$',
'^FutureWarning triggered : oops$'])

def testAssertWarnsRegex(self):
# test error not raised
self.assertMessagesCM('assertWarnsRegex', (UserWarning, 'unused regex'),
Expand Down
22 changes: 22 additions & 0 deletions Lib/unittest/case.py
Original file line number Diff line number Diff line change
Expand Up @@ -332,6 +332,23 @@ def __exit__(self, exc_type, exc_value, tb):
self._raiseFailure("{} not triggered".format(exc_name))


class _AssertNotWarnsContext(_AssertWarnsContext):

def __exit__(self, exc_type, exc_value, tb):
self.warnings_manager.__exit__(exc_type, exc_value, tb)
if exc_type is not None:
# let unexpected exceptions pass through
return
try:
exc_name = self.expected.__name__
except AttributeError:
exc_name = str(self.expected)
for m in self.warnings:
w = m.message
if isinstance(w, self.expected):
self._raiseFailure(f"{exc_name} triggered")


class _OrderedChainMap(collections.ChainMap):
def __iter__(self):
seen = set()
Expand Down Expand Up @@ -811,6 +828,11 @@ def assertWarns(self, expected_warning, *args, **kwargs):
context = _AssertWarnsContext(expected_warning, self)
return context.handle('assertWarns', args, kwargs)

def _assertNotWarns(self, expected_warning, *args, **kwargs):
"""The opposite of assertWarns. Private due to low demand."""
context = _AssertNotWarnsContext(expected_warning, self)
return context.handle('_assertNotWarns', args, kwargs)

def assertLogs(self, logger=None, level=None):
"""Fail unless a log message of level *level* or higher is emitted
on *logger_name* or its children. If omitted, *level* defaults to
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
Start the deprecation period for the current behavior of
:func:`datetime.datetime.strptime` and :func:`time.strptime` which always
fails to parse a date string with a :exc:`ValueError` involving a day of
month such as ``strptime("02-29", "%m-%d")`` when a year is **not**
specified and the date happen to be February 29th. This should help avoid
users finding new bugs every four years due to a natural mistaken assumption
about the API when parsing partial date values.
Loading