gh-95778: CVE-2020-10735: Prevent DoS by very large int() #96499

Merged
merged 45 commits into from
Sep 2, 2022
45 commits
19b28fc
CVE-2020-10735: Prevent DoS by very large int()
tiran May 5, 2020
0a96b20
Default to disable, improve tests and docs
tiran Jan 19, 2022
88f6d5d
fix typo
tiran Jan 19, 2022
70c195e
More docs (WIP)
tiran Jan 19, 2022
e17e93b
Basic documentation for sys functions
tiran Jan 19, 2022
fbd14b7
Use ValueError, ignore underscore, scale limit
tiran Jan 20, 2022
dd74d70
Fix CI
tiran Jan 20, 2022
0e01461
Address Greg's review
tiran Aug 1, 2022
0b21e5f
Fix sys.flags len and docs
tiran Aug 1, 2022
3b38abe
Keep the warning, but remove advice about limiting input length in th…
gpshead Aug 2, 2022
37193ed
Renamed the APIs & too many other refactorings.
gpshead Aug 5, 2022
c90b79f
Improve the configuring docs.
gpshead Aug 7, 2022
fea25ea
Stop tying to base10, just use string digits.
gpshead Aug 7, 2022
ac9f22f
Remove the added now-unneeded helper log tbl fn.
gpshead Aug 7, 2022
da72dd1
prevent intdostimeit from emitting errors in test_tools.
gpshead Aug 7, 2022
d7e4d7b
Remove a leftover base 10 reference. clarify.
gpshead Aug 7, 2022
5c7e6d5
versionadded/changed to 3.12
gpshead Aug 7, 2022
61a5bc9
Link to the CVE from the main doc.
gpshead Aug 7, 2022
c15adde
Add a What's New entry.
gpshead Aug 7, 2022
76ae1c2
Add a Misc/NEWS.d entry.
gpshead Aug 7, 2022
1ad88f5
Undo addition to PyConfig to ease backporting.
gpshead Aug 8, 2022
0c83111
Remove the Tools/scripts/ example and timing code.
gpshead Aug 8, 2022
5d39ab6
un-add the <math.h> include (not needed for PR anymore)
gpshead Aug 8, 2022
5b77b3e
Remove added unused imports.
gpshead Aug 8, 2022
de00cdc
Tabs -> Spaces
gpshead Aug 8, 2022
3cc8553
make html and make doctest in Doc pass.
gpshead Aug 8, 2022
da97e65
Raise the default limit and the threshold.
gpshead Aug 10, 2022
ef03a16
Remove xmlrpc.client changes, test-only.
gpshead Aug 12, 2022
e916845
Rearrange the new stdtypes docs, w/limits + caution.
gpshead Aug 13, 2022
101502e
Make a huge int a SyntaxError with lineno when parsing.
gpshead Aug 16, 2022
fa8a58a
Mention the chosen default in the NEWS entry.
gpshead Aug 16, 2022
313ab6d
Properly clear & free the prior exception.
gpshead Aug 16, 2022
614cd02
Add a note to the float.as_integer_ratio() docs.
gpshead Aug 17, 2022
16ad090
Clarify the documentation wording and error msg.
gpshead Aug 17, 2022
4eb72e6
Fix test_idle, it used a long int on a line.
gpshead Aug 17, 2022
da36550
Rename the test.support context manager and document it.
gpshead Aug 19, 2022
f4372cc
Documentation cleanup.
gpshead Aug 19, 2022
c421853
Update attribution in Misc/NEWS.d
gpshead Aug 25, 2022
9f2168a
Regen global strings
tiran Sep 1, 2022
3c8504b
Make the doctest actually run & fix it.
gpshead Sep 1, 2022
1586419
Fix the docs build.
gpshead Sep 2, 2022
94bd3ee
Rename the news file to appease the Bedevere bot.
gpshead Sep 2, 2022
0b91f65
Regen argument clinic after the rebase merge.
gpshead Sep 2, 2022
02776f9
Hexi hexa
tiran Sep 2, 2022
173fa4e
Hexi hexa 2
tiran Sep 2, 2022
7 changes: 7 additions & 0 deletions Doc/library/functions.rst
@@ -910,6 +910,13 @@ are always available. They are listed here in alphabetical order.
.. versionchanged:: 3.11
The delegation to :meth:`__trunc__` is deprecated.

.. versionchanged:: 3.12
:class:`int` string inputs and string representations can be limited to
help avoid denial of service attacks. A :exc:`ValueError` is raised when
the limit is exceeded while converting a string *x* to an :class:`int` or
when converting an :class:`int` into a string would exceed the limit.
See the :ref:`integer string conversion length limitation
<int_max_str_digits>` documentation.
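
As a minimal sketch of what this means for callers: code that already catches :exc:`ValueError` for malformed input will also catch the new over-length error. The helper name ``parse_untrusted_int`` is hypothetical and not part of this change.

```python
import sys


def parse_untrusted_int(text, default=None):
    """Convert text to int, returning default when int() raises ValueError.

    Hypothetical helper. int() raises ValueError for malformed input and,
    on interpreters that enforce the limit, for over-long digit strings.
    """
    try:
        return int(text)
    except ValueError:
        return default


print(parse_untrusted_int("12345"))
print(parse_untrusted_int("not a number"))
# Rejected only where a limit below 100,000 digits is in effect.
print(parse_untrusted_int("9" * 100_000) is None)
```

The two failure modes are not distinguished here; inspect the exception message if your application needs to tell them apart.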

.. function:: isinstance(object, classinfo)

11 changes: 11 additions & 0 deletions Doc/library/json.rst
@@ -23,6 +23,11 @@ is a lightweight data interchange format inspired by
`JavaScript <https://en.wikipedia.org/wiki/JavaScript>`_ object literal syntax
(although it is not a strict subset of JavaScript [#rfc-errata]_ ).

.. warning::
Be cautious when parsing JSON data from untrusted sources. A malicious
JSON string may cause the decoder to consume considerable CPU and memory
resources. Limiting the size of data to be parsed is recommended.
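
One way to limit the size of the data is sketched below. ``loads_bounded`` and ``MAX_JSON_CHARS`` are hypothetical names, and the cap of one million characters is an arbitrary assumption that should be tuned per application.

```python
import json

MAX_JSON_CHARS = 1_000_000  # hypothetical cap, not a value from the json module


def loads_bounded(payload, max_chars=MAX_JSON_CHARS):
    """Reject oversized payloads before handing them to the decoder."""
    if len(payload) > max_chars:
        raise ValueError(
            f"JSON payload too large: {len(payload)} > {max_chars} characters"
        )
    return json.loads(payload)


print(loads_bounded('{"answer": 42}'))
```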

:mod:`json` exposes an API familiar to users of the standard library
:mod:`marshal` and :mod:`pickle` modules.

@@ -253,6 +258,12 @@ Basic Usage
be used to use another datatype or parser for JSON integers
(e.g. :class:`float`).

.. versionchanged:: 3.12
The default *parse_int* of :func:`int` now limits the maximum length of
the integer string via the interpreter's :ref:`integer string
conversion length limitation <int_max_str_digits>` to help avoid denial
of service attacks.
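
The following sketch shows one possible way to accept very long JSON integers without changing the interpreter-wide limit: pass a different *parse_int*. Using :class:`decimal.Decimal` here is an illustrative choice, not a recommendation from this change.

```python
import json
from decimal import Decimal

doc = '{"small": 7, "big": ' + "9" * 5000 + "}"

# With the default parse_int (int), the 5000-digit value is rejected on
# interpreters that enforce a limit below 5000 digits.
try:
    json.loads(doc)
except ValueError as exc:
    print("default parser rejected the document:", exc)

# str -> Decimal conversion is not subject to the limit.
data = json.loads(doc, parse_int=Decimal)
big_digits = len(data["big"].as_tuple().digits)
print(type(data["big"]).__name__, big_digits)
```

Note that every integer in the document is then returned as a ``Decimal``, including small ones.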

*parse_constant*, if specified, will be called with one of the following
strings: ``'-Infinity'``, ``'Infinity'``, ``'NaN'``.
This can be used to raise an exception if invalid JSON numbers
166 changes: 166 additions & 0 deletions Doc/library/stdtypes.rst
@@ -622,6 +622,13 @@ class`. float also has the following additional methods.
:exc:`OverflowError` on infinities and a :exc:`ValueError` on
NaNs.

.. note::

The values returned by ``as_integer_ratio()`` can be huge. Attempts
to render such integers into decimal strings may bump into the
:ref:`integer string conversion length limitation
<int_max_str_digits>`.

.. method:: float.is_integer()

Return ``True`` if the float instance is finite with integral
@@ -5460,6 +5467,165 @@ types, where they are relevant. Some of these are not reported by the
[<class 'bool'>]


.. _int_max_str_digits:

Integer string conversion length limitation
===========================================

CPython has a global limit for converting between :class:`int` and :class:`str`
to mitigate denial of service attacks. This limit *only* applies to decimal or
other non-power-of-two number bases. Hexadecimal, octal, and binary conversions
are unlimited. The limit can be configured.

The :class:`int` type in CPython is an arbitrary length number stored in binary
form (commonly known as a "bignum"). There exists no algorithm that can convert
a string to a binary integer or a binary integer to a string in linear time,
*unless* the base is a power of 2. Even the best known algorithms for base 10
have sub-quadratic complexity. Converting a large value such as ``int('1' *
500_000)`` can take over a second on a fast CPU.

Limiting conversion size offers a practical way to avoid `CVE-2020-10735
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.

The limit is applied to the number of digit characters in the input or output
string when a non-linear conversion algorithm would be involved. Underscores
and the sign are not counted towards the limit.
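
The counting rule can be illustrated with a small sketch. ``accepts_under_limit`` is a hypothetical helper that temporarily changes the limit; it assumes an interpreter that provides the :mod:`sys` APIs described in this section.

```python
import sys


def accepts_under_limit(text, limit):
    """Return True if int(text) succeeds while the limit is temporarily `limit`."""
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(limit)
    try:
        int(text)
        return True
    except ValueError:
        return False
    finally:
        sys.set_int_max_str_digits(previous)


digits = "1" * 640
# The same 640 digits, plus a sign and an underscore between groups of four.
grouped = "-" + "_".join(digits[i:i + 4] for i in range(0, len(digits), 4))

if hasattr(sys, "set_int_max_str_digits"):  # absent on unpatched interpreters
    print(len(grouped), accepts_under_limit(grouped, 640))
    print(len(digits) + 1, accepts_under_limit(digits + "1", 640))
```

The grouped string is 800 characters long but still holds only 640 digits, so it is accepted, whereas 641 bare digits are not.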

When an operation would exceed the limit, a :exc:`ValueError` is raised:

.. doctest::

>>> import sys
>>> sys.set_int_max_str_digits(4300) # Illustrative, this is the default.
>>> _ = int('2' * 5432)
Traceback (most recent call last):
...
ValueError: Exceeds the limit (4300) for integer string conversion: value has 5432 digits.
>>> i = int('2' * 4300)
>>> len(str(i))
4300
>>> i_squared = i*i
>>> len(str(i_squared))
Traceback (most recent call last):
...
ValueError: Exceeds the limit (4300) for integer string conversion: value has 8599 digits.
>>> len(hex(i_squared))
7144
>>> assert int(hex(i_squared), base=16) == i*i # Hexadecimal is unlimited.

The default limit is 4300 digits as provided in
:data:`sys.int_info.default_max_str_digits <sys.int_info>`.
The lowest limit that can be configured is 640 digits as provided in
:data:`sys.int_info.str_digits_check_threshold <sys.int_info>`.

Verification:

.. doctest::

>>> import sys
>>> assert sys.int_info.default_max_str_digits == 4300, sys.int_info
>>> assert sys.int_info.str_digits_check_threshold == 640, sys.int_info
>>> msg = int('578966293710682886880994035146873798396722250538762761564'
... '9252925514383915483333812743580549779436104706260696366600'
... '571186405732').to_bytes(53, 'big')
...

.. versionadded:: 3.12

Affected APIs
-------------

The limitation only applies to potentially slow conversions between :class:`int`
and :class:`str` or :class:`bytes`:

* ``int(string)`` with default base 10.
* ``int(string, base)`` for all bases that are not a power of 2.
* ``str(integer)``.
* ``repr(integer)``.
* any other string conversion to base 10, for example ``f"{integer}"``,
``"{}".format(integer)``, or ``b"%d" % integer``.

The limitations do not apply to functions with a linear algorithm:

* ``int(string, base)`` with base 2, 4, 8, 16, or 32.
* :func:`int.from_bytes` and :func:`int.to_bytes`.
* :func:`hex`, :func:`oct`, :func:`bin`.
* :ref:`formatspec` for hex, octal, and binary numbers.
* :class:`str` to :class:`float`.
* :class:`str` to :class:`decimal.Decimal`.
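
A short sketch contrasting the two lists above. The value ``1 << 100_000`` is an arbitrary example with roughly 30,000 decimal digits; whether the final ``str()`` call fails depends on the limit in effect.

```python
n = 1 << 100_000  # built arithmetically, so no string conversion happens here

# Power-of-two bases and bytes round trips are linear time and unlimited.
as_hex = hex(n)
raw = n.to_bytes((n.bit_length() + 7) // 8, "big")
hex_round_trip = int(as_hex, 16) == n
bytes_round_trip = int.from_bytes(raw, "big") == n
format_matches = f"{n:x}" == as_hex[2:]
print(hex_round_trip, bytes_round_trip, format_matches)

# Decimal rendering is the limited path.
try:
    text = str(n)
    print("str() produced", len(text), "digits (limit disabled or not present)")
except ValueError as exc:
    print("str() refused:", exc)
```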

Configuring the limit
---------------------

Before Python starts up you can use an environment variable or an interpreter
command line flag to configure the limit:

* :envvar:`PYTHONINTMAXSTRDIGITS`, e.g.
``PYTHONINTMAXSTRDIGITS=640 python3`` to set the limit to 640 or
``PYTHONINTMAXSTRDIGITS=0 python3`` to disable the limitation.
* :option:`-X int_max_str_digits <-X>`, e.g.
``python3 -X int_max_str_digits=640``
* :data:`sys.flags.int_max_str_digits` contains the value of
:envvar:`PYTHONINTMAXSTRDIGITS` or :option:`-X int_max_str_digits <-X>`.
If both the env var and the ``-X`` option are set, the ``-X`` option takes
precedence. A value of *-1* indicates that both were unset, thus a value of
:data:`sys.int_info.default_max_str_digits` was used during initialization.
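
The precedence rules above can be checked with a sketch that starts child interpreters. ``child_limit`` is a hypothetical helper, and the values 5000 and 6000 are arbitrary. The code assumes the running interpreter supports the limit and that spawning subprocesses is permitted.

```python
import os
import subprocess
import sys


def child_limit(extra_args=(), extra_env=None):
    """Start a child interpreter and return (sys.flags value, effective limit)."""
    env = dict(os.environ)
    env.pop("PYTHONINTMAXSTRDIGITS", None)  # start from a clean slate
    env.update(extra_env or {})
    code = (
        "import sys; "
        "print(sys.flags.int_max_str_digits, sys.get_int_max_str_digits())"
    )
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    flag, limit = result.stdout.split()
    return int(flag), int(limit)


if hasattr(sys, "get_int_max_str_digits"):
    env_only = {"PYTHONINTMAXSTRDIGITS": "5000"}
    print("neither set:", child_limit())
    print("env var only:", child_limit(extra_env=env_only))
    print("both set:", child_limit(["-X", "int_max_str_digits=6000"], env_only))
```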

From code, you can inspect the current limit and set a new one using these
:mod:`sys` APIs:

* :func:`sys.get_int_max_str_digits` and :func:`sys.set_int_max_str_digits` are
a getter and setter for the interpreter-wide limit. Subinterpreters have
their own limit.

Information about the default and minimum can be found in :attr:`sys.int_info`:

* :data:`sys.int_info.default_max_str_digits <sys.int_info>` is the compiled-in
default limit.
* :data:`sys.int_info.str_digits_check_threshold <sys.int_info>` is the lowest
accepted value for the limit (other than 0 which disables it).
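
A sketch of inspecting these values and probing which limits the setter accepts. ``try_set_limit`` is a hypothetical helper that restores the previous limit after each attempt.

```python
import sys


def try_set_limit(n):
    """Return True if n is accepted as a limit; the old limit is restored."""
    previous = sys.get_int_max_str_digits()
    try:
        sys.set_int_max_str_digits(n)
        return True
    except ValueError:
        return False
    finally:
        sys.set_int_max_str_digits(previous)


if hasattr(sys, "set_int_max_str_digits"):  # absent on unpatched interpreters
    default = sys.int_info.default_max_str_digits
    threshold = sys.int_info.str_digits_check_threshold
    print("default:", default, "lowest non-zero limit:", threshold)
    for candidate in (0, threshold - 1, threshold, default):
        verdict = "accepted" if try_set_limit(candidate) else "rejected"
        print(candidate, verdict)
```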

.. versionadded:: 3.12

.. caution::

Setting a low limit *can* lead to problems. While rare, code exists that
contains decimal integer constants in its source that exceed the
minimum threshold. A consequence of setting the limit is that Python source
code containing decimal integer literals longer than the limit will
encounter an error during parsing, usually at startup time or import time or
even at installation time: any time an up-to-date ``.pyc`` does not already
exist for the code. A workaround for source that contains such large
constants is to convert them to ``0x`` hexadecimal form as it has no limit.

Test your application thoroughly if you use a low limit. Ensure your tests
run with the limit set early via the environment or flag so that it applies
during startup and even during any installation step that may invoke Python
to precompile ``.py`` sources to ``.pyc`` files.
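
The hexadecimal workaround could be applied with a one-off script along these lines. ``decimal_literal_to_hex`` is a hypothetical helper. It disables the limit while converting, so it should only be run on trusted source text.

```python
import sys


def decimal_literal_to_hex(literal):
    """Return a 0x literal equal in value to the given decimal literal."""
    has_limit = hasattr(sys, "set_int_max_str_digits")
    previous = sys.get_int_max_str_digits() if has_limit else None
    if has_limit:
        sys.set_int_max_str_digits(0)  # trusted input only: limit disabled
    try:
        return hex(int(literal))
    finally:
        if has_limit:
            sys.set_int_max_str_digits(previous)


big_decimal = "7" * 5000  # stands in for an over-long constant in source code
hex_literal = decimal_literal_to_hex(big_decimal)
print(hex_literal[:18] + "...", len(hex_literal), "characters")
```

The resulting ``0x`` literal can be pasted into the source file in place of the decimal constant.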

Recommended configuration
-------------------------

The default :data:`sys.int_info.default_max_str_digits` is expected to be
reasonable for most applications. If your application requires a different
limit, set it from your main entry point using code that works across Python
versions, because these APIs were also added in security patch releases of
versions before 3.12.

Example::

   >>> import sys
   >>> if hasattr(sys, "set_int_max_str_digits"):
   ...     upper_bound = 68000
   ...     lower_bound = 4004
   ...     current_limit = sys.get_int_max_str_digits()
   ...     if current_limit == 0 or current_limit > upper_bound:
   ...         sys.set_int_max_str_digits(upper_bound)
   ...     elif current_limit < lower_bound:
   ...         sys.set_int_max_str_digits(lower_bound)

If you need to disable it entirely, set it to ``0``.


.. rubric:: Footnotes

.. [1] Additional information on these special methods may be found in the Python
57 changes: 44 additions & 13 deletions Doc/library/sys.rst
@@ -502,9 +502,9 @@ always available.
The :term:`named tuple` *flags* exposes the status of command line
flags. The attributes are read only.

============================= ==============================================================================================================
attribute                     flag
============================= ==============================================================================================================
:const:`debug`                :option:`-d`
:const:`inspect`              :option:`-i`
:const:`interactive`          :option:`-i`
@@ -521,7 +521,8 @@ always available.
:const:`dev_mode`             :option:`-X dev <-X>` (:ref:`Python Development Mode <devmode>`)
:const:`utf8_mode`            :option:`-X utf8 <-X>`
:const:`safe_path`            :option:`-P`
:const:`int_max_str_digits`   :option:`-X int_max_str_digits <-X>` (:ref:`integer string conversion length limitation <int_max_str_digits>`)
============================= ==============================================================================================================

.. versionchanged:: 3.2
Added ``quiet`` attribute for the new :option:`-q` flag.
@@ -543,6 +544,9 @@ always available.
.. versionchanged:: 3.11
Added the ``safe_path`` attribute for :option:`-P` option.

.. versionchanged:: 3.12
Added the ``int_max_str_digits`` attribute.


.. data:: float_info

@@ -723,6 +727,13 @@ always available.

.. versionadded:: 3.6

.. function:: get_int_max_str_digits()

Returns the current value for the :ref:`integer string conversion length
limitation <int_max_str_digits>`. See also :func:`set_int_max_str_digits`.

.. versionadded:: 3.12

.. function:: getrefcount(object)

Return the reference count of the *object*. The count returned is generally one
@@ -996,19 +1007,31 @@ always available.

.. tabularcolumns:: |l|L|

+----------------------------------------+-----------------------------------------------+
| Attribute                              | Explanation                                   |
+========================================+===============================================+
| :const:`bits_per_digit`                | number of bits held in each digit. Python     |
|                                        | integers are stored internally in base        |
|                                        | ``2**int_info.bits_per_digit``                |
+----------------------------------------+-----------------------------------------------+
| :const:`sizeof_digit`                  | size in bytes of the C type used to           |
|                                        | represent a digit                             |
+----------------------------------------+-----------------------------------------------+
| :const:`default_max_str_digits`        | default value for                             |
|                                        | :func:`sys.get_int_max_str_digits` when it    |
|                                        | is not otherwise explicitly configured.       |
+----------------------------------------+-----------------------------------------------+
| :const:`str_digits_check_threshold`    | minimum non-zero value for                    |
|                                        | :func:`sys.set_int_max_str_digits`,           |
|                                        | :envvar:`PYTHONINTMAXSTRDIGITS`, or           |
|                                        | :option:`-X int_max_str_digits <-X>`.         |
+----------------------------------------+-----------------------------------------------+

.. versionadded:: 3.1

.. versionchanged:: 3.12
Added ``default_max_str_digits`` and ``str_digits_check_threshold``.


.. data:: __interactivehook__

@@ -1308,6 +1331,14 @@ always available.

.. availability:: Unix.

.. function:: set_int_max_str_digits(n)

Set the :ref:`integer string conversion length limitation
<int_max_str_digits>` used by this interpreter. See also
:func:`get_int_max_str_digits`.

.. versionadded:: 3.12

.. function:: setprofile(profilefunc)

.. index::
10 changes: 10 additions & 0 deletions Doc/library/test.rst
@@ -1011,6 +1011,16 @@ The :mod:`test.support` module defines the following functions:
.. versionadded:: 3.10


.. function:: adjust_int_max_str_digits(max_digits)

This function returns a context manager that will change the global
:func:`sys.set_int_max_str_digits` setting for the duration of the
context to allow execution of test code that needs a different limit
on the number of digits when converting between an integer and string.

.. versionadded:: 3.12
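
For code outside the test suite, a context manager with similar behaviour might be sketched as follows. This is an assumption about the general shape, not the source of the :mod:`test.support` helper, and ``temporary_int_max_str_digits`` is a hypothetical name.

```python
import contextlib
import sys


@contextlib.contextmanager
def temporary_int_max_str_digits(max_digits):
    """Temporarily change the limit, restoring the old value on exit."""
    previous = sys.get_int_max_str_digits()
    sys.set_int_max_str_digits(max_digits)
    try:
        yield
    finally:
        sys.set_int_max_str_digits(previous)


if hasattr(sys, "set_int_max_str_digits"):  # absent on unpatched interpreters
    with temporary_int_max_str_digits(0):  # 0 disables the limit
        huge = int("3" * 10_000)
        print(len(str(huge)))
    print(sys.get_int_max_str_digits())
```

Because the limit is interpreter-wide, other threads observe the changed value while the block is active.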


The :mod:`test.support` module defines the following classes:


13 changes: 13 additions & 0 deletions Doc/using/cmdline.rst
@@ -505,6 +505,9 @@ Miscellaneous options
stored in a traceback of a trace. Use ``-X tracemalloc=NFRAME`` to start
tracing with a traceback limit of *NFRAME* frames. See the
:func:`tracemalloc.start` for more information.
* ``-X int_max_str_digits`` configures the :ref:`integer string conversion
length limitation <int_max_str_digits>`. See also
:envvar:`PYTHONINTMAXSTRDIGITS`.
* ``-X importtime`` to show how long each import takes. It shows module
name, cumulative time (including nested imports) and self time (excluding
nested imports). Note that its output may be broken in multi-threaded
@@ -582,6 +585,9 @@ Miscellaneous options
.. versionadded:: 3.11
The ``-X frozen_modules`` option.

.. versionadded:: 3.12
The ``-X int_max_str_digits`` option.

.. versionadded:: 3.12
The ``-X perf`` option.

@@ -763,6 +769,13 @@ conflict.

.. versionadded:: 3.2.3

.. envvar:: PYTHONINTMAXSTRDIGITS

If this variable is set to an integer, it is used to configure the
interpreter's global :ref:`integer string conversion length limitation
<int_max_str_digits>`.

.. versionadded:: 3.12

.. envvar:: PYTHONIOENCODING

11 changes: 11 additions & 0 deletions Doc/whatsnew/3.12.rst
@@ -83,6 +83,17 @@ Other Language Changes
mapping is hashable.
(Contributed by Serhiy Storchaka in :gh:`87995`.)

* Converting between :class:`int` and :class:`str` in bases other than 2
(binary), 4, 8 (octal), 16 (hexadecimal), or 32 such as base 10 (decimal)
now raises a :exc:`ValueError` if the number of digits in string form is
above a limit to avoid potential denial of service attacks due to the
algorithmic complexity. This is a mitigation for `CVE-2020-10735
<https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735>`_.
This limit can be configured or disabled by environment variable, command
line flag, or :mod:`sys` APIs. See the :ref:`integer string conversion
length limitation <int_max_str_digits>` documentation. The default limit
is 4300 digits in string form.


New Modules
===========
1 change: 1 addition & 0 deletions Include/internal/pycore_global_strings.h
@@ -451,6 +451,7 @@ struct _Py_global_strings {
STRUCT_FOR_ID(mapping)
STRUCT_FOR_ID(match)
STRUCT_FOR_ID(max_length)
STRUCT_FOR_ID(maxdigits)
STRUCT_FOR_ID(maxevents)
STRUCT_FOR_ID(maxmem)
STRUCT_FOR_ID(maxsplit)