From 0660e207fcebe97567d659d30ce1b425d54b5179 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 10:44:34 -0600 Subject: [PATCH 01/11] Move a couple more things down to "deferred". --- pep-0554.rst | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 8f2925e8916..2672a163ae3 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -1079,9 +1079,6 @@ haven't called ``release()``. Open Questions ============== -* add a "tp_share" type slot instead of using a global registry - for shareable types? - * impact of data sharing on cache performance in multi-core scenarios? (see [cache-line-ping-pong]_) @@ -1119,8 +1116,6 @@ interpreters would actually share the underlying mutex. This would provide much better efficiency than blocking channel ops. The main concern is that locks and channels don't mix well (as learned in Go). -* also track which interpreters are using a channel end? - * auto-run in a thread? The PEP proposes a hard separation between subinterpreters and threads: @@ -1476,6 +1471,18 @@ code (i.e. a script). This is equivalent to ``PyRun_StringFlags()``, ``exec()``, or a module body. None of those "return" anything. We can revisit this once ``run()`` supports functions, etc. +Add a "tp_share" type slot +-------------------------- + +This would replace the current global registry for shareable types. + +Expose which interpreters have actually *used* a channel end. +------------------------------------------------------------- + +Currently we associate interpreters upon access to a channel. We would +keep a separate association list for "upon use" and expose that. + + Rejected Ideas ============== From 36c1f573b9ea8e70c5f6e5bad4a12343748d6b38 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 11:38:59 -0600 Subject: [PATCH 02/11] Clarify about documentation and help for extension maintainers. 
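Illustration only, not part of the change itself: a minimal Python sketch of the import rule this patch documents (``ImportError`` when a module without PEP 489 support is imported in a subinterpreter). The helper ``check_extension_import`` and its ``supports_pep489`` / ``in_subinterpreter`` flags are hypothetical stand-ins, and the message wording is a placeholder; the PEP specifies neither.

```python
def check_extension_import(name, *, supports_pep489, in_subinterpreter):
    """Hypothetical stand-in for the import-time check described in the patch."""
    if in_subinterpreter and not supports_pep489:
        # Placeholder wording: the PEP only says the message should name the
        # missing subinterpreter compatibility and state that it is optional.
        raise ImportError(
            f"module {name!r} does not support use in subinterpreters "
            "(no PEP 489 support); extension modules are not required "
            "to provide that support",
            name=name,
        )


# In the main interpreter the same module imports without complaint.
check_extension_import("spam", supports_pep489=False, in_subinterpreter=False)

# In a subinterpreter the import fails with an explicit explanation.
try:
    check_extension_import("spam", supports_pep489=False, in_subinterpreter=True)
except ImportError as exc:
    message = str(exc)
    failed_name = exc.name
```

The sketch only models the decision and the shape of the error; how CPython would actually detect PEP 489 support at import time is outside its scope.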
--- pep-0554.rst | 68 +++++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 57 insertions(+), 11 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 2672a163ae3..2c238c0823a 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -208,16 +208,25 @@ For sharing data between interpreters: | ``ChannelReleasedError`` | ``ChannelClosedError`` | The channel is released (but not yet closed). | +--------------------------+------------------------+------------------------------------------------+ -"Extending Python" Docs ------------------------ +Help for Extension Module Maintainers +------------------------------------- + +Many extension modules do not support use in subinterpreters yet. The +maintainers and users of such extension modules will both benefit when +they are updated to support subinterpreters. In the meantime users may +become confused by failures when using subinterpreters, which could +negatively impact extension maintainers. See `Concerns`_ below. -Many extension modules do not support use in subinterpreters. The -authors and users of such extension modules will both benefit when they -are updated to support subinterpreters. To help with that, a new page -will be added to the `Extending Python `_ docs. +To mitigate that impact and accelerate compatibility, we will do the +following: -This page will explain how to implement PEP 489 support and how to move -from global module state to per-interpreter. +* be clear that extension modules are *not* required to support use in + subinterpreters +* raise ``ImportError`` when an incompatible (no PEP 489 support) module + is imported in a subinterpreter +* provide resources (e.g. docs) to help maintainers reach compatibility +* reach out to the maintainers of Cython and of the most used extension + modules (on PyPI) to get feedback and possibly provide assistance Examples @@ -1076,15 +1085,52 @@ channel end was sent, still hold a reference to the channel end, and haven't called ``release()``. 
+Documentation +============= + +The new stdlib docs page for the ``interpreters`` module will include +the following: + +* (at the top) a clear note that subinterpreter support in extension + modules is not required +* some explanation about what subinterpreters are +* brief examples of how to use subinterpreters and channels +* a summary of the limitations of subinterpreters +* (for extension maintainers) a link to the resources for ensuring + subinterpreter compatibility +* much of the API information in this PEP + +A separate page will be added to the docs for resources to help +extension maintainers ensure their modules can be used safely in +subinterpreters, under `Extending Python `. The page +will include the following information: + +* a summary about subinterpreters (similar to the same in the new + ``interpreters`` module page and in the C-API docs) +* an explanation of how extension modules can be impacted +* how to implement PEP 489 support +* how to move from global module state to per-interpreter +* how to take advantage of PEP 384 (heap types), PEP 3121 + (module state), and PEP 573 +* strategies for dealing with 3rd party C libraries that keep their + own subinterpreter-incompatible global state + +Note that the documentation will play a large part in mitigating any +negative impact that the new ``interpreters`` module might have on +extension module maintainers. + +Also, the ``ImportError`` for incompatible extension modules will have +a message that clearly says it is due to missing subinterpreter +compatibility and that extensions are not required to provide it. This +will help set user expectations properly. + + Open Questions ============== * impact of data sharing on cache performance in multi-core scenarios? (see [cache-line-ping-pong]_) -* strictly disallow subinterpreter import of extension modules without - PEP 489 support? - * add "isolated" mode to subinterpreters API? 
There are various ways that an interpreter could potentially operate From d83443db49bab4ee0e2ada52d23d3b97b92580b3 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 11:47:23 -0600 Subject: [PATCH 03/11] Clarify about "provisional" status. --- pep-0554.rst | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 2c238c0823a..3e17101daf3 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -708,9 +708,14 @@ Provisional Status The new ``interpreters`` module will be added with "provisional" status (see PEP 411). This allows Python users to experiment with the feature and provide feedback while still allowing us to adjust to that feedback. -The module will be provisional in Python 3.8 and we will make a decision -before the 3.9 release whether to keep it provisional, graduate it, or -remove it. +The module will be provisional in Python 3.9 and we will make a decision +before the 3.10 release whether to keep it provisional, graduate it, or +remove it. This PEP will be updated accordingly. + +While the module is provisional, any changes to the API (or to behavior) +do not need to be reflected here, nor get approval by the BDFL-delegate. +However, such changes will still need to go through the normal processes +(BPO for smaller changes and python-dev/PEP for substantial ones). Alternate Python Implementations From 7f2cec5c52aa794deb705bad778dc5d7b572af02 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 12:25:54 -0600 Subject: [PATCH 04/11] Add "isolated" mode and a way to disable it. --- pep-0554.rst | 95 +++++++++++++++++++++++++++++++--------------------- 1 file changed, 57 insertions(+), 38 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 3e17101daf3..5ac40c272ed 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -83,17 +83,17 @@ the `"interpreters" Module API`_ section below. 
For creating and using interpreters: -+----------------------------------+----------------------------------------------+ -| signature | description | -+==================================+==============================================+ -| ``list_all() -> [Interpreter]`` | Get all existing interpreters. | -+----------------------------------+----------------------------------------------+ -| ``get_current() -> Interpreter`` | Get the currently running interpreter. | -+----------------------------------+----------------------------------------------+ -| ``get_main() -> Interpreter`` | Get the main interpreter. | -+----------------------------------+----------------------------------------------+ -| ``create() -> Interpreter`` | Initialize a new (idle) Python interpreter. | -+----------------------------------+----------------------------------------------+ ++---------------------------------------------+----------------------------------------------+ +| signature | description | ++=============================================+==============================================+ +| ``list_all() -> [Interpreter]`` | Get all existing interpreters. | ++---------------------------------------------+----------------------------------------------+ +| ``get_current() -> Interpreter`` | Get the currently running interpreter. | ++---------------------------------------------+----------------------------------------------+ +| ``get_main() -> Interpreter`` | Get the main interpreter. | ++---------------------------------------------+----------------------------------------------+ +| ``create(*, isolated=True) -> Interpreter`` | Initialize a new (idle) Python interpreter. | ++---------------------------------------------+----------------------------------------------+ | @@ -104,6 +104,8 @@ For creating and using interpreters: +----------------------------------------+-----------------------------------------------------+ | ``.id`` | The interpreter's ID (read-only). 
| +----------------------------------------+-----------------------------------------------------+ +| ``.isolated`` | The interpreter's mode (read-only). | ++----------------------------------------+-----------------------------------------------------+ | ``.is_running() -> bool`` | Is the interpreter currently executing code? | +----------------------------------------+-----------------------------------------------------+ | ``.close()`` | Finalize and destroy the interpreter. | @@ -755,13 +757,14 @@ The module provides the following functions:: Return the main interpreter. If the Python implementation has no concept of a main interpreter then return None. - create() -> Interpreter + create(*, isolated=True) -> Interpreter Initialize a new Python interpreter and return it. The interpreter will be created in the current thread and will remain idle until something is run in it. The interpreter may be used in any thread and will run in whichever thread calls - ``interp.run()``. + ``interp.run()``. See "Interpreter Isolated Mode" below for + an explanation of the "isolated" parameter. The module also provides the following class:: @@ -770,7 +773,12 @@ The module also provides the following class:: id -> int: - The interpreter's ID (read-only). + The interpreter's ID. (read-only) + + isolated -> bool: + + Whether or not the interpreter is operating in "isolated" mode. + (read-only) is_running() -> bool: @@ -1090,6 +1098,41 @@ channel end was sent, still hold a reference to the channel end, and haven't called ``release()``. +.. _isolated-mode: + +Interpreter "Isolated" Mode +=========================== + +By default, every new interpreter created by ``interpreters.create()`` +has specific restrictions on any code it runs. 
This includes the +following: + +* importing an extension module fails if it does not implement the + PEP 489 API +* new threads are not allowed (including daemon threads) +* ``os.fork()`` is not allowed (so no ``multiprocessing``) +* ``os.exec*()``, AKA "fork+exec", is not allowed (so no ``subprocess``) + +This represents the full "isolated" mode of subinterpreters. It is +applied when ``interpreters.create()`` is called with the "isolated" +keyword-only argument set to ``True`` (the default). If +``interpreters.create(isolated=False)`` is called then none of those +restrictions is applied. + +One advantage of this approach is that it allows extension maintainers +to check subinterpreter compatibility before they implement the PEP 489 +API. Also note that ``isolated=False`` represents the historical +behavior when using the existing subinterpreters C-API, thus providing +backward compatibility. For the existing C-API itself, the default +remains ``isolated=False``. The same is true for the "main" module, so +existing use of Python will not change. + +We may choose to later loosen some of the above restrictions or provide +a way to enable/disable granular restrictions individually. Regardless, +requiring PEP 489 support from extension modules will always be a +default restriction. + + Documentation ============= @@ -1136,30 +1179,6 @@ Open Questions * impact of data sharing on cache performance in multi-core scenarios? (see [cache-line-ping-pong]_) -* add "isolated" mode to subinterpreters API? - -There are various ways that an interpreter could potentially operate -in a more isolated/restricted way:: - - * ImportError when importing ext. module without PEP 489 support - * no daemon threads - * no threads at all - * no multiprocessing - * ... - -This could be facilitated via settinga (separate or an int flag) on -the ``PyConfig`` struct on each ``PyInterpreterState``. (This would -require moving ``_PyInterpreterState_SetConfig()`` to the public C-API.) 
-By default the settings would all be False, for backward compatibility. - -The ``interpreters`` module, however, would likely use a more -restrictive default (e.g. always require PEP 489 support). This would -effectively be the "isolated" mode. It would make sense to add an arg -to ``interpreters.create()`` to disable "isolated" mode (at least the -PEP 489 part), since then extension authors could test their modules -under subinterpreters (without having to release a potentially broken -build with PEP 489 support). - * add a shareable synchronization primitive? This would be ``_threading.Lock`` (or something like it) where From 0ce38b175bd7a62e137c5bcb24d35d68e344087f Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 12:34:55 -0600 Subject: [PATCH 05/11] Move shareable threading.Lock down to "deferred". --- pep-0554.rst | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 5ac40c272ed..8d1145db585 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -1179,13 +1179,6 @@ Open Questions * impact of data sharing on cache performance in multi-core scenarios? (see [cache-line-ping-pong]_) -* add a shareable synchronization primitive? - -This would be ``_threading.Lock`` (or something like it) where -interpreters would actually share the underlying mutex. This would -provide much better efficiency than blocking channel ops. The main -concern is that locks and channels don't mix well (as learned in Go). - * auto-run in a thread? The PEP proposes a hard separation between subinterpreters and threads: @@ -1552,6 +1545,19 @@ Expose which interpreters have actually *used* a channel end. Currently we associate interpreters upon access to a channel. We would keep a separate association list for "upon use" and expose that. 
+Add a shareable synchronization primitive +----------------------------------------- + +This would be ``_threading.Lock`` (or something like it) where +interpreters would actually share the underlying mutex. This would +provide much better efficiency than blocking channel ops. The main +concern is that locks and channels don't mix well (as learned in Go). + +Note that the same functionality as a lock can be achieved by passing +some sort of "token" object through a channel. "send()" would be +equivalent to releasing the lock and "recv()" to acquiring the lock. + +We can add this later if it proves desirable without much trouble. Rejected Ideas From 21824aaf7e798e9fe1c3207f97a5eaf72abb26e4 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 12:45:08 -0600 Subject: [PATCH 06/11] Move BaseException propagation down to "deferred". --- pep-0554.rst | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 8d1145db585..edb001c4fc3 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -1191,19 +1191,6 @@ more often than not. So it would make sense to make this the default behavior. We would add a kw-only param "threaded" (default ``True``) to ``run()`` to allow the run-in-the-current-thread operation. -* what to do about BaseException propagation? - -The exception types that inherit from ``BaseException`` (aside from -``Exception``) are usually treated specially. These types are: -``KeyboardInterrupt``, ``SystemExit``, and ``GeneratorExit``. It may -make sense to treat them specially when it comes to propagation from -``run()``. Here are some options:: - - * propagate like normal via RunFailedError - * do not propagate (handle them somehow in the subinterpreter) - * propagate them directly (avoid RunFailedError) - * propagate them directly (set RunFailedError as __cause__) - TODO ====== @@ -1559,6 +1546,23 @@ equivalent to releasing the lock and "recv()" to acquiring the lock. 
We can add this later if it proves desirable without much trouble. +Propagate SystemExit and KeyboardInterrupt Differently +------------------------------------------------------ + +The exception types that inherit from ``BaseException`` (aside from +``Exception``) are usually treated specially. These types are: +``KeyboardInterrupt``, ``SystemExit``, and ``GeneratorExit``. It may +make sense to treat them specially when it comes to propagation from +``run()``. Here are some options:: + + * propagate like normal via RunFailedError + * do not propagate (handle them somehow in the subinterpreter) + * propagate them directly (avoid RunFailedError) + * propagate them directly (set RunFailedError as __cause__) + +We aren't going to worry about handling them differently. Threads +already ignore ``SystemExit``, so for now we will follow that pattern. + Rejected Ideas ============== From 956183adc72b67a9e20f2de6b47a99785da4f957 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 15:16:58 -0600 Subject: [PATCH 07/11] Clarify about channel lifespan. --- pep-0554.rst | 222 +++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 190 insertions(+), 32 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index edb001c4fc3..f9103f9f3f1 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -842,7 +842,6 @@ The module also provides the following class:: Supported code: source text. - Uncaught Exceptions ------------------- @@ -1045,7 +1044,7 @@ The module also provides the following channel-related classes:: If the other end is not currently receiving then return False. Otherwise return True. - release(): + release() -> bool: This is the same as "RecvChannel.release(), but applied to the sending end of the channel. @@ -1064,7 +1063,6 @@ The module also provides the following channel-related classes:: Note that ``send_buffer()`` is similar to how ``multiprocessing.Connection`` works. 
[mp-conn]_ - Channel Association ------------------- @@ -1073,29 +1071,205 @@ interpreters. This association effectively means "the channel end is available to that interpreter". It has ramifications on introspection and on how channels are automatically closed. +If an interpreter is not associated with a channel end, it is not +allowed to use that end. In that case send() or recv() would fail for +the respective end. Also, an interpreter can be associated with both +ends or with only one end. + When a channel is created, both ends are immediately associated with the current interpreter. When a channel end is passed to an interpreter -via ``Interpreter.run(..., channels=...)`` then that interpreter is -associated with the channel end. Likewise when a channel end is sent +via ``Interpreter.run(..., channels=...)`` then the interpreter is +associated with that channel end. Likewise when a channel end is sent through another channel, the receiving interpreter is associated with the sent channel end. -A channel end is explicitly released by an interpreter through the -``release()`` method. It is also done automatically for an interpreter -when the last ``*Channel`` object for the end in that interpreter is -garbage-collected, as though ``release()`` were called. +A channel end is explicitly un-associated for the current interpreter +through the ``release()`` method. It is also done automatically for an +interpreter when the last ``*Channel`` object for the end in that +interpreter is garbage-collected, as though ``release()`` were called. + +Consequently, ``*Channel.interpreters`` means those to which the +channel end was sent, still hold a reference to the channel end, and +haven't called ``release()``. + +Un-associating an interpreter from one end does not automatically cause +it to be done for the other end. Once an interpreter has been +un-associated from a channel end, it can no longer use that end nor can +it be associated with that end again. 
-Calling ``*Channel.close()`` automatically releases the channel in all -interpreters for both ends. +Calling ``*Channel.close()`` automatically un-associates both channel +ends for all interpreters. -Once the number of associated interpreters on both ends drops -to 0, the channel is actually closed. The Python runtime will +Automatic Channel Closing +------------------------- + +Once 0 interpreters are associated with one of the channel ends, the +entire channel is automatically closed. The Python runtime will garbage-collect all closed channels, though it may not happen immediately. -Consequently, ``*Channel.interpreters`` means those to which the -channel end was sent, still hold a reference to the channel end, and -haven't called ``release()``. +If the channel is closed from the "recv" end then the channel is closed +immediately. If it's the "send" end then the channel is closed once it +is empty. + +Closing a Non-Empty Channel +--------------------------- + +... + + + +* add a more detailed description of channel lifespan + +A state machine diagram may be most effective. Relevant questions: + + * How does an interpreter detach from the receiving end of a channel + that is never empty? + * What happens if an interpreter deletes the last reference to a + non-empty channel? + * On the receiving end, or on the sending end? + + + + + +empty / not empty + +recv.release() +send.release() + +recv.close() +recv.close(force=True) +send.close() +send.close(force=True) + +The lifespan of channels is the most complex part of this proposal. +Here's a summary of how it works. + +Every channel tracks the interpreters associated with each of its ends +(send, recv). An interpreter is associated as soon as it gains access +to the channel end (when created, when passed to ``Interpreter.run()``, +when passed through a channel). An interpreter may send or recv only +when it is associated with the corresponding end of the channel. 
+ +An interpreter becomes automatically un-associated with a channel end +as soon as it has no more references to (objects for) that end. The +``release()`` method can also be used to explicitly un-associate the +end. + + +Here is an example of how a channel's state changes as it gets used: + ++---------------+-----------+--------+------+------+ +| interp A | interp B | chan | recv | send | ++===============+===========+========+======+======+ +| new chan | | 0 | A | A | ++---------------+-----------+--------+------+------+ +| run B (recv) | | 0 | AB | A | ++---------------+-----------+--------+------+------+ +| send "a" | | 1 | AB | A | ++---------------+-----------+--------+------+------+ +| | recv "a" | 0 | AB | A | ++---------------+-----------+--------+------+------+ +| send "b" | | 1 | AB | A | ++---------------+-----------+--------+------+------+ +| send "c" | | 2 | AB | A | ++---------------+-----------+--------+------+------+ +| | recv "b" | 1 | AB | A | ++---------------+-----------+--------+------+------+ + +At this point there is 1 item queued up, waiting to be received. +Both interpreters are associated with the "recv" end of the channel +and interpreter A is associated with the "send" end. + +If we keep going, let's say that there is no chance interpreter A will +use the recv end, so we can release it and keep going. + ++---------------+-----------+--------+------+------+ +| interp A | interp B | chan | recv | send | ++===============+===========+========+======+======+ +| send "d" | | 2 | AB | A | ++---------------+-----------+--------+------+------+ +| release recv | | 2 | B | A | ++---------------+-----------+--------+------+------+ +| send "e" | | 3 | B | A | ++---------------+-----------+--------+------+------+ + +If "c" were a marker that the work is done then interpreter B would +stop running. The channel would stay open even though no interpreters +are associated with the recv end. Interpreter A can keep sending to it. 
+ ++---------------+-----------+--------+------+------+ +| interp A | interp B | chan | recv | send | ++===============+===========+========+======+======+ +| | recv "c" | 2 | B | A | ++---------------+-----------+--------+------+------+ +| | | 2 | | A | ++---------------+-----------+--------+------+------+ +| send "f" | | 3 | | A | ++---------------+-----------+--------+------+------+ + +At that point the channel might not be used any more. It stays open +with 3 items queued up (uh-oh, a memory leak). Let's say that the +original interpreter i + ++---------------+-----------+--------+------+------+ +| interp A | interp B | chan | recv | send | ++===============+===========+========+======+======+ +| release | | 3 | A | A | ++---------------+-----------+--------+------+------+ +| send "f" | | 3 | A | A | ++---------------+-----------+--------+------+------+ + ++---------------+-----------+-----------+----------+--------+------+------+ +| interp A | interp B | interp C | interp D | chan X | recv | send | ++===============+===========+===========+==========+========+======+======+ +| new chan (X) | | | | 0 | A | A | ++---------------+-----------+-----------+----------+--------+------+------+ +| run B (sendX) | | | | 0 | A | AB | ++---------------+-----------+-----------+----------+--------+------+------+ +| sendX "a" | | | | 1 | A | AB | ++---------------+-----------+-----------+----------+--------+------+------+ +| run C (recvX) | | | | 1 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+ +| | sendX "b" | | | 2 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+ +| | | recvX "a" | | 1 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+ +| run D (bothX) | | | | 1 | ACD | ABD | ++---------------+-----------+-----------+----------+--------+------+------+ +| release sendX | | | | 1 | ACD | BD | 
++---------------+-----------+-----------+----------+--------+------+------+ +| | | | | | | | ++---------------+-----------+-----------+----------+--------+------+------+ +| | | | | | | | ++---------------+-----------+-----------+----------+--------+------+------+ + ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| interp A | interp B | interp C | interp D | chan X | recv | send | chan Y | recv | send | ++===============+===========+===========+==========+========+======+======+========+======+======+ +| new chan (X) | | | | 0 | A | A | | | | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| new chan (Y) | | | | 0 | A | A | 0 | A | A | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| run B (sendX) | | | | 0 | A | AB | 0 | A | AB | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| sendX "A" | | | | 1 | A | AB | 0 | A | AB | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| run C (recvX) | | | | 1 | AC | AB | 0 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| | sendX "B" | | | 2 | AC | AB | 0 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| | | recvX "A" | | 1 | AC | AB | 0 | AC | AB | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| run D (bothX) | | | | 1 | ACD | ABD | 0 | ACD | ABD | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| release sendX | | | | 1 | ACD | BD | 0 | ACD | BD | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| | | | | | | | | | | 
++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +| | | | | | | | | | | ++---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ .. _isolated-mode: @@ -1192,22 +1366,6 @@ behavior. We would add a kw-only param "threaded" (default ``True``) to ``run()`` to allow the run-in-the-current-thread operation. -TODO -====== - -* add a more detailed description of channel lifespan - -A state machine diagram may be most effective. Relevant questions: - - * How does an interpreter detach from the receiving end of a channel - that is never empty? - * What happens if an interpreter deletes the last reference to a - non-empty channel? - * On the receiving end, or on the sending end? - -* run the CPython test suite in a subinterpreter and see what shakes out - - Deferred Functionality ====================== From d2d6e154ff407dec2622b76d32183d5f65022623 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 15:41:52 -0600 Subject: [PATCH 08/11] Drop the release() and close() methods of RecvChannel and SendChannel. --- pep-0554.rst | 324 ++++----------------------------------------------- 1 file changed, 23 insertions(+), 301 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index f9103f9f3f1..f4a5b20342b 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -145,20 +145,12 @@ For sharing data between interpreters: +------------------------------------------+-----------------------------------------------+ | ``.id`` | The channel's unique ID. | +------------------------------------------+-----------------------------------------------+ -| ``.interpreters`` | The list of associated interpreters. | -+------------------------------------------+-----------------------------------------------+ | ``.recv() -> object`` | | Get the next object from the channel, | | | | and wait if none have been sent. | -| | | Associate the interpreter with the channel. 
| +------------------------------------------+-----------------------------------------------+ | ``.recv_nowait(default=None) -> object`` | | Like recv(), but return the default | | | | instead of waiting. | +------------------------------------------+-----------------------------------------------+ -| ``.release()`` | | No longer associate the current interpreter | -| | | with the channel (on the receiving end). | -+------------------------------------------+-----------------------------------------------+ -| ``.close(force=False)`` | | Close the channel in all interpreters. | -+------------------------------------------+-----------------------------------------------+ | @@ -169,26 +161,17 @@ For sharing data between interpreters: +------------------------------+--------------------------------------------------+ | ``.id`` | The channel's unique ID. | +------------------------------+--------------------------------------------------+ -| ``.interpreters`` | The list of associated interpreters. | -+------------------------------+--------------------------------------------------+ | ``.send(obj)`` | | Send the object (i.e. its data) to the | | | | receiving end of the channel and wait. | -| | | Associate the interpreter with the channel. | +------------------------------+--------------------------------------------------+ | ``.send_nowait(obj)`` | | Like send(), but return False if not received. | +------------------------------+--------------------------------------------------+ | ``.send_buffer(obj)`` | | Send the object's (PEP 3118) buffer to the | | | | receiving end of the channel and wait. | -| | | Associate the interpreter with the channel. | +------------------------------+--------------------------------------------------+ | ``.send_buffer_nowait(obj)`` | | Like send_buffer(), but return False | | | | if not received. 
| +------------------------------+--------------------------------------------------+ -| ``.release()`` | | No longer associate the current interpreter | -| | | with the channel (on the sending end). | -+------------------------------+--------------------------------------------------+ -| ``.close(force=False)`` | | Close the channel in all interpreters. | -+------------------------------+--------------------------------------------------+ | @@ -205,10 +188,6 @@ For sharing data between interpreters: +--------------------------+------------------------+------------------------------------------------+ | ``NotReceivedError`` | ``ChannelError`` | Nothing was waiting to receive a sent object. | +--------------------------+------------------------+------------------------------------------------+ -| ``ChannelClosedError`` | ``ChannelError`` | The channel is closed. | -+--------------------------+------------------------+------------------------------------------------+ -| ``ChannelReleasedError`` | ``ChannelClosedError`` | The channel is released (but not yet closed). 
| -+--------------------------+------------------------+------------------------------------------------+ Help for Extension Module Maintainers ------------------------------------- @@ -315,7 +294,6 @@ Synchronize using a channel interp.run(tw.dedent(""" reader.recv() print("during") - reader.release() """), shared=dict( reader=r, @@ -326,7 +304,6 @@ Synchronize using a channel t.start() print('after') s.send(b'') - s.release() Sharing a file descriptor ------------------------- @@ -377,7 +354,6 @@ Passing objects via marshal obj = marshal.loads(data) do_something(obj) data = reader.recv() - reader.release() """)) t = threading.Thread(target=run) t.start() @@ -407,7 +383,6 @@ Passing objects via pickle obj = pickle.loads(data) do_something(obj) data = reader.recv() - reader.release() """)) t = threading.Thread(target=run) t.start() @@ -914,9 +889,7 @@ to channels:: create_channel() -> (RecvChannel, SendChannel): Create a new channel and return (recv, send), the RecvChannel - and SendChannel corresponding to the ends of the channel. The - lifetime of the channel is determined by associations between - intepreters and the channel's ends (see below). + and SendChannel corresponding to the ends of the channel. Both ends of the channel are supported "shared" objects (i.e. may be safely shared by different interpreters. Thus they @@ -938,13 +911,6 @@ The module also provides the following channel-related classes:: The channel's unique ID. This is shared with the "send" end. - interpreters => [Interpreter]: - - The list of interpreters associated with the "recv" end of - the channel. (See below for more on how interpreters are - associated with channels.) If the channel has been closed - then raise ChannelClosedError. - recv(): Return the next object from the channel. If none have been @@ -955,47 +921,12 @@ The module also provides the following channel-related classes:: though it could also be a compatible proxy. 
Regardless, it may use a copy of that data or actually share the data. - If the channel is already closed then raise ChannelClosedError. - If the channel isn't closed but the current interpreter already - called the "release()" method for the "recv" end then raise - ChannelReleasedError (which is a subclass of - ChannelClosedError). - recv_nowait(default=None): Return the next object from the channel. If none have been sent then return the default. Otherwise, this is the same as the "recv()" method. - release() -> bool: - - No longer associate the current interpreter with the channel - (on the "recv" end) and block any future association If the - interpreter was never associated with the channel then still - block any future association. The "send" end of the channel - is unaffected by a released "recv" end. - - Once an interpreter is no longer associated with the "recv" - end of the channel, any "recv()" and "recv_nowait()" calls - from that interpreter will fail (even ongoing calls). See - "recv()" for details. - - See below for more on how association relates to auto-closing - a channel. - - This operation is idempotent. Return True if "release()" - has not been called before by the current interpreter. - - close(force=False): - - Close both ends of the channel (in all interpreters). This - means that any further use of the channel anywhere raises - ChannelClosedError. If the channel is not empty then - raise ChannelNotEmptyError (if "force" is False) or - discard the remaining objects (if "force" is True) - and close it. Note that the behavior of closing - the "send" end is slightly different. - class SendChannel(id): @@ -1007,21 +938,12 @@ The module also provides the following channel-related classes:: The channel's unique ID. This is shared with the "recv" end. - interpreters -> [Interpreter]: - - Like "RecvChannel.interpreters" but for the "send" end. - send(obj): Send the object (i.e. its data) to the "recv" end of the channel. 
Wait until the object is received. If the object is not shareable then ValueError is raised. - If this channel end was already released - by the interpreter then raise ChannelReleasedError. If - the channel is already closed then raise - ChannelClosedError. - send_nowait(obj): Send the object to the "recv" end of the channel. This @@ -1044,232 +966,16 @@ The module also provides the following channel-related classes:: If the other end is not currently receiving then return False. Otherwise return True. - release() -> bool: - - This is the same as "RecvChannel.release(), but applied - to the sending end of the channel. - - close(force=False): - - Close both ends of the channel (in all interpreters). No - matter what the "send" end of the channel is immediately - closed. If the channel is empty then close the "recv" - end immediately too. Otherwise, if "force" if False, - close the "recv" end (and hence the full channel) - once the channel becomes empty; or, if "force" - is True, discard the remaining items and - close immediately. - Note that ``send_buffer()`` is similar to how ``multiprocessing.Connection`` works. [mp-conn]_ -Channel Association -------------------- - -Each end (send/recv) of each channel is associated with a set of -interpreters. This association effectively means "the channel end -is available to that interpreter". It has ramifications on -introspection and on how channels are automatically closed. - -If an interpreter is not associated with a channel end, it is not -allowed to use that end. In that case send() or recv() would fail for -the respective end. Also, an interpreter can be associated with both -ends or with only one end. - -When a channel is created, both ends are immediately associated with -the current interpreter. When a channel end is passed to an interpreter -via ``Interpreter.run(..., channels=...)`` then the interpreter is -associated with that channel end. 
Likewise when a channel end is sent -through another channel, the receiving interpreter is associated with -the sent channel end. - -A channel end is explicitly un-associated for the currently interpreter -through the ``release()`` method. It is also done automatically for an -interpreter when the last ``*Channel`` object for the end in that -interpreter is garbage-collected, as though ``release()`` were called. - -Consequently, ``*Channel.interpreters`` means those to which the -channel end was sent, still hold a reference to the channel end, and -haven't called ``release()``. - -Un-associating an interpreter from one end does not automatically cause -it to be done for the other end. Once an interpreter has been -un-associated from a channel end, it can no longer use that end nor can -it be associated with that end again. - -Calling ``*Channel.close()`` automatically un-associates both channel -ends for all interpreters. - -Automatic Channel Closing -------------------------- - -Once 0 interpreters are associated with one of the channel ends, the -entire channel is automatically closed. The Python runtime will -garbage-collect all closed channels, though it may not happen -immediately. - -If the channel is closed from the "recv" end then the channel is closed -immediately. If it's the "send" end then the channel is closed once it -is empty. - -Closing a Non-Empty Channel ---------------------------- - -... - - - -* add a more detailed description of channel lifespan - -A state machine diagram may be most effective. Relevant questions: +Channel Lifespan +---------------- - * How does an interpreter detach from the receiving end of a channel - that is never empty? - * What happens if an interpreter deletes the last reference to a - non-empty channel? - * On the receiving end, or on the sending end? 
- - - - - -empty / not empty - -recv.release() -send.release() - -recv.close() -recv.close(force=True) -send.close() -send.close(force=True) - -The lifespan of channels is the most complex part of this proposal. -Here's a summary of how it works. - -Every channel tracks the interpreters associated with each of its ends -(send, rexc). An interpreter is associated as soon as it gains access -to the channel end (when created, when passed to ``Interpreter.run()``, -when passed through a channel). An interpreter may send or recv only -when it is associated with the corresponding end of the channel. - -An interpreter becomes automatically un-associated with a channel end -as soon as it has no more references to (objects for) that end. The -``release()`` method can also be used to explicitly un-associate the -end. - - -Here is an example of how channel state changes as they get used: - -+---------------+-----------+--------+------+------+ -| interp A | interp B | chan | recv | send | -+===============+===========+========+======+======+ -| new chan | | 0 | A | A | -+---------------+-----------+--------+------+------+ -| run B (recv) | | 0 | AB | A | -+---------------+-----------+--------+------+------+ -| send "a" | | 1 | AB | A | -+---------------+-----------+--------+------+------+ -| | recv "a" | 0 | AB | A | -+---------------+-----------+--------+------+------+ -| send "b" | | 1 | AB | A | -+---------------+-----------+--------+------+------+ -| send "c" | | 2 | AB | A | -+---------------+-----------+--------+------+------+ -| | recv "b" | 1 | AB | A | -+---------------+-----------+--------+------+------+ - -At this point there are 2 items queued up, waiting to be received. -Both interpreters are associated with the "recv" end of the channel -and interpreter A is associated with the "send" end. - -If we keep going, let's say that there is no chance interpreter A will -use the recv end, so we can release it and keep going. 
- -+---------------+-----------+--------+------+------+ -| interp A | interp B | chan | recv | send | -+===============+===========+========+======+======+ -| send "d" | | 2 | AB | A | -+---------------+-----------+--------+------+------+ -| release recv | | 2 | B | A | -+---------------+-----------+--------+------+------+ -| send "e" | | 3 | B | A | -+---------------+-----------+--------+------+------+ - -If "c" were a marker that the work is done then interpreter B would -stop running. The channel would stay open even through no interpreters -are associated with the recv end. Interpreter 1 can keep sending to it. - -+---------------+-----------+--------+------+------+ -| interp A | interp B | chan | recv | send | -+===============+===========+========+======+======+ -| | recv "c" | 2 | B | A | -+---------------+-----------+--------+------+------+ -| | | 2 | | A | -+---------------+-----------+--------+------+------+ -| send "f" | | 3 | | A | -+---------------+-----------+--------+------+------+ - -At that point the channel might not be used any more. It stays open -with 3 items queued up (uh-oh, a memory leak). 
Let's say that the -original interpreter i - -+---------------+-----------+--------+------+------+ -| interp A | interp B | chan | recv | send | -+===============+===========+========+======+======+ -| release | | 3 | A | A | -+---------------+-----------+--------+------+------+ -| send "f" | | 3 | A | A | -+---------------+-----------+--------+------+------+ - -+---------------+-----------+-----------+----------+--------+------+------+ -| interp A | interp B | interp C | interp D | chan X | recv | send | -+===============+===========+===========+==========+========+======+======+ -| new chan (X) | | | | 0 | A | A | -+---------------+-----------+-----------+----------+--------+------+------+ -| run B (sendX) | | | | 0 | A | AB | -+---------------+-----------+-----------+----------+--------+------+------+ -| sendX "a" | | | | 1 | A | AB | -+---------------+-----------+-----------+----------+--------+------+------+ -| run C (recvX) | | | | 1 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+ -| | sendX "b" | | | 2 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+ -| | | recvX "a" | | 1 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+ -| run D (bothX) | | | | 1 | ACD | ABD | -+---------------+-----------+-----------+----------+--------+------+------+ -| release sendX | | | | 1 | ACD | BD | -+---------------+-----------+-----------+----------+--------+------+------+ -| | | | | | | | -+---------------+-----------+-----------+----------+--------+------+------+ -| | | | | | | | -+---------------+-----------+-----------+----------+--------+------+------+ - -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| interp A | interp B | interp C | interp D | chan X | recv | send | chan Y | recv | send | -+===============+===========+===========+==========+========+======+======+========+======+======+ -| new chan 
(X) | | | | 0 | A | A | | | | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| new chan (Y) | | | | 0 | A | A | 0 | A | A | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| run B (sendX) | | | | 0 | A | AB | 0 | A | AB | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| sendX "A" | | | | 1 | A | AB | 0 | A | AB | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| run C (recvX) | | | | 1 | AC | AB | 0 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| | sendX "B" | | | 2 | AC | AB | 0 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| | | recvX "A" | | 1 | AC | AB | 0 | AC | AB | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| run D (bothX) | | | | 1 | ACD | ABD | 0 | ACD | ABD | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| release sendX | | | | 1 | ACD | BD | 0 | ACD | BD | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| | | | | | | | | | | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ -| | | | | | | | | | | -+---------------+-----------+-----------+----------+--------+------+------+--------+------+------+ +A channel is automatically closed and destroyed once there are no more +Python objects (e.g. ``RecvChannel`` and ``SendChannel``) referring +to it. So it is effectively triggered via garbage-collection of those +objects. .. _isolated-mode: @@ -1721,6 +1427,22 @@ make sense to treat them specially when it comes to propagation from We aren't going to worry about handling them differently.
Threads already ignore ``SystemExit``, so for now we will follow that pattern. +Add an explicit release() and close() to channel end classes +------------------------------------------------------------ + +It can be convenient to have an explicit way to close a channel against +further global use. Likewise it could be useful to have an explicit +way to release one of the channel ends relative to the current +interpreter. Among other reasons, such a mechanism is useful for +communicating overall state between interpreters without the extra +boilerplate that passing objects through a channel directly would +require. + +The challenge is getting automatic release/close right without making +it hard to understand. This is especially true when dealing with a +non-empty channel. We should be able to get by without release/close +for now. + Rejected Ideas ============== From 3df91b4fd4c279851a95568d418de4d921ff2da7 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 15:50:53 -0600 Subject: [PATCH 09/11] Drop SendChannel.send_buffer(). --- pep-0554.rst | 36 ++++++++++-------------------------- 1 file changed, 10 insertions(+), 26 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index f4a5b20342b..68d2b52cc9e 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -69,7 +69,6 @@ At first only the following types will be supported for sharing: * bytes * str * int -* PEP 3118 buffer objects (via ``send_buffer()``) * PEP 554 channels Support for other basic types (e.g. bool, float, Ellipsis) will be added later. @@ -166,12 +165,6 @@ For sharing data between interpreters: +------------------------------+--------------------------------------------------+ | ``.send_nowait(obj)`` | | Like send(), but return False if not received. | +------------------------------+--------------------------------------------------+ -| ``.send_buffer(obj)`` | | Send the object's (PEP 3118) buffer to the | -| | | receiving end of the channel and wait. 
| -+------------------------------+--------------------------------------------------+ -| ``.send_buffer_nowait(obj)`` | | Like send_buffer(), but return False | -| | | if not received. | -+------------------------------+--------------------------------------------------+ | @@ -621,7 +614,6 @@ channels to the following: * bytes * str * int -* PEP 3118 buffer objects (via ``send_buffer()``) * channels Limiting the initial shareable types is a practical matter, reducing @@ -877,7 +869,7 @@ with unbuffered semantics). Python objects are not shared between interpreters. However, in some cases data those objects wrap is actually shared and not just copied. -One example is PEP 3118 buffers. In those cases the object in the +One example might be PEP 3118 buffers. In those cases the object in the original interpreter is kept alive until the shared data in the other interpreter is no longer used. Then object destruction can happen like normal in the original interpreter, along with the previously shared @@ -952,23 +944,6 @@ The module also provides the following channel-related classes:: other end) then queue the object and return False. Otherwise return True. - send_buffer(obj): - - Send a MemoryView of the object rather than the object. - Otherwise this is the same as "send()". Note that the - object must implement the PEP 3118 buffer protocol. - The buffer will always be released in the original - interpreter, like normal. - - send_buffer_nowait(obj): - - Send a MemoryView of the object rather than the object. - If the other end is not currently receiving then return - False. Otherwise return True. - -Note that ``send_buffer()`` is similar to how -``multiprocessing.Connection`` works. [mp-conn]_ - Channel Lifespan ---------------- @@ -1443,6 +1418,15 @@ it hard to understand. This is especially true when dealing with a non-empty channel. We should be able to get by without release/close for now. 
+Add SendChannel.send_buffer() +----------------------------- + +This method would allow no-copy sending of an object through a channel +if it supports the PEP 3118 buffer protocol (e.g. memoryview). + +Support for this is not fundamental to channels and can be added on +later without much disruption. + Rejected Ideas ============== From 682e6f2f9d60a7c960f7c8ec8366a32f7a9c16c8 Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 15:55:45 -0600 Subject: [PATCH 10/11] Deal with the remaining open issues. --- pep-0554.rst | 41 ++++++++++++++++++++++------------------- 1 file changed, 22 insertions(+), 19 deletions(-) diff --git a/pep-0554.rst b/pep-0554.rst index 68d2b52cc9e..f41d4d75fb2 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -543,6 +543,15 @@ at length in this PEP. Just to be clear, the value lies in:: * preparation for per-interpreter GIL * encourage experimentation +* "data sharing can have a negative impact on cache performance + in multi-core scenarios" + +(See [cache-line-ping-pong]_.) + +This shouldn't be a problem for now as we have no immediate plans +to actually share data between interpreters, instead focusing +on copying. + About Subinterpreters ===================== @@ -1028,25 +1037,6 @@ compatibility and that extensions are not required to provide it. This will help set user expectations properly. -Open Questions -============== - -* impact of data sharing on cache performance in multi-core scenarios? - (see [cache-line-ping-pong]_) - -* auto-run in a thread? - -The PEP proposes a hard separation between subinterpreters and threads: -if you want to run in a thread you must create the thread yourself and -call ``run()`` in it. However, it might be convenient if ``run()`` -could do that for you, meaning there would be less boilerplate. - -Furthermore, we anticipate that users will want to run in a thread much -more often than not. So it would make sense to make this the default -behavior. 
We would add a kw-only param "threaded" (default ``True``) -to ``run()`` to allow the run-in-the-current-thread operation. - - Deferred Functionality ====================== @@ -1427,6 +1417,19 @@ if it supports the PEP 3118 buffer protocol (e.g. memoryview). Support for this is not fundamental to channels and can be added on later without much disruption. +Auto-run in a thread +-------------------- + +The PEP proposes a hard separation between subinterpreters and threads: +if you want to run in a thread you must create the thread yourself and +call ``run()`` in it. However, it might be convenient if ``run()`` +could do that for you, meaning there would be less boilerplate. + +Furthermore, we anticipate that users will want to run in a thread much +more often than not. So it would make sense to make this the default +behavior. We would add a kw-only param "threaded" (default ``True``) +to ``run()`` to allow the run-in-the-current-thread operation. + Rejected Ideas ============== From 8c2ba16a4b7f063ac71c9bd42d12720fcf63f81d Mon Sep 17 00:00:00 2001 From: Eric Snow Date: Fri, 1 May 2020 15:57:35 -0600 Subject: [PATCH 11/11] Update the post history. --- pep-0554.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pep-0554.rst b/pep-0554.rst index f41d4d75fb2..2e100646849 100644 --- a/pep-0554.rst +++ b/pep-0554.rst @@ -8,7 +8,7 @@ Content-Type: text/x-rst Created: 2017-09-05 Python-Version: 3.9 Post-History: 07-Sep-2017, 08-Sep-2017, 13-Sep-2017, 05-Dec-2017, - 09-May-2018, 20-Apr-2020 + 09-May-2018, 20-Apr-2020, 01-May-2020 Abstract