forkserver.set_forkserver_preload() silent preload import failures when sys.path is required #117378

Closed
doublex opened this issue Mar 29, 2024 · 2 comments
Labels: 3.12 (bugs and security fixes), 3.13 (bugs and security fixes), stdlib (Python modules in the Lib dir), topic-multiprocessing, type-bug (An unexpected behavior, bug, or error)

Comments

doublex commented Mar 29, 2024

Bug report

Bug description:

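The bug is in main() in Lib/multiprocessing/forkserver.py: https://github.com/python/cpython/blob/main/Lib/multiprocessing/forkserver.py#L167

The sys_path parameter is ignored. As a result, preloaded modules that can only be found via the parent's sys.path fail with ModuleNotFoundError, and the failure is silently discarded.

a) Applying sys_path before the preload imports fixes this issue.
b) It may also be better to replace the "pass" with a reported error or warning, to make the problem easier to diagnose.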

    if preload:
        if '__main__' in preload and main_path is not None:
            process.current_process()._inheriting = True
            try:
                spawn.import_main_path(main_path)
            finally:
                del process.current_process()._inheriting
+        if sys_path:
+            sys.path = sys_path
        for modname in preload:
            try:
                __import__(modname)
            except ImportError:
-                pass
+                warnings.warn('forkserver: preloading module failed %s' % modname)
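
The change proposed in the diff above can be sketched as a standalone helper. This is only an illustration, not the patch that was eventually merged: `preload_modules` and its return value are hypothetical, and the real `main()` in forkserver.py takes more parameters than shown here.

```python
import sys
import warnings


def preload_modules(preload, sys_path=None):
    """Import each module named in `preload`, warning on failure.

    Hypothetical helper mirroring the diff above: apply the parent's
    sys.path first, then report import failures instead of hiding them.
    Returns the list of module names that could not be imported.
    """
    if sys_path:
        # Replace the contents in place so existing references stay valid.
        sys.path[:] = sys_path
    failed = []
    for modname in preload:
        try:
            __import__(modname)
        except ImportError as exc:
            warnings.warn(
                f"forkserver: preloading module failed {modname!r}: {exc}"
            )
            failed.append(modname)
    return failed
```

With the warning in place, a module that is missing from the forkserver's `sys.path` shows up immediately instead of only as slower child start-up.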

CPython versions tested on:

3.12

Operating systems tested on:

No response

@doublex doublex added the type-bug An unexpected behavior, bug, or error label Mar 29, 2024
@gpshead gpshead self-assigned this Nov 7, 2024
@gpshead gpshead added stdlib Python modules in the Lib dir 3.12 bugs and security fixes topic-multiprocessing labels Nov 7, 2024
gpshead added a commit to gpshead/cpython that referenced this issue Nov 7, 2024
…ritance.

`sys.path` was not properly being sent from the parent process when launching
the multiprocessing forkserver process to preload imports.  This bug has been
there since the forkserver start method was introduced in Python ~3.4.  It was
always _supposed_ to inherit `sys.path` the same way the spawn method does.

Observable behavior change: A `''` value in `sys.path` will now be replaced in
the forkserver's `sys.path` with an absolute pathname
`os.path.abspath(os.getcwd())` saved at the time that `multiprocessing` was
imported in the parent process as it already was when using the spawn start
method.

A workaround for the bug this fixes was to set PYTHONPATH in the environment
before the forkserver process was started.

gpshead (Member) commented Nov 7, 2024

Thanks! This is an ironic bug. It has been there from the start for the forkserver. I suspect in a lot of environments it just didn't matter as the default path or PYTHONPATH from the environment was correct. But I can easily imagine situations where this would defeat the purpose of preloading or even potentially cause subtle bugs based on "" vs the absolute path determined at parent multiprocessing import time being in that preload's sys.path.
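
The `""` versus absolute path distinction mentioned above can be illustrated with a small sketch. The helper name `absolutize_sys_path` and its parameters are hypothetical; it only models the described behavior (an empty entry replaced by a directory saved earlier) and is not the code from the fix.

```python
import os


def absolutize_sys_path(paths, original_dir):
    """Return a copy of `paths` with the first '' entry made absolute.

    `original_dir` stands in for the absolute working directory saved
    when multiprocessing was imported in the parent (an assumption of
    this sketch; the caller supplies it explicitly here).
    """
    result = list(paths)
    try:
        index = result.index("")
    except ValueError:
        return result  # No '' entry, nothing to replace.
    result[index] = original_dir
    return result


# Usage: capture the directory early, before anything calls os.chdir().
saved_dir = os.path.abspath(os.getcwd())
```

An empty string in `sys.path` is resolved against whatever the current directory is at import time, so a later `os.chdir()` changes what it means; an absolute path does not have that problem.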

@gpshead gpshead added the 3.13 bugs and security fixes label Nov 7, 2024
@gpshead gpshead moved this to In Progress in Multiprocessing issues Nov 7, 2024

doublex (Author) commented Nov 7, 2024

@gpshead
Thank you very much for your efforts!

gpshead added a commit that referenced this issue Nov 9, 2024
…e. (GH-126538)

gh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

`sys.path` was not properly being sent from the parent process when launching
the multiprocessing forkserver process to preload imports.  This bug has been
there since the forkserver start method was introduced in Python 3.4.  It was
always _supposed_ to inherit `sys.path` the same way the spawn method does.

Observable behavior change: A `''` value in `sys.path` will now be replaced in
the forkserver's `sys.path` with an absolute pathname
`os.path.abspath(os.getcwd())` saved at the time that `multiprocessing` was
imported in the parent process as it already was when using the spawn start
method. **This will only be observable during forkserver preload imports**.

The code invoked before calling things in another process already sets `sys.path`
correctly, which is likely why this went unnoticed for so long: in some
configurations it showed up only as a performance issue.

A workaround for the bug on impacted Pythons is to set PYTHONPATH in the
environment before multiprocessing's forkserver process is started. This is not
perfect, as PYTHONPATH is then inherited by other children, etc., but it is likely
good enough for many people's purposes.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
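
The PYTHONPATH workaround described in the commit message above might look roughly like this on an affected Python. It is a sketch under stated assumptions: `prepend_to_pythonpath` and `forkserver_context_with_preload` are hypothetical helper names, and the directories and module names are whatever your application needs.

```python
import multiprocessing
import os


def prepend_to_pythonpath(extra_dirs, environ=os.environ):
    """Put absolute versions of `extra_dirs` at the front of PYTHONPATH."""
    entries = [os.path.abspath(d) for d in extra_dirs]
    existing = environ.get("PYTHONPATH")
    if existing:
        entries.append(existing)
    environ["PYTHONPATH"] = os.pathsep.join(entries)
    return environ["PYTHONPATH"]


def forkserver_context_with_preload(extra_dirs, module_names):
    """Set PYTHONPATH first, then configure forkserver preloading.

    Must run before the forkserver process is started, because the
    forkserver reads the environment when it launches.
    """
    prepend_to_pythonpath(extra_dirs)
    ctx = multiprocessing.get_context("forkserver")
    ctx.set_forkserver_preload(module_names)
    return ctx
```

As the commit message notes, the variable is inherited by every child process started afterwards, so this is a stopgap rather than a substitute for the fix.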
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 9, 2024
…ritance. (pythonGH-126538)

pythongh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

(cherry picked from commit 9d08423)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
gpshead added a commit that referenced this issue Nov 9, 2024
…eritance. (GH-126538) (GH-126632)

gh-117378: Fix multiprocessing forkserver preload sys.path inheritance. (GH-126538)

(cherry picked from commit 9d08423)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
gpshead added a commit to gpshead/cpython that referenced this issue Nov 9, 2024
…th inheritance. (pythonGH-126538)

pythongh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

(cherry picked from commit 9d08423)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
gpshead added a commit to gpshead/cpython that referenced this issue Nov 10, 2024
Docs are hard.  Lets go shopping!
gpshead added a commit that referenced this issue Nov 10, 2024
…eritance. (GH-126538) (GH-126633)

gh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

(cherry picked from commit 9d08423)

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
@gpshead gpshead changed the title from "forkserver.set_forkserver_preload() fails" to "forkserver.set_forkserver_preload() silent preload import failures when sys.path is required" Nov 10, 2024
gpshead added a commit that referenced this issue Nov 10, 2024
…iate (GH-126635)

The first version had it running two forkserver and one spawn tests underneath each of the _fork, _forkserver, and _spawn test suites that build off the generic one.

This adds to the existing complexity of the multiprocessing test suite by offering BaseTestCase classes another attribute to control which suites they are invoked under. Practicality vs purity here. :/

Net result: we don't over-run the new test and their internal logic is simplified.
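
The idea of a class attribute that controls which start-method suites a test is generated for can be sketched as follows. Every name here (`START_METHODS`, `SysPathTest`, `install_tests_for`) is hypothetical and chosen for illustration; this is not the actual multiprocessing test suite code.

```python
import unittest


class BaseTestCase:
    # Hypothetical attribute: start methods this test should run under.
    START_METHODS = {"fork", "forkserver", "spawn"}


class SysPathTest(BaseTestCase):
    # Only meaningful where sys.path is sent to a fresh process.
    START_METHODS = {"forkserver", "spawn"}

    def test_start_method_allowed(self):
        self.assertIn(self.start_method, self.START_METHODS)


def install_tests_for(start_method, bases, namespace):
    """Create a concrete TestCase per base that opts in to `start_method`."""
    for base in bases:
        if start_method not in base.START_METHODS:
            continue  # Skip suites the test did not ask for.
        name = f"{base.__name__}_{start_method}"
        namespace[name] = type(
            name, (base, unittest.TestCase), {"start_method": start_method}
        )
    return namespace
```

Filtering at class-creation time keeps the test body free of per-start-method branching, which matches the "internal logic is simplified" outcome described above.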
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 10, 2024
…ppropriate (pythonGH-126635)

(cherry picked from commit ca878b6)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 10, 2024
…ppropriate (pythonGH-126635)

(cherry picked from commit ca878b6)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
gpshead added a commit that referenced this issue Nov 10, 2024
…appropriate (GH-126635) (GH-126653)

gh-117378: Only run the new multiprocessing SysPath test when appropriate (GH-126635)

(cherry picked from commit ca878b6)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
gpshead added a commit that referenced this issue Nov 10, 2024
…appropriate (GH-126635) (GH-126652)

gh-117378: Only run the new multiprocessing SysPath test when appropriate (GH-126635)

(cherry picked from commit ca878b6)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
gpshead added a commit that referenced this issue Nov 11, 2024
gh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 11, 2024
pythongh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
(cherry picked from commit 5c488ca)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue Nov 11, 2024
pythongh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
(cherry picked from commit 5c488ca)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
@gpshead gpshead closed this as completed Nov 11, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in Multiprocessing issues Nov 11, 2024
gpshead added a commit that referenced this issue Nov 11, 2024
gh-117378: Clear up the NEWS entry wording (GH-126634)

gh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
(cherry picked from commit 5c488ca)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
gpshead added a commit that referenced this issue Nov 11, 2024
gh-117378: Clear up the NEWS entry wording (GH-126634)

gh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
(cherry picked from commit 5c488ca)

Co-authored-by: Gregory P. Smith <greg@krypto.org>
picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024
…ritance. (pythonGH-126538)

pythongh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024
…ppropriate (pythonGH-126635)

picnixz pushed a commit to picnixz/cpython that referenced this issue Dec 8, 2024
pythongh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
…ritance. (pythonGH-126538)

pythongh-117378: Fix multiprocessing forkserver preload sys.path inheritance.

Co-authored-by: Serhiy Storchaka <storchaka@gmail.com>
ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
…ppropriate (pythonGH-126635)

ebonnal pushed a commit to ebonnal/cpython that referenced this issue Jan 12, 2025
pythongh-117378: Clear up the NEWS entry wording.

Docs are hard.  Lets go shopping!
2 participants