Fails if dependencies include .pth files #381

Open
tmontes opened this issue Apr 26, 2020 · 8 comments
Labels
bug A crash or error in behavior.

Comments

@tmontes
Contributor

tmontes commented Apr 26, 2020

Context

As a Mu Editor contributor, and having worked a bit on improving its packaging, I joined the sprint today in order to understand how easy/difficult it will be for Mu to be packaged with the newer briefcase 0.3 release.

Facts about Mu packaging:

  • macOS Application Bundle produced with briefcase 0.2.x.
  • Windows installer produced with pynsist.
  • PyPI wheel produced with wheel + setuptools.
  • Also included in Debian, Ubuntu, and Raspbian package repositories. Maybe Fedora, too. Not 100% sure because Linux distribution packaging has been, AFAICT, handled outside of the development tree (or maybe it hasn't had much love, in the last 12 months or so).

Challenges about Mu packaging:

  • Having a single source of truth WRT dependency specification and metadata.
  • The current solution is setuptools based and all of the information sourced from setup.py + setup.cfg (from there, we use a non-trivial script to produce the necessary pynsist input such that it can do its thing, on Windows).

The Issue

Packaging Mu on Windows leads to a partially ok Mu installation, for several reasons. In this issue, I'm focusing on a failure that results when trying to bring up its Python REPL -- it leads to a crash (Mu's fault) because a particular module fails to import, resulting in an unhandled exception.

Specifics:

  • Mu's REPL is based on qtconsole that, on Windows, ends up requiring pywin32.
  • pywin32 uses .pth files to guide site.py in populating sys.path in a specific way.

Bottom line:

  • Briefcase packaged Windows applications that depend on pywin32 fail to import win32api, one of its modules.

Investigation

After a few hints from @freakboy3742 and @dgelessus at Gitter, here's what's going on:

  • The Python support package being used is the default one: one of the "embeddable zip file" packages from https://www.python.org/downloads/windows/. (FTR, all of this was explored with Python 3.6.8 64 bit).
  • In particular it includes a pythonXX._pth that:
  • Such pythonXX._pth file is actually overwritten by briefcase in order to:
    • Add the src\app and src\app_packages to sys.path such that both the application and its dependencies can be successfully imported.
  • However, the presence of the ._pth file:
    • Prevents the site module from being loaded at startup...
    • ...which would be responsible for populating sys.path from any .pth files that are found on the site-packages directories.
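
To check these observations on a given interpreter, a small diagnostic along the following lines can help. It is only a sketch: describe_startup is a hypothetical helper name, and the values it reports will depend on how the interpreter was started and on whether a ._pth file is present.

```python
import sys


def describe_startup():
    """Collect the interpreter state that matters for .pth handling."""
    return {
        # True if the site module has been imported (at startup or later).
        "site_imported": "site" in sys.modules,
        # Flags as reported by the interpreter itself.
        "no_site_flag": bool(sys.flags.no_site),
        "isolated_flag": bool(sys.flags.isolated),
        # Copy of the current module search path.
        "path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in describe_startup().items():
        print(key, "=", value)
```

Running it with the bundled python.exe, before and after editing pythonXX._pth, shows whether site was loaded in each configuration.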

A Successful Hack

With this information, I invested some time fiddling with these things to see if, at least, I could make it work. Here's what I did that resulted in a working Mu REPL (thus, its underlying import win32api working!):

  • Hacked the cookiecutter template's briefcase.toml file (took me quite a while to figure out where this file was coming from!!!):

    • Set app_packages_path to src\python\lib\site-packages, instead.
  • Then, hacked briefcase's overwriting of pythonXX._pth to produce:

    pythonXX.zip
    .
    lib\site-packages
    import site
    ..\app
    
    • This lets Python find the Standard Library with the first 2 lines...
    • ...find application dependencies with the 3rd line...
    • ...has site imported such that .pth files in the application dependencies are handled...
    • ...lastly adds the application package path such that it can be imported and run.
  • Lastly, I observed that having site imported led to an over-populated, non-safe sys.path. For some reason, my local Python installation's site-packages was being added, and then maybe some more.

  • With that, the last step of the hack, was creating a sitecustomize.py, which is automatically picked up when site is imported per the revamped pythonXX._pth. Here's my take:

    import re
    import sys
    
    _IS_LOCAL = re.compile(r'(src\\python|src\\app)', re.IGNORECASE)
    
    sys.path = [path for path in sys.path if _IS_LOCAL.search(path)]
    

With these three changes, the import win32api in Mu succeeds and, ultimately, Mu works as intended WRT providing a Python REPL.

Thoughts

  • Handling .pth files in dependencies is a must, I'd venture saying.
  • My approach is the result of fiddling and exploring and I don't like it very much (but it works!). It feels unnecessarily complicated and, thus, brittle.
  • Somewhat orthogonal, but related, maybe having a venv to which dependencies are pip installed instead of pip install --targeted will be more robust, at least for the platforms where that is feasible. No worries about playing with import PATHs in three distinct places (well, maybe some PATH cleaning up could be in order, to guarantee isolation -- see my sitecustomize.py, above).

All in all, I leave this issue here in the hope that some useful discussion comes out of it, while serving as a hack-guide to anyone facing similar failures. I'm not sure yet about what could be a solid, sustainable, simple, future-proof solution.

Shall we discuss? :)

PS: I suppose similar failures will happen on any platform, as long as .pth files are brought in by dependencies.

@tmontes
Contributor Author

tmontes commented Apr 26, 2020

NOTE: The sitecustomize.py I pasted in the previous comment is too aggressive -- works with briefcase run but apparently fails after briefcase package MSI installation. Realized that after the fact. Will come back and paste something that works and excludes non application/dependency/bundled standard library paths. Needs investigation. :)
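
One possible explanation, judging by the installed paths shown later in this thread, is that the regular expression only matches the src\python and src\app layout of the build tree, which the installed application does not use. A less layout-dependent filter could compare each entry against the application root instead. The following is a minimal sketch, not a tested fix: keep_local is a hypothetical helper, and how app_root would be determined (for example, from the parent of the directory holding python.exe) is an assumption.

```python
import os


def keep_local(paths, app_root):
    """Return only the entries of *paths* located inside *app_root*."""
    root = os.path.normcase(os.path.abspath(app_root))
    kept = []
    for entry in paths:
        candidate = os.path.normcase(os.path.abspath(entry))
        try:
            inside = os.path.commonpath([root, candidate]) == root
        except ValueError:
            # Raised, for example, when the paths are on different Windows drives.
            inside = False
        if inside:
            kept.append(entry)
    return kept
```

In a sitecustomize.py this could be applied as sys.path[:] = keep_local(sys.path, app_root), once app_root is known.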

@freakboy3742 freakboy3742 added the bug A crash or error in behavior. label Apr 27, 2020
@freakboy3742
Member

Thanks for the thorough investigation and writeup!

For background: app_packages was introduced because it allowed us to isolate the support package from the dependencies. That's not a huge concern for "normal" Python installs because the Python interpreter is installed once and packages are installed into that interpreter (or a virtual environment); but in an app packaging world, updating the support package is something you're more likely to want to do independent of dependencies. I'm not fundamentally opposed to breaking this separation and using site_packages; but it's worth being aware what the consequence of that decision would be.

That said - the remaining fixes all seem like (a) a good set of changes, and (b) not fundamentally incompatible with using a separate app_packages folder - we just need to add app_packages to the python3.X._pth file.

The fact that your local Python's path is being added to sys.path is definitely odd - and definitely something we want to avoid; the site path filtering seems like an interesting approach, although I guess the real fix is to work out why the extra path elements are leaking into sys.path in the first place.

Ideally, these changes wouldn't be something baked into the Briefcase sources either - they'd be something included in the briefcase template so that an end-user can easily customize the contents.

@tmontes
Contributor Author

tmontes commented Apr 28, 2020

Thanks for the feedback.

> That said - the remaining fixes all seem like (a) a good set of changes, and (b) not fundamentally incompatible with using a separate app_packages folder - we just need to add app_packages to the python3.X._pth file.

It's already there, in the current master! But that isn't enough, apparently. My understanding:

  • The ._pth file prevents site from being imported (unless it explicitly imports it).
  • But site runs the code that processes the regular .pth files in site-packages.

> The fact that your local Python's path is being added to sys.path is definitely odd - and definitely something we want to avoid; the site path filtering seems like an interesting approach, although I guess the real fix is to work out why the extra path elements are leaking into sys.path in the first place.

Yet to clarify this.

  • Is it because site is being explicitly brought in, in my hacked ._pth file?
  • How is sys.path different if no ._pth file is ever there?

> Ideally, these changes wouldn't be something baked into the Briefcase sources either - they'd be something included in the briefcase template so that an end-user can easily customize the contents.

Understood and generally agreed (even though, as a side comment, I'd like to have briefcase embed/bundle the "default" templates itself, such that it can be used completely offline -- a whole different topic and subsequent discussion). :)

I will investigate further and share my findings.

@tmontes
Contributor Author

tmontes commented Apr 28, 2020

Investigation

Environment:

  • Windows 10 + www.python.org's 64 bit Python 3.6 installed for current user at C:\Users\test\AppData\Local\Programs\Python\Python36.
  • Working with current Mu Editor and briefcase master.
  • Created a minimal pyproject.toml file.
  • Repository root at C:\Users\test\work\github.com\mu.
  • Working with hacked briefcase-windows-msi-template that produces per-user MSI installers (see User installable MSI packages? #382).

Objective:

  • Must be able to import PyQt5 and win32api.
  • sys.path must not include PATHs outside of the application directory.

Starting Point: current master

Package Source Observations

(after running briefcase package)

C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app_packages'

>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\\\app_packages\\PyQt5\\__init__.py'>

>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path looks good.
  • Imports PyQt5 from the right source.
  • Fails at importing win32api.

User-Installed Package Observations

(after installing MSI package)

C:\Users\test>"AppData\Local\Programs\Mu Editor\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python\\python36.zip'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app_packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path looks good.
  • Imports PyQt5 from the right source.
  • Fails at importing win32api.

Where to, next?

Facts:

  • The site module processes .pth files (used by pywin32 to provide win32api, here).
  • The presence of python36._pth prevents site from being auto-imported.
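
The first fact can be seen in isolation with the documented site.addsitedir function. The sketch below builds a throwaway directory with a .pth file and asks site to process it. The names demo_pth_processing, demo.pth and win32 are illustrative stand-ins only, not what pywin32 actually ships.

```python
import os
import site
import sys
import tempfile


def demo_pth_processing():
    """Create a throwaway directory with a .pth file and let site process it."""
    sitedir = tempfile.mkdtemp()
    extra = os.path.join(sitedir, "win32")  # stand-in for a directory named in a .pth file
    os.mkdir(extra)
    with open(os.path.join(sitedir, "demo.pth"), "w") as handle:
        handle.write("win32\n")  # one path per line, relative to sitedir

    before = set(sys.path)
    site.addsitedir(sitedir)  # adds sitedir and processes the .pth files inside it
    return [entry for entry in sys.path if entry not in before]


if __name__ == "__main__":
    for entry in demo_pth_processing():
        print(entry)
```

The returned list holds the entries that were appended to sys.path, which should include the directory named in the .pth file.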

Options:

  1. Add import site to python36._pth.
  2. Drop the python36._pth file.

If we go with 2., a way of adding src\app and src\app_packages to sys.path must be put in place. Options:

  • Create a sitecustomize.py that adds them to the PATH.
  • Add a .pth file in src\python\lib\site-packages pointing to them.

Option 1 - Add import site to python36._pth

Contents of hacked python36._pth after the change:

python36.zip
.
..\\app
..\\app_packages
import site

Package Source Observations

(after running briefcase package)

C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app_packages'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path now includes a foreign PATH.
  • Imports PyQt5 from the right source.
  • Fails at importing win32api.

User-Installed Package Observations

(after installing MSI package)

C:\Users\test>"AppData\Local\Programs\Mu Editor\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python\\python36.zip'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\python'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app'
'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app_packages'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
>>> PyQt5
<module 'PyQt5' from 'C:\\Users\\test\\AppData\\Local\\Programs\\Mu Editor\\app_packages\\PyQt5\\__init__.py'>
>>>
>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path now includes a foreign PATH.
  • Imports PyQt5 from the right source.
  • Fails at importing win32api.

Add import site to python36._pth summary

Positive:

  • Nothing.

Negative:

  • sys.path now polluted.

Thoughts:

  • For some reason, importing site, did not process the .pth files in src\app_packages.
  • Tried bringing the import site line in python36._pth up but observed nothing different.
  • Adding a sitecustomize.py from this point on might help -- TODO?

Option 2 - Drop the python36._pth file.

Package Source Observations

(after running briefcase package)

C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
''
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\DLLs'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\lib'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
'C:\\Users\\test\\AppData\\Roaming\\Python\\Python36\\site-packages'
>>>
>>> import PyQt5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'PyQt5'
>>>
>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path missing app and app_packages and polluted with '' and local python installation site-packages.
  • None of the PyQt5 / win32api imports work.

User-Installed Package Observations

(didn't even try)

Drop the python36._pth file summary

Positive:

  • Nothing.

Negative:

  • sys.path missing PATHs and polluted
  • None of the imports work.

Thoughts:

  • No solution was expected from this: just focused on observing behaviour.
  • Will try using the default www.python.org supplied python36._pth next.
  • Adding a sitecustomize.py from this point on might help -- TODO?

Option 2a - Default python36._pth file.

Contents, as supplied in the Python embeddable package:

python36.zip
.

# Uncomment to run site.main() automatically
#import site

Package Source Observations

(after running briefcase package)

C:\Users\test\work\github.com\mu>"windows\Mu Editor\src\python\python.exe" -q
>>> import sys
>>> print(*map(repr, sys.path), sep='\n')
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python\\python36.zip'
'C:\\Users\\test\\work\\github.com\\mu\\windows\\Mu Editor\\src\\python'
>>>
>>> import PyQt5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'PyQt5'
>>>
>>> import win32api
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'win32api'

Summary:

  • sys.path looking good, but obviously missing PATHs.
  • None of the PyQt5 / win32api imports work.

User-Installed Package Observations

(didn't even try)

Default python36._pth file summary

Positive:

  • Nothing.

Negative:

  • Imports don't work.

Thoughts:

  • No solution was expected from this: just focused on observing behaviour.
  • Adding import site to python36._pth and sitecustomize.py from this point might help -- TODO?

@tmontes
Contributor Author

tmontes commented Apr 28, 2020

Stop and Think

Issue

From the experiments above, as soon as site is auto-imported -- either by explicitly adding it to python36._pth, or by dropping that file completely -- sys.path becomes polluted with non-local PATHs.

Explanation:

  • site does it when it finds a user base and site-packages by looking at the result of sysconfig.get_config_var('userbase') and sysconfig.get_path('purelib', 'nt_user') - see code in getuserbase and getusersitepackages.
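
To see which user-level locations site would consider on a given machine, the values can be inspected directly. This is a small sketch using documented site and sysconfig calls; user_site_info is a hypothetical helper name, and the values are machine-specific.

```python
import site
import sysconfig


def user_site_info():
    """Report the user base and user site-packages locations known to site."""
    return {
        # Whether site considers the user site directory enabled.
        "enable_user_site": site.ENABLE_USER_SITE,
        # Base directory for per-user installs.
        "user_base": site.getuserbase(),
        # Per-user site-packages directory (the "foreign" entry seen above).
        "user_site": site.getusersitepackages(),
        # The same base directory, as seen through sysconfig.
        "config_userbase": sysconfig.get_config_var("userbase"),
    }


if __name__ == "__main__":
    for key, value in user_site_info().items():
        print(key, "=", value)
```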

Issue

Understand how/when site processes .pth files and why our option 1 above -- adding import site to python36._pth -- apparently did not process the .pth files in src\app_packages.

Explanation:

  • .pth file processing is handled in addsitedir that delegates work to addpackage.
  • addsitedir is called by addsitepackages and addusersitepackages.
  • Both are called from main that is called on import, conditionally here:
    if not sys.flags.no_site:
        main()
    
  • One could wonder if main actually ran: it feels safe saying so, given that our option 1 resulted in a polluted sys.path that only site could have achieved.
  • Thus, for some reason, addsitedir was never called with the custom PATHs in python36._pth: ..\app and ..\app_packages.

...time passes ...code is read ...hacked with ...and print-debugged (is there a better way?) :-)

Culprit:

  • site is indeed imported.
  • main is indeed run.
  • addsitepackages is called, however:
    • The custom PATHs from python36._pth are present in the known_paths argument.
    • The code only adds new PATHs -- sourced from getsitepackages -- and only those new PATHs are passed to addsitedir.
    • Thus, whichever PATHs are in sys.path when site is imported are never processed for .pth files.

Status

Apparent Scenario

  • Must auto-import site such that .pth files are handled.
  • The custom PATHs in python36._pth are not processed for .pth files.
  • This pollutes sys.path that will need cleaning.

Possible ways forward

A. "Kind of ugly" option

  • Add import site to current python36._pth, with the custom PATHs.
  • Create a sitecustomize.py that both (a) calls site.addpackage to process .pth files in ..\app_packages to further populate sys.path and (b) cleans up the polluted sys.path.
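
Part (a) of option A could look roughly like the sketch below. Note the swap: it calls the documented site.addsitedir rather than site.addpackage directly, since addsitedir is the entry point that delegates to addpackage for each .pth file. process_pth_files is a hypothetical helper name, the app_packages directory name is taken from this thread, and none of this has been tested inside a Briefcase-built application.

```python
import os
import site
import sys


def process_pth_files(dirname="app_packages"):
    """Run site.addsitedir on every sys.path entry whose last component is *dirname*."""
    processed = []
    for entry in list(sys.path):  # iterate over a copy: addsitedir may append to sys.path
        last = os.path.basename(os.path.normpath(entry))
        if last.lower() == dirname.lower():
            # Processes the .pth files in the directory even though the
            # directory itself is already on sys.path.
            site.addsitedir(entry)
            processed.append(entry)
    return processed
```

Calling process_pth_files() from sitecustomize.py, followed by a sys.path cleanup for part (b), would cover both halves of the option.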

B. "Might be nice but won't work" option

  • Remove custom PATHS from python36._pth and add import site to it.
  • Add a src\python\lib\site-packages\briefcase.pth with relative paths to src\app and src\app_packages.
  • Would be elegant but .pth files are not processed recursively. Thus, the .pth files in src\app_packages -- the ones we really care about -- would not be processed.
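
The non-recursive behaviour can be checked with a throwaway directory tree. This is a sketch only; demo_no_recursion and the directory and file names are made up for illustration.

```python
import os
import site
import sys
import tempfile


def demo_no_recursion():
    """Check whether .pth files inside a .pth-added directory are processed."""
    outer = tempfile.mkdtemp()
    inner = os.path.join(outer, "inner")
    deeper = os.path.join(inner, "deeper")
    os.makedirs(deeper)

    # outer.pth adds "inner"; inner.pth would add "deeper" if it were processed.
    with open(os.path.join(outer, "outer.pth"), "w") as handle:
        handle.write("inner\n")
    with open(os.path.join(inner, "inner.pth"), "w") as handle:
        handle.write("deeper\n")

    site.addsitedir(outer)
    return inner in sys.path, deeper in sys.path


if __name__ == "__main__":
    print(demo_no_recursion())
```

If the reading of site above is right, the first value is True and the second is False: the directory named by outer.pth is added, but the .pth file inside it is never read.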

(this was a close one! oh, frustration!)

C. "Not sure if it's really that bad, but don't like it very much" option

  • Set briefcase.toml's app_packages_path to src\python\lib\site-packages.
  • Remove custom PATHS from python36._pth and add import site to it.
  • Create a sitecustomize.py that cleans up sys.path from site-polluted entries.

(may limit updating the support package, like @freakboy3742 noted above -- then again, maybe not: support package isn't supposed to touch ...\lib\site-packages which is "local" by definition).

D. "What about a venv, which feels solid, but will probably be a mess" option.

  • Create a virtual environment to host the application and dependencies.
  • Use that venv's python to install dependencies.
  • Move/copy application package into that venv's site-packages -- why not?
  • No PATH fiddling, I suppose -- wondering if venv's sys.path would include foreign PATHs?
  • Why this isn't as clean/easy as it might be:
    • The support package's python does not include the venv module on windows. :(
    • No way this could be done when targeting a foreign architecture -- can't run python to create venv and then pip install things.

E. "Is there any other option" option

  • Take a rest and see if letting the mind out of this for a while helps ideas settle down and/or pop.

:-)

@monzelr

monzelr commented Apr 5, 2022

I had the same issue with packaging pywin32 in Briefcase and, as @tmontes correctly figured out, *.pth files are correctly packed by Briefcase, but the link to those *.pth files is wrong.

The python package site and its variable site.USER_SITE point to an incorrect path (for me it was C:\\Users\\RuneMonzel\\AppData\\Roaming\\Python\\Python38\\site-packages) and thus all .pth in the app_packages folder will not be loaded correctly. However, python's documentation says you can modify the import behaviour with the site package: https://docs.python.org/3/library/site.html

This fix should be applied if any module does extra imports via a .pth file, which is normally located in site-packages (or when using Briefcase in app_packages).
A fast fix:

import sys

try:
    import win32con, win32event, win32process
    from win32com.shell.shell import ShellExecuteEx
    from win32com.shell import shellcon
    
except ModuleNotFoundError:
    print("Try to find 'app_packages' folder and to add this to python's 'site' package.")
    app_packages = ""
    for path in sys.path:
        if path.endswith("app_packages"):
            app_packages = path
    if app_packages == "":
        raise ModuleNotFoundError
    else:
        
        import site
        site.USER_SITE = app_packages  # correct the 'site-packages' path to 'app_packages' path
        site.main()  # recall site package thus all .pth in 'app_packages' will be add to sys.path
        
        import win32con, win32event, win32process
        from win32com.shell.shell import ShellExecuteEx
        from win32com.shell import shellcon
        
    for path in sys.path:
        print(path)

The output of the code above shows that sys.path is now extended with the paths found in pywin32.pth:

Try to find 'app_packages' folder and to add this to python's 'site' package.
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\python\python38.zip
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\python
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\win32
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\win32\lib
C:\Users\RuneMonzel\AppData\Local\Programs\AutoLucid\app_packages\Pythonwin

@freakboy3742 :
I am wondering if it is possible to change the site.USER_SITE variable to the folder app_packages in the standard cookie-cutter template? Thus all python modules with .pth dependencies would be imported normally.

UPDATE: The following does not work because pythonXX._pth only accepts import site:
One way would be importing an extra module in pythonXX._pth, like import sitecustomize which calls this kind of code:

import sys
import site
app_packages = ""
for path in sys.path:
    if path.endswith("app_packages"):
        app_packages = path
if app_packages != "":
    site.USER_SITE = app_packages  # correct the 'site-packages' path to 'app_packages' path
    site.main()  # recall site package thus all .pth in 'app_packages' will be called

@freakboy3742
Member

@monzelr Thanks for the extra detail. I think this may be tracking the same problem as #669; and yes - this is absolutely something that should be fixed. The general approach you've described makes sense; we'll need to find a good place to drop a sitecustomize script so it is picked up on all platforms.

@freakboy3742
Member

#669 has a more direct test/manifestation of this; while the report here is windows specific, it's likely USER_SITE problems exist on every platform.
