virtualenv extends sys.path after sitecustomize.py #1861

Closed
frenzymadness opened this issue Jun 16, 2020 · 7 comments

@frenzymadness
Contributor

Let's imagine a situation when I know exactly what I need to have in sys.path. An obvious choice, in that case, would be to implement it in a sitecustomize.py module, because according to the Python docs, sitecustomize.py or usercustomize.py should be the last place where sys.path is altered before a script starts.

I cannot achieve that with a Python 2 virtualenv because it alters sys.path after the site module is reloaded and the sitecustomize.py module is processed.

Reproducer preparation: simple sitecustomize.py with two debug prints and a regular virtual environment with Python 2.

$ cat ../site/sitecustomize.py 
import sys

print("1", sys.path)

ma, mi = sys.version_info[:2]
sys.path = ["/usr/lib64/python{}.{}/".format(ma, mi), "/usr/lib64/python{}.{}/lib-dynload/".format(ma, mi), "/booo"]

print("2", sys.path)

$ virtualenv --version
virtualenv 20.0.23 from /usr/local/lib/python3.9/site-packages/virtualenv/__init__.py

$ virtualenv --no-setuptools --no-pip --no-wheel -p python2 testvenv
created virtual environment CPython2.7.18.final.0-64 in 127ms
  creator CPython2Posix(dest=/root/home/testvenv, clear=False, global=False)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator

When I run the Python from the testvenv with sitecustomize.py loaded, virtualenv adds its site-packages paths into sys.path after sitecustomize.py is loaded:

$ PYTHONPATH=../site/ testvenv/bin/python
('1', ['/root/site', '/usr/lib/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/root/home/testvenv/lib64/python2.7/site-packages', '/root/home/testvenv/lib/python2.7/site-packages'])
('2', ['/usr/lib64/python2.7/', '/usr/lib64/python2.7/lib-dynload/', '/booo'])
Python 2.7.18 (default, May  7 2020, 00:00:00) 
[GCC 10.0.1 20200430 (Red Hat 10.0.1-0.14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib64/python2.7/', '/usr/lib64/python2.7/lib-dynload/', '/booo', '/root/home/testvenv/lib64/python2.7/site-packages', '/root/home/testvenv/lib/python2.7/site-packages']
>>> 

The code causing this is here (note the order of the statements):

here = __file__ # the distutils.install patterns will be injected relative to this site.py, save it here
# ___RELOAD_CODE___
# and then if the distutils site packages are not on the sys.path we add them via add_site_dir; note we must add
# them by invoking add_site_dir to trigger the processing of pth files
import os
site_packages = r"""
___EXPECTED_SITE_PACKAGES___
"""
import json
add_site_dir = sys.modules["site"].addsitedir
for path in json.loads(site_packages):
    full_path = os.path.abspath(os.path.join(here, path.encode("utf-8")))
    if full_path not in sys.path:
        add_site_dir(full_path)

It's not possible to simply move the reload call to the end, because the site module needs to be reloaded earlier so that its addsitedir function is available.
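The ordering described above can be modelled with a small sketch. This is not virtualenv's code. `simulate_startup` and `strict_sitecustomize` are hypothetical names, and a plain `append` stands in for `addsitedir` (which also processes .pth files). The sketch only shows why a path list replaced in sitecustomize still ends up with the site-packages directories appended.

```python
def simulate_startup(initial_path, sitecustomize, expected_site_packages):
    """Model the order of operations in the Python 2 site.py template quoted above."""
    path = list(initial_path)
    # Step 1: reloading the real site module ends with importing sitecustomize,
    # which is free to replace the path wholesale.
    path = sitecustomize(path)
    # Step 2: the fixup loop runs afterwards and re-adds every expected
    # site-packages directory that is missing (simplified: append instead of addsitedir).
    for directory in expected_site_packages:
        if directory not in path:
            path.append(directory)
    return path


def strict_sitecustomize(path):
    # Mirrors the reproducer: ignore the incoming path and set an exact list.
    return ["/usr/lib64/python2.7/", "/usr/lib64/python2.7/lib-dynload/", "/booo"]


site_packages = [
    "/root/home/testvenv/lib64/python2.7/site-packages",
    "/root/home/testvenv/lib/python2.7/site-packages",
]
result = simulate_startup(["/root/site"] + site_packages, strict_sitecustomize, site_packages)
print(result)
```

The printed list has the three entries set by the customization followed by both site-packages directories, which is the shape of the sys.path shown in the interactive session above (minus the leading empty string added for interactive use).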

Context: sitecustomize is used in pip tests to create an isolated virtual environment but with the latest virtualenv, it isn't isolated.

@gaborbernat
Contributor

sitecustomize.py or usercustomize.py should be the last place where sys.path is altered before a script starts

Can you quote this? I can't seem to find a reference to this.

@frenzymadness
Contributor Author

Moreover, this does not happen for Python3:

$ virtualenv --no-setuptools --no-pip --no-wheel -p python3 testvenv3
created virtual environment CPython3.9.0.beta.1-64 in 77ms
  creator CPython3Posix(dest=/root/home/testvenv3, clear=False, global=False)
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator
$ PYTHONPATH=../site/ testvenv3/bin/python
1 ['/root/site', '/usr/lib64/python39.zip', '/usr/lib64/python3.9', '/usr/lib64/python3.9/lib-dynload', '/root/home/testvenv3/lib64/python3.9/site-packages', '/root/home/testvenv3/lib/python3.9/site-packages']
2 ['/usr/lib64/python3.9/', '/usr/lib64/python3.9/lib-dynload/', '/booo']
Python 3.9.0b1 (default, May 29 2020, 00:00:00) 
[GCC 10.1.1 20200507 (Red Hat 10.1.1-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib64/python3.9/', '/usr/lib64/python3.9/lib-dynload/', '/booo']
>>>

@gaborbernat
Contributor

Moreover, this does not happen for Python3:

This is kinda irrelevant. Python 3 has venv, so it has better semantics and expectations of what a virtual environment is. Python 2, on the other hand, does not offer such guarantees, so it's a bit less strict about what should happen, and a lot cannot be guaranteed given the limits on what you can do within the system.

gaborbernat added the question label and removed the bug label Jun 16, 2020
@gaborbernat
Contributor

Let's imagine a situation when I know exactly what I need to have in sys.path

This is not how virtual environments work, at least not with virtualenv. We offer strict guarantees about what must be on sys.path for a newly created virtual environment. purelib/platlib are guaranteed to be on sys.path after creation. Removing these elements from sys.path during startup (inside the site/user customize) puts you out of contract, because you remove these guarantees. On Python 3 we don't perform this check, but if anything that's the bug here, rather than the fact that we do perform it on Python 2.

@frenzymadness
Contributor Author

Can you quote this? I can't seem to find a reference to this.

The last three paragraphs in the docs are:

After these path manipulations, an attempt is made to import a module named sitecustomize, which can perform arbitrary site-specific customizations. …

After this, an attempt is made to import a module named usercustomize, which can perform arbitrary user-specific customizations, if ENABLE_USER_SITE is true. …

Note that for some non-Unix systems, sys.prefix and sys.exec_prefix are empty, and the path manipulations are skipped; however the import of sitecustomize and usercustomize is still attempted.

which suggests to me that these imports should be the last things to happen at the end of the site module.

I understand your point of view, but I also think that it removes flexibility. I am not saying it's a bad thing to have guarantees about what is in sys.path, but it should be done in a different order so that the flexibility to manipulate sys.path is preserved without any effect on regular users. It does not affect you if you only need to add a path to sys.path in sitecustomize, but if you need to remove a path or set a specific order, there is no way to do that now without virtualenv rewriting sys.path.

@gaborbernat
Contributor

gaborbernat commented Jun 16, 2020

The quotes explain how things happen, and the way I read them, at no point do they require that no path manipulation happens after site/usercustomize.

if you need to remove a path or set a specific order, there is no way to do that now without virtualenv rewriting sys.path.

This is true only for Python 2 though. Not sure what's a good solution here. That code needs to run to maintain guarantees, and to defend against already-in-place OS path customization in Python versions that are unlikely to change going forward.

Perhaps the fixup operation could be skipped if a certain _VIRTUALENV_PYTHON2_SKIP_SYS_PATH_FIXUP environment variable is set. Realistically, I'm not sure that I've seen any other use case for this other than pip's test suite. Even better, you could perhaps create a virtualenv plugin that handles this niche use case, so as not to complicate core virtualenv logic for this edge case.
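As a rough illustration of the opt-out idea, here is a minimal sketch. It is purely hypothetical: the environment variable is only the name floated in this comment, not an existing virtualenv option, and `fixup_sys_path` is an invented helper that models the fixup loop with a plain `append` rather than `addsitedir`.

```python
import os

# Name proposed in this thread; assumed here, not an existing virtualenv setting.
SKIP_VAR = "_VIRTUALENV_PYTHON2_SKIP_SYS_PATH_FIXUP"


def fixup_sys_path(path, expected_site_packages, environ=None):
    """Return the path after the (hypothetical) guarded fixup step."""
    environ = os.environ if environ is None else environ
    if environ.get(SKIP_VAR):
        # Opt-out requested: leave whatever sitecustomize produced untouched.
        return list(path)
    fixed = list(path)
    for directory in expected_site_packages:
        if directory not in fixed:
            fixed.append(directory)
    return fixed


customized = ["/usr/lib64/python2.7/", "/booo"]
expected = ["/root/home/testvenv/lib/python2.7/site-packages"]

default_result = fixup_sys_path(customized, expected, environ={})
skipped_result = fixup_sys_path(customized, expected, environ={SKIP_VAR: "1"})
print(default_result)
print(skipped_result)
```

With the variable unset, the guarantee is kept and the site-packages directory is re-added. With it set, the customized list comes back unchanged, which is the behaviour pip's test setup was after.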

frenzymadness added a commit to frenzymadness/pip that referenced this issue Jun 17, 2020
The latest virtualenv does some dark magic for sys.path in
Python 2 so the BuildEnvironment is no longer isolated here.

See: pypa/virtualenv#1861
@gaborbernat
Contributor

The original question has been answered and we did not identify any actionable follow-up, so I'll close it for now.

frenzymadness added a commit to frenzymadness/pip that referenced this issue Jul 28, 2020
The latest virtualenv does some dark magic for sys.path in
Python 2 so the BuildEnvironment is no longer isolated here.

See: pypa/virtualenv#1861
frenzymadness added a commit to frenzymadness/pip that referenced this issue Sep 1, 2020
The latest virtualenv does some dark magic for sys.path in
Python 2 so the BuildEnvironment is no longer isolated here.

See: pypa/virtualenv#1861
@pypa pypa locked and limited conversation to collaborators Jan 14, 2021