use prepare_metadata_for_build_wheel instead of parsing setup.py #2296

finswimmer · 2020-04-12T12:54:30Z

This PR is a complete rework of how poetry gathers the metadata about non-poetry projects that is necessary for dependency resolution when no sdist or wheel is available.

In the original implementation this was done by trying to parse setup.py or setup.cfg. This approach is strongly limited to situations where version and dependencies are not determined at runtime. In addition, poetry has to run with python3, because the ast module for python2 works in a different way.
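To illustrate the limitation, here is a minimal sketch of static parsing with the standard `ast` module. It is not poetry's actual SetupReader; the helper name `static_install_requires` and both sample setup files are made up for this example. A literal `install_requires` can be read without running anything, while a value computed at runtime cannot.

```python
import ast


def static_install_requires(setup_source):
    """Return install_requires if it is a literal in the setup() call, else None."""
    tree = ast.parse(setup_source)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        # Match both `setup(...)` and `setuptools.setup(...)`.
        func = node.func
        name = getattr(func, "id", None) or getattr(func, "attr", None)
        if name != "setup":
            continue
        for keyword in node.keywords:
            if keyword.arg == "install_requires":
                try:
                    return ast.literal_eval(keyword.value)
                except ValueError:
                    # The value is computed at runtime; static parsing gives up.
                    return None
    return None


# Hypothetical setup.py with a literal dependency list.
STATIC_SETUP = '''
from setuptools import setup
setup(name="demo", version="1.0", install_requires=["requests>=2.0"])
'''

# Hypothetical setup.py whose dependencies are only known when it is executed.
DYNAMIC_SETUP = '''
from setuptools import setup
def read_requirements():
    return open("requirements.txt").read().splitlines()
setup(name="demo", version="1.0", install_requires=read_requirements())
'''

static_result = static_install_requires(STATIC_SETUP)
dynamic_result = static_install_requires(DYNAMIC_SETUP)
```

The second case is the one that a metadata hook can handle and a pure parser cannot.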

Build tools like setuptools that implement PEP 517 should provide a hook, prepare_metadata_for_build_wheel, for getting access to the metadata one needs for dependency resolution.

To gain access to this API hook, the following steps are needed:

1. create a separate environment in which the build tool is installed
2. run prepare_metadata_for_build_wheel for the build backend on the folder containing the package
3. parse the output with importlib.metadata
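The last step can be sketched on its own. The example below does not call any build backend: it writes a hand-made `.dist-info` directory as a stand-in for the hook's output (the package name, version and requirements are invented) and reads it back with `importlib.metadata.PathDistribution`. The helper `read_metadata` is hypothetical, not part of this PR.

```python
import tempfile
from importlib.metadata import PathDistribution
from pathlib import Path

# Hand-written stand-in for what a backend's metadata hook would write.
METADATA = """\
Metadata-Version: 2.1
Name: demo
Version: 1.2.3
Requires-Dist: python-dateutil (<3.0,>=2.6)
Requires-Dist: typing (<4.0,>=3.6) ; python_version < "3.5"
"""


def read_metadata(dist_info_dir):
    """Read name, version and requirements from a .dist-info directory."""
    dist = PathDistribution(Path(dist_info_dir))
    return {
        "name": dist.metadata["Name"],
        "version": dist.version,
        # `requires` is None when no Requires-Dist entries exist.
        "requires": list(dist.requires or []),
    }


with tempfile.TemporaryDirectory() as tmp_dir:
    dist_info = Path(tmp_dir) / "demo-1.2.3.dist-info"
    dist_info.mkdir()
    (dist_info / "METADATA").write_text(METADATA, encoding="utf-8")
    result = read_metadata(dist_info)
```

On Python versions older than 3.8 the same class would come from the `importlib_metadata` backport, as the diff in this PR does.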

A lot of this work is already done in the pep517 package.

This implementation has the following advantages:

- metadata for complex setup.py files can be obtained
- the implementation is not necessarily limited to setuptools; virtually any PEP 517 compliant build system can be integrated easily
- the code is less complex

The only downside is that the whole process takes more time than parsing. In my opinion this is negligible compared to the advantages.

Pull Request Check List

This is just a reminder about the most common mistakes. Please make sure that you tick all appropriate boxes. But please read our contribution guide at least once, it will save you unnecessary review cycles!

- [x] Added tests for changed code.
~~- [ ] Updated documentation for changed code.~~

pradyunsg · 2020-04-12T13:00:26Z

This seems to be reimplementing a lot of the functionality from pep517. Consider directly using pep517 for the implementation here?

finswimmer · 2020-04-12T16:12:51Z

@pradyunsg Your comment was fast :)

At first I was a bit confused about what you meant by pep517. After reading PEP 517 once again I found this part:

A Python library will be provided which frontends can use to easily call hooks this way.

And then I found this: https://pypi.org/project/pep517/ I guess this is what you meant?

Looks promising. Will have a look at this.

Thanks a lot! 👍

pradyunsg · 2020-04-12T22:10:24Z

We're all kinds of bad at naming things. :)

I was on mobile, so linking to the PyPI page was... possible but annoyingly difficult. Sorry!

abn

Overall, I think this is a step in the right direction. The AST parser was too fragile anyway. One minor thing to note is that this will remove the improvements from #2257, which I think is okay, because we are receiving more accurate results.

A few minor questions and suggestions in the code. Leaving this as a comment since this is not ready for review yet afaict.

abn · 2020-04-14T19:05:23Z

tests/console/commands/test_init.py

@@ -144,7 +144,7 @@ def test_interactive_with_git_dependencies(app, repo, mocker, poetry):
    command = app.find("init")
    command._pool = poetry.pool

-    mocker.patch("poetry.utils._compat.Path.open")
+    # mocker.patch("poetry.utils._compat.Path.open")


If you have not considered it yet, you could consider adding a test fixture to do this since this is reused. Something like this might be useful.

    @pytest.fixture
    def mocked_pyproject_toml(mocker):
        def side_effect(self, *args, **kwargs):
            if self.name == "pyproject.toml":
                return MagicMock()
            return self.open(*args, **kwargs)

        patched_open = mocker.patch("poetry.utils._compat.Path.open")
        patched_open.side_effect = side_effect
        yield patched_open

This was already added for another change at

poetry/tests/conftest.py

Lines 122 to 134 in 95e3490

    @pytest.fixture
    def mocked_open_files(mocker):
        files = []
        original = Path.open

        def mocked_open(self, *args, **kwargs):
            if self.name in {"pyproject.toml"}:
                return mocker.MagicMock()
            return original(self, *args, **kwargs)

        mocker.patch("poetry.utils._compat.Path.open", mocked_open)

        yield files

Could reuse this.

abn · 2020-04-14T19:07:34Z

tests/puzzle/test_provider.py

@@ -116,16 +116,16 @@ def test_search_for_vcs_read_setup_with_extras(provider, mocker):
    }


-def test_search_for_vcs_read_setup_raises_error_if_no_version(provider, mocker):
+def test_search_for_vcs_read_setup_with_dynamic_version(provider, mocker):


Should we consider retaining the old case as well since it is an expected failure?

abn · 2020-04-14T19:08:21Z

tests/utils/fixtures/setups/pendulum/setup.py

 ]

 package_data = {"": ["*"]}

 install_requires = ["python-dateutil>=2.6,<3.0", "pytzdata>=2018.3"]

-extras_require = {':python_version < "3.5"': ["typing>=3.6,<4.0"]}
+extras_require = {'typing:python_version < "3.5"': ["typing>=3.6,<4.0"]}


abn · 2020-04-14T19:09:59Z

tests/utils/test_setup_reader.py

@@ -134,28 +131,10 @@ def test_setup_reader_read_setup_kwargs(setup):
    assert expected_python_requires == result["python_requires"]


-@pytest.mark.skipif(not PY35, reason="AST parsing does not work for Python <3.4")
-def test_setup_reader_read_setup_call_in_main(setup):


Does this case still work as expected with this change? ie. a setup file with the call in main.

abn · 2020-04-14T19:10:58Z

poetry/utils/setup_reader.py

 except ImportError:
-    from ConfigParser import ConfigParser
+    from importlib_metadata import PathDistribution


This might be redundant?

abn · 2020-04-14T19:11:57Z

poetry/utils/setup_reader.py

-            not result["install_requires"]
-            and not result["extras_require"]
-            and not result["python_requires"]
+    def read_from_pep517_hook(cls, directory):


Fwiw, I think it might be great to retain the name for now, since we are reading from the given directory.

abn · 2020-04-14T19:13:34Z

poetry/utils/setup_reader.py

-            setup_call, body, "python_requires"
-        )
+        with pep517.envbuild.BuildEnvironment() as env, temporary_directory() as tmp_dir:
+            env.pip_install([cls.build_requires])


Does pep517 do something like python -m ensurepip when using pip commands?

abn · 2020-04-14T19:17:39Z

poetry/utils/setup_reader.py

+
+            if distribution.requires:
+                for record in distribution.requires:
+                    requirements = record.split(";", 1)


can we simply use poetry.core.packages.dependency_from_pep_508(record) here?
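For context, the lines under review split each Requires-Dist record on the first `;` to separate the requirement from its environment marker. A standalone sketch of that step, using a hypothetical helper `split_requirement` (this is not poetry's `dependency_from_pep_508`), could look like this:

```python
def split_requirement(record):
    """Split a Requires-Dist record into (requirement, marker or None)."""
    parts = record.split(";", 1)
    requirement = parts[0].strip()
    marker = parts[1].strip() if len(parts) > 1 else None
    return requirement, marker


with_marker = split_requirement('typing (<4.0,>=3.6) ; python_version < "3.5"')
without_marker = split_requirement("pytzdata (>=2018.3)")
```

A dedicated PEP 508 parser is likely the more robust choice, since naive splitting does not interpret extras, version constraints or the marker expression itself.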

abn · 2020-04-14T19:18:24Z

poetry/utils/setup_reader.py

+                    except IndexError:
+                        result["install_requires"].append(project_name)
+
+        egg_info = Path(directory).glob("*.egg-info")


We should not need this anymore correct? Since the metadata will be generated in the isolated environment?

abn · 2020-04-14T19:19:52Z

tests/utils/fixtures/setups/pendulum/setup.py

-    "pendulum.tz.data",
-    "pendulum.tz.zoneinfo",
-    "pendulum.utils",
+    # "pendulum._extensions",


Is this because we use a truncated source in the fixture?

sdispater · 2020-04-14T19:50:44Z

Thanks for the thorough and comprehensive work on this PR!

However, I think the purpose of the SetupReader is lost in this PR. The idea was to not execute the setup.py file to get the package metadata for security and consistency reasons (since a setup.py file is not necessarily deterministic).

But for projects using setuptools as a build backend getting the metadata will execute setup.py (see https://github.com/pypa/setuptools/blob/master/setuptools/build_meta.py#L158).

By going this route, we might end up losing performance (using the BuildEnvironment might be costly) and security for a marginal gain, namely more accurate metadata for non-deterministic setup.py files.

I am not too keen on executing anything just for the sake of getting metadata (even using the prepare_metadata_for_build_wheel does not make sense in a context of dependency resolution because it means downloading and using an external tool for something that should be static).

I know it's a tricky subject and not everyone agrees but, in my opinion, the build backend should only be used to build a wheel for installation and nothing else.

That being said I appreciate the work you put into it, don't get me wrong, and if someone can guarantee me that nothing will ever be executed by going this PR route, I am willing to do it. But for now I would prefer not to go this way, even if it follows the current packaging guidelines.

And, again, this is great work and I appreciate the time put into it.

And I am open for discussion :-)

abn · 2020-04-14T21:16:02Z

@sdispater the ast parsing solution can still be retained. In our current implementation, as I understand it, we do execute the setup.py file before the AST parsing anyway using python setup.py egg_info. From what I can tell, using pep517, we get a more rounded solution here that could potentially handle more cases than we already do.

sdispater · 2020-04-14T21:31:13Z

@abn This is true only for git and directory dependencies (since in this case we assume people know what they are doing). So if we remove the change from the Inspector class which is used by the LegacyRepository and PyPIRepository classes then I think it would be more acceptable.

The other thing that bothers me (even though I must admit it's already the case) is that it will build a new environment each time. A better solution would be to create a unique environment lazily that will be reused during the dependency resolution and cleaned up after.
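One way to picture the lazily created, reusable environment is the sketch below. It is only an illustration under simplifying assumptions: the class `LazyEnvironment` is hypothetical, and a temporary directory stands in for the expensive step of creating a virtual environment and installing a build backend.

```python
import shutil
import tempfile
from pathlib import Path


class LazyEnvironment:
    """Create a working directory on first use, reuse it, and clean up on exit."""

    def __init__(self):
        self._path = None
        self.creations = 0  # counts how often the expensive setup ran

    @property
    def path(self):
        if self._path is None:
            # Stand-in for the expensive step (creating a venv, installing a backend).
            self._path = Path(tempfile.mkdtemp(prefix="build-env-"))
            self.creations += 1
        return self._path

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only clean up if the environment was ever created.
        if self._path is not None:
            shutil.rmtree(str(self._path), ignore_errors=True)
            self._path = None


with LazyEnvironment() as env:
    first = env.path   # created here, on first access
    second = env.path  # reused, not recreated
    existed_during_resolution = first.exists()

removed_after_resolution = not first.exists()
```

If no package ever needs the environment during resolution, nothing is created and nothing has to be cleaned up.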

I know the AST parser is flaky (I wrote most of it 😅) but it works reasonably well so we should try to rely on it as much as possible.

abn · 2020-04-14T22:04:57Z

Ha, I agree that the AST parser is a decent fallback - we can obviously improve it as we go along. Considering we can cache this information for packages and that these would be the exception cases, I do not see the perf hit of spinning up an isolated environment as considerable or unmanageable (we can always improve this later). I guess there are different views on that one.

Fwiw, if we can get better metadata, I feel that this or a similar solution might be worth it. It will also mean we can support non poetry PEP-517 source packages too (the ones without the metadata). Also, this discussion is relevant for #2301 as well.

finswimmer · 2020-04-15T18:32:51Z

Hello @sdispater ,

I can understand your reservation about executing something to get the necessary metadata, and I would also prefer something static. That's why I started the discussion here.

The current result of the discussion is that using prepare_metadata_for_build_wheel is the intended way to get the data we need for dependency resolution. I would be happy if we could continue the discussion and convince more people that this is not the optimal way.

The biggest plus of my implementation, as I see it, is that we can support other packaging tools besides poetry and setuptools.

Maybe we can use prepare_metadata_for_build_wheel as a fallback for now, if ast parsing isn't enough?

@abn: Thanks a lot for your code comments. I will go through them later.

pradyunsg · 2020-04-15T21:07:16Z

I agree that using the AST parser, when the backend is setuptools, is a good idea. Arguably the setuptools backend should be doing these AST shenanigans itself, but that's not something under poetry's control, so it's definitely a good idea to use the AST parser for situations where it's pretty unambiguous what the results would be.

For non-setuptools backends though, I do think it'd be good to have PEP 517 support in poetry. :)

sdispater · 2020-06-01T12:42:37Z

So, I agree with @pradyunsg: If the backend is setuptools we use the AST parser (unless all the information we need is in the setup.cfg file). Note that I am pretty sure we can further improve the parser.

For other backends, we use the appropriate PEP-517 hooks. I don't think we can use the pep517 library, however: it uses sys.executable directly in a few places, meaning we can't use it with the environment system we have in place in Poetry.
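The decision described in this comment can be sketched as a small dispatch function. This is an illustration only, not code from poetry: the function name, the strategy labels and the `setup_cfg_is_complete` flag are assumptions. It takes the already parsed `[build-system]` table and treats a missing `build-backend` as setuptools, mirroring the PEP 517 fallback to legacy setup.py behaviour.

```python
SETUPTOOLS_LEGACY_BACKEND = "setuptools.build_meta:__legacy__"


def choose_metadata_strategy(build_system, setup_cfg_is_complete=False):
    """Return "setup.cfg", "ast" or "pep517" for a source tree.

    `build_system` is the parsed [build-system] table of pyproject.toml,
    or None when the file or table is missing.
    """
    backend = (build_system or {}).get("build-backend", SETUPTOOLS_LEGACY_BACKEND)
    # "pkg.module:object" -> top-level package name of the backend.
    top_level = backend.split(":", 1)[0].split(".", 1)[0]
    if top_level == "setuptools":
        # Static sources first: setup.cfg if it has everything, else the AST parser.
        return "setup.cfg" if setup_cfg_is_complete else "ast"
    # Any other backend: ask it through the PEP 517 hooks.
    return "pep517"


legacy = choose_metadata_strategy(None)
declarative = choose_metadata_strategy(
    {"build-backend": "setuptools.build_meta"}, setup_cfg_is_complete=True
)
other = choose_metadata_strategy({"build-backend": "flit_core.buildapi"})
```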

github-actions · 2024-03-01T07:07:39Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

finswimmer added 2 commits April 12, 2020 14:53

wip: use prepare_metadata_for_build_wheel instead of parsing setup.py

3af8f53

wip: remove obsolete imports

98215d7

finswimmer added 3 commits April 12, 2020 15:04

wip: do not specify where temp folder should be located

58dca07

wip: correct use of super() for python2 compatibility

ebfdc40

wip: convert Path to str to change folder

8ee5832

finswimmer added 3 commits April 12, 2020 19:47

wip: make use of the pep517 package

6fd0705

wip: different approach using p517

7a6eb7c

wip: convert directory Path to string

ec2cddb

finswimmer added 4 commits April 13, 2020 07:45

wip: fix tests

9064044

* code cleanup

8a44380

* fix tests

make black happy

0d482e8

convert Path to str when using rmtree

d70d3d0

finswimmer marked this pull request as ready for review April 14, 2020 09:53

finswimmer requested a review from a team April 14, 2020 09:54

finswimmer changed the title ~~[WIP]: use prepare_metadata_for_build_wheel instead of parsing setup.py~~ use prepare_metadata_for_build_wheel instead of parsing setup.py Apr 14, 2020

abn reviewed Apr 14, 2020

finswimmer mentioned this pull request Jun 1, 2020

Unable to parse "None" exception #2474

Closed

3 tasks

abn mentioned this pull request Jul 4, 2020

inspection: use pep517 metadata build #2632

Merged

2 tasks

abn closed this in #2632 Jul 24, 2020

github-actions bot locked as resolved and limited conversation to collaborators Mar 1, 2024
