-
Notifications
You must be signed in to change notification settings - Fork 559
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add 'Design for a Python Toolchain' proposal #161
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks a lot for this work Jon, from the perspective of remote execution this all looks good to me. One comment though from the perspective of rule owners (and might be something to do in a later PR): For rules_docker we have a section related to how to use the toolchain rules properly from rules that extend rules_docker (https://github.com/bazelbuild/rules_docker/tree/master/toolchains/docker#overview).
In this case I think you need something similar as there are rules that execute the native py_binary (e.g., rules_docker does here: https://github.com/bazelbuild/rules_docker/blob/master/python/image.bzl#L84) and there might be something special that these type of users need to do to work with the new py toolchains properly. I'd be happy to work with you to try out the changes (whenever they are ready to try out) to see if we can use them effectively in rules_docker.
|
||
`PyRuntimeInfo` is a newly-exposed provider returned by the native [`py_runtime`](https://docs.bazel.build/versions/master/be/python.html#py_runtime) rule. It can be loaded from `@bazel_tools//tools/python:toolchain.bzl`. It has the following fields, each of which corresponds to an attribute on `py_runtime`. (The last one, `python_version`, is newly added in this doc.) | ||
|
||
* `interpreter_path`: An absolute filesystem path to a Python interpreter (or a script that launches a Python interpreter, forwarding any command-line arguments) available on the target platform. Must be `None` if `interpreter` is non-`None`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/ target platform / execution platform
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Python it's actually the target platform, because it's an interpreter executed at runtime as opposed to a compiler executed at build time. Bazel's "building" of a py_binary is really just laying out the runfiles and generating the stub script that'll be the entry point to its execution later.
(In the case where you're running a tool written in Python on the execution platform of another target, that second target's execution platform is the Python binary's "target" platform, so the description is still accurate.)
|
||
and thanks to toolchain resolution, the resulting executable will automatically know to use the interpreter located at `/usr/weirdpath/python3`. | ||
|
||
If we had not defined a custom toolchain, then we'd be stuck with `autodetecting_python_toolchain`, which would fail at execution time if `/usr/weirdpath` were not on `PATH`. (It would also be slightly slower since it requires an extra invocation of the interpreter at execution time to confirm its version.) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"slightly slower"
is this a per-target penalty or a once when configuring the toolchain penalty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a per-target penalty, and it's paid at execution time, not build time. So each time you run a py_binary
artifact (the stub script) built using this autodetecting toolchain, it will pay the cost of exec
-ing to the extra wrapper script (as opposed to directly to the interpreter), searching directories on PATH for the system interpreter, and verifying its version by invoking it with the -V
flag.
Fortunately Python does in fact have a -V
flag, so we don't need to actually start up the true interpreter just to print sys.version
. I didn't realize that when I wrote this parenthetical.
Originally I also had a host_python_toolchain
in this proposal, that would autodetect the Python binary just once at WORKSPACE evaluation time, but only for the host platform. I removed it at katre@'s suggestion since it didn't seem to add much value.
In any case, users should be defining explicit toolchains for their platforms, which will avoid any inefficiency implied by this default fallback.
This PR adds a design doc on making the native Python rules use the toolchain framework to find their runtimes. Follow bazelbuild/bazel#7375 for discussion and design review.
@katre Please do a quick validation-review that the doc is publishable for review.
Reviewer roles: