feat(bzlmod): introduce pypi_index for using bazel's downloader
This is a variant of bazelbuild#1625 and was inspired by bazelbuild#1788. In bazelbuild#1625, we
attempt to parse the simple API HTML files in the same `pip.parse`
extension and it brings the following challenges:

* The `pip.parse` cannot easily be used in `isolated` mode and it may
  be difficult to implement the isolation if bazelbuild/bazel#20186
  moves forward.
* Splitting the `pypi_index` out of the `pip.parse` allows us to accept
  the location of the parsed simple API artifacts encoded as a bazel
  label.
* Separation of the logic allows us to very easily implement usage of
  the downloader for cross-platform wheels.
* The `whl` `METADATA` might not be exposed through older versions of
  Artifactory, so having the complexity hidden in this single extension
  allows us to not increase the complexity and scope of `pip.parse` too
  much.
* The repository structure can be reused for `pypi_install` extension
  from bazelbuild#1728.
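
For illustration only, parsing a Simple API project page could look roughly
like the Python sketch below. This is not the code from `pypi_index.bzl` (the
real extension is Starlark and its implementation is not shown in this view).
The `SimpleIndexParser` name and the `filename`/`url`/`sha256` keys are
assumptions made for the example, and it assumes a PEP 503 style page where
each `<a>` links to one distribution with the hash in the URL fragment.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag


class SimpleIndexParser(HTMLParser):
    """Collects one entry per <a> element of a PEP 503 project page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.dists = []
        self._href = None  # href of the <a> element currently being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")

    def handle_data(self, data):
        if self._href is None:
            return  # text outside of an anchor, e.g. whitespace between links

        # The fragment is expected to look like "sha256=<hex digest>".
        url, fragment = urldefrag(self._href)
        algo, _, digest = fragment.partition("=")
        self.dists.append({
            "filename": data.strip(),
            "url": url,
            "sha256": digest if algo == "sha256" else "",
        })
        self._href = None

    def handle_endtag(self, tag):
        if tag == "a":
            self._href = None
```

Feeding the page text to `SimpleIndexParser().feed(...)` fills `dists` with one
dict per distribution, which is roughly the information the bazel downloader
needs (a URL and a checksum).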

TODO:
- [ ] Add unit tests for functions in `pypi_index.bzl` bzlmod extension if
  the design looks good.
- [ ] Changelog.

Out of scope of this PR:
- Further usage of the downloaded artifacts to implement something
  similar to bazelbuild#1625 or bazelbuild#1744. This needs bazelbuild#1750 and bazelbuild#1764.
- Making the lock file the same on all platforms - We would need
  to fully parse the requirements file.
- Support for different dependency versions in the `pip.parse` hub repos
  based on each platform - we would need to be able to interpret
  platform markers in some way, but `pypi_index` should be good already.
- Implementing the parsing of METADATA to detect dependency cycles.
- Support for `requirements` files that are not created via
  `pip-compile`.
- Support for other lock formats, though that would be reasonably
  trivial to add.

Open questions:
- Support for VCS dependencies in requirements files - We should
  probably handle them as `overrides` in the `pypi_index` extension and
  treat them in `pip.parse` just as an `sdist`, but I am not sure it
  would work without any issues.
aignas committed Mar 10, 2024
1 parent 3f40e98 commit 6720945
Showing 9 changed files with 550 additions and 18 deletions.
4 changes: 2 additions & 2 deletions .bazelrc
@@ -4,8 +4,8 @@
# (Note, we cannot use `common --deleted_packages` because the bazel version command doesn't support it)
# To update these lines, execute
# `bazel run @rules_bazel_integration_test//tools:update_deleted_packages`
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/dupe_requirements,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,gazelle,gazelle/manifest,gazelle/manifest/generate,gazelle/manifest/hasher,gazelle/manifest/test,gazelle/modules_mapping,gazelle/python,gazelle/pythonconfig,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/pip_repository_entry_points,tests/integration/py_cc_toolchain_registered
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/dupe_requirements,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,gazelle,gazelle/manifest,gazelle/manifest/generate,gazelle/manifest/hasher,gazelle/manifest/test,gazelle/modules_mapping,gazelle/python,gazelle/pythonconfig,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/pip_repository_entry_points,tests/integration/py_cc_toolchain_registered
build --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/dupe_requirements,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,gazelle,gazelle/manifest,gazelle/manifest/generate,gazelle/manifest/hasher,gazelle/manifest/test,gazelle/modules_mapping,gazelle/python,gazelle/pythonconfig,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/pip_repository_entry_points,tests/integration/py_cc_toolchain_registered
query --deleted_packages=examples/build_file_generation,examples/build_file_generation/random_number_generator,examples/bzlmod,examples/bzlmod/entry_points,examples/bzlmod/entry_points/tests,examples/bzlmod/libs/my_lib,examples/bzlmod/other_module,examples/bzlmod/other_module/other_module/pkg,examples/bzlmod/patches,examples/bzlmod/py_proto_library,examples/bzlmod/py_proto_library/example.com/another_proto,examples/bzlmod/py_proto_library/example.com/proto,examples/bzlmod/runfiles,examples/bzlmod/tests,examples/bzlmod/tests/dupe_requirements,examples/bzlmod/tests/other_module,examples/bzlmod/whl_mods,examples/bzlmod_build_file_generation,examples/bzlmod_build_file_generation/other_module/other_module/pkg,examples/bzlmod_build_file_generation/runfiles,examples/multi_python_versions/libs/my_lib,examples/multi_python_versions/requirements,examples/multi_python_versions/tests,examples/pip_parse,examples/pip_parse_vendored,examples/pip_repository_annotations,examples/py_proto_library,examples/py_proto_library/example.com/another_proto,examples/py_proto_library/example.com/proto,gazelle,gazelle/manifest,gazelle/manifest/generate,gazelle/manifest/hasher,gazelle/manifest/test,gazelle/modules_mapping,gazelle/python,gazelle/pythonconfig,tests/integration/compile_pip_requirements,tests/integration/compile_pip_requirements_test_from_external_repo,tests/integration/ignore_root_user_error,tests/integration/ignore_root_user_error/submodule,tests/integration/pip_parse,tests/integration/pip_parse/empty,tests/integration/pip_repository_entry_points,tests/integration/py_cc_toolchain_registered

test --test_output=errors

2 changes: 1 addition & 1 deletion .bazelversion
@@ -1 +1 @@
-7.0.0
+7.0.2
23 changes: 22 additions & 1 deletion MODULE.bazel
@@ -4,7 +4,7 @@ module(
     compatibility_level = 1,
 )

-bazel_dep(name = "bazel_features", version = "1.1.1")
+bazel_dep(name = "bazel_features", version = "1.9.0")
 bazel_dep(name = "bazel_skylib", version = "1.3.0")
 bazel_dep(name = "platforms", version = "0.0.4")

@@ -53,10 +53,31 @@ use_repo(python, "pythons_hub")
 # This call registers the Python toolchains.
 register_toolchains("@pythons_hub//:all")

+# This call registers the `pypi_index` extension so that it can be used in the `pip` extension
+pypi_index = use_extension("//python/extensions:pypi_index.bzl", "pypi_index")
+use_repo(pypi_index, "pypi_index")
+
 # ===== DEV ONLY DEPS AND SETUP BELOW HERE =====
 bazel_dep(name = "stardoc", version = "0.6.2", dev_dependency = True, repo_name = "io_bazel_stardoc")
 bazel_dep(name = "rules_bazel_integration_test", version = "0.20.0", dev_dependency = True)

+# This call additionally only adds items to the `pypi_index` if we are
+# not ignoring dev dependencies, making it no-op for the regular usage.
+dev_pypi_index = use_extension(
+    "//python/extensions:pypi_index.bzl",
+    "pypi_index",
+    dev_dependency = True,
+)
+dev_pypi_index.add_requirements(
+    srcs = [
+        # List all of the requirements files used by us
+        "//docs/sphinx:requirements.txt",
+        "//tools/publish:requirements_darwin.txt",
+        "//tools/publish:requirements.txt",
+        "//tools/publish:requirements_windows.txt",
+    ],
+)
+
 dev_pip = use_extension(
     "//python/extensions:pip.bzl",
     "pip",
24 changes: 24 additions & 0 deletions examples/bzlmod/MODULE.bazel
@@ -43,6 +43,30 @@ python.toolchain(
 # rules based on the `python_version` arg values.
 use_repo(python, "python_3_10", "python_3_9", "python_versions")

+# This extension allows rules_python to optimize downloading for packages by checking
+# for available artifacts on PyPI Simple API compatible mirrors.
+pypi_index = use_extension("@rules_python//python/extensions:pypi_index.bzl", "pypi_index")
+pypi_index.add_requirements(
+    srcs = [
+        "//:requirements_lock_3_10.txt",
+        "//:requirements_lock_3_9.txt",
+        "//:requirements_windows_3_10.txt",
+        "//:requirements_windows_3_9.txt",
+    ],
+)
+
+# We can also initialize the extension in dev mode.
+dev_pypi_index = use_extension(
+    "@rules_python//python/extensions:pypi_index.bzl",
+    "pypi_index",
+    dev_dependency = True,
+)
+dev_pypi_index.add_requirements(
+    srcs = [
+        "//tests/dupe_requirements:requirements.txt",
+    ],
+)
+
 # This extension allows a user to create modifications to how rules_python
 # creates different wheel repositories. Different attributes allow the user
 # to modify the BUILD file, and copy files.
19 changes: 19 additions & 0 deletions python/extensions/pypi_index.bzl
@@ -0,0 +1,19 @@
+# Copyright 2024 The Bazel Authors. All rights reserved.
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+"""See the doc in the implementation file."""
+
+load("//python/private/bzlmod:pypi_index.bzl", _pypi_index = "pypi_index")
+
+pypi_index = _pypi_index
37 changes: 26 additions & 11 deletions python/pip_install/pip_repository.bzl
@@ -766,18 +766,27 @@ def _whl_library_impl(rctx):
     # Manually construct the PYTHONPATH since we cannot use the toolchain here
     environment = _create_repository_execution_environment(rctx, python_interpreter)

-    repo_utils.execute_checked(
-        rctx,
-        op = "whl_library.ResolveRequirement({}, {})".format(rctx.attr.name, rctx.attr.requirement),
-        arguments = args,
-        environment = environment,
-        quiet = rctx.attr.quiet,
-        timeout = rctx.attr.timeout,
-    )
+    if rctx.attr.whl_file:
+        whl_path = rctx.path(rctx.attr.whl_file)
+        if not whl_path.exists:
+            fail("The given whl '{}' does not exist".format(rctx.attr.whl_file))
+
+        # Simulate the behaviour where the whl is present in the current directory.
+        rctx.symlink(whl_path, whl_path.basename)
+        whl_path = rctx.path(whl_path.basename)
+    else:
+        repo_utils.execute_checked(
+            rctx,
+            op = "whl_library.ResolveRequirement({}, {})".format(rctx.attr.name, rctx.attr.requirement),
+            arguments = args,
+            environment = environment,
+            quiet = rctx.attr.quiet,
+            timeout = rctx.attr.timeout,
+        )

-    whl_path = rctx.path(json.decode(rctx.read("whl_file.json"))["whl_file"])
-    if not rctx.delete("whl_file.json"):
-        fail("failed to delete the whl_file.json file")
+        whl_path = rctx.path(json.decode(rctx.read("whl_file.json"))["whl_file"])
+        if not rctx.delete("whl_file.json"):
+            fail("failed to delete the whl_file.json file")

     if rctx.attr.whl_patches:
         patches = {}
@@ -911,6 +920,12 @@ whl_library_attrs = {
         mandatory = True,
         doc = "Python requirement string describing the package to make available",
     ),
+    "whl_file": attr.label(
+        doc = """\
+The wheel file label to be used for this installation. This will not use pip to download the
+whl and instead use the supplied file. Note that the label needs to point to a single file.
+""",
+    ),
     "whl_patches": attr.label_keyed_string_dict(
         doc = """a label-keyed-string dict that has
             json.encode(struct([whl_file], patch_strip]) as values. This
9 changes: 6 additions & 3 deletions python/private/auth.bzl
@@ -33,10 +33,13 @@ def get_auth(rctx, urls):
     Returns:
         dict: A map of authentication parameters by URL.
     """
-    if rctx.attr.netrc:
-        netrc = read_netrc(rctx, rctx.attr.netrc)
+    attr = getattr(rctx, "attr", None)
+
+    if getattr(attr, "netrc", None):
+        netrc = read_netrc(rctx, getattr(attr, "netrc"))
     elif "NETRC" in rctx.os.environ:
         netrc = read_netrc(rctx, rctx.os.environ["NETRC"])
     else:
         netrc = read_user_netrc(rctx)
-    return use_netrc(netrc, urls, rctx.attr.auth_patterns)
+
+    return use_netrc(netrc, urls, getattr(attr, "auth_patterns", None))
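
The commit message does not explain the `getattr` change in `get_auth`. One
plausible reading is that the helper should also accept a context object that
has no `attr` field. The Python sketch below only mirrors the branch order of
the hunk above; `netrc_source` and the `SimpleNamespace` stand-ins are
hypothetical and are not part of rules_python.

```python
from types import SimpleNamespace


def netrc_source(ctx):
    """Returns which netrc source the branches above would pick (illustrative only)."""
    # `attr` may be missing entirely, so fall back to None instead of raising.
    attr = getattr(ctx, "attr", None)

    # getattr(None, "netrc", None) is simply None, so a missing `attr` is safe here.
    if getattr(attr, "netrc", None):
        return "attr"
    if "NETRC" in ctx.os.environ:
        return "env"
    return "user"


# A context with no `attr` at all falls through to the environment or user netrc.
bare_ctx = SimpleNamespace(os=SimpleNamespace(environ={"NETRC": "/tmp/netrc"}))
# A context that sets `attr.netrc` uses it first.
rule_ctx = SimpleNamespace(
    attr=SimpleNamespace(netrc="//:netrc"),
    os=SimpleNamespace(environ={}),
)
```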
51 changes: 51 additions & 0 deletions python/private/bzlmod/pip.bzl
@@ -101,6 +101,8 @@ You cannot use both the additive_build_content and additive_build_content_file a
 def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides):
     python_interpreter_target = pip_attr.python_interpreter_target

+    pypi_index_repo = module_ctx.path(pip_attr._pypi_index_repo).dirname
+
     # if we do not have the python_interpreter set in the attributes
     # we programmatically find it.
     hub_name = pip_attr.hub_name
@@ -180,10 +182,46 @@ def _create_whl_repos(module_ctx, pip_attr, whl_map, whl_overrides):
         group_name = whl_group_mapping.get(whl_name)
         group_deps = requirement_cycles.get(group_name, [])

+        pkg_pypi_index = pypi_index_repo.get_child(whl_name, "index.json")
+        if not pkg_pypi_index.exists:
+            # The wheel index for a package does not exist, so not using bazel downloader...
+            whl_file = None
+        else:
+            # Ensure that we have a wheel for a particular version.
+            # FIXME @aignas 2024-03-10: Maybe the index structure should be:
+            # pypi_index/<distro>/<version>:index.json?
+            #
+            # We expect the `requirement_line` to be of shape '<distro>==<version> ...'
+            _, _, version_tail = requirement_line.partition("==")
+            version, _, _ = version_tail.partition(" ")
+            version_segment = "-{}-".format(version.strip("\" "))
+
+            index_json = [struct(**v) for v in json.decode(module_ctx.read(pkg_pypi_index))]
+
+            # For now only use the whl_file if it is a cross-platform wheel.
+            # This is very conservative and checks that the only thing that we have
+            # in the whl list is the cross-platform wheel.
+            whls = [
+                dist
+                for dist in index_json
+                if dist.filename.endswith(".whl") and version_segment in dist.filename
+            ]
+            any_whls = [
+                dist
+                for dist in whls
+                if dist.filename.endswith("-none-any.whl") or dist.filename.endswith("-abi3-any.whl")
+            ]
+
+            if len(any_whls) == len(whls) and len(whls) == 1:
+                whl_file = any_whls[0].label
+            else:
+                whl_file = None
+
         repo_name = "{}_{}".format(pip_name, whl_name)
         whl_library(
             name = repo_name,
             requirement = requirement_line,
+            whl_file = whl_file,
             repo = pip_name,
             repo_prefix = pip_name + "_",
             annotation = annotation,
@@ -414,6 +452,19 @@ a corresponding `python.toolchain()` configured.
             doc = """\
 A dict of labels to wheel names that is typically generated by the whl_modifications.
 The labels are JSON config files describing the modifications.
+""",
+        ),
+        "_pypi_index_repo": attr.label(
+            default = "@pypi_index//:BUILD.bazel",
+            doc = """\
+The label to the root of the pypi_index repository to be used for this particular
+call of the `pip.parse`. This ensures that we can work with isolated usage of the
+pip.parse tag class, where the user may want to also have the `pypi_index` usage
+isolated as well.
+This also makes the code cleaner and ensures there are no cyclic dependencies.
+NOTE: For now this is internal and will be exposed if needed.
 """,
         ),
     }, **pip_repository_attrs)
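
The wheel selection added to `_create_whl_repos` in this file can be sketched
in plain Python as follows. This is a restatement of the Starlark logic shown
in the diff for experimentation, not code from the repository: `select_whl` is
a hypothetical name, the index entries are plain dicts rather than structs, and
the labels used below are made up.

```python
def select_whl(requirement_line, index):
    """Returns the label of the single cross-platform wheel, or None to fall back to pip."""
    # The requirement line is expected to look like '<distro>==<version> ...'.
    _, _, version_tail = requirement_line.partition("==")
    version, _, _ = version_tail.partition(" ")
    version_segment = "-{}-".format(version.strip("\" "))

    # All wheels published for the pinned version.
    whls = [
        dist
        for dist in index
        if dist["filename"].endswith(".whl") and version_segment in dist["filename"]
    ]
    # The subset whose filenames end in a platform-independent tag.
    any_whls = [
        dist
        for dist in whls
        if dist["filename"].endswith(("-none-any.whl", "-abi3-any.whl"))
    ]

    # Conservative: only use the wheel when it is the only one for this version.
    if len(any_whls) == len(whls) and len(whls) == 1:
        return any_whls[0]["label"]
    return None


index = [
    {"filename": "foo-1.0-py3-none-any.whl", "label": "@pypi_index//foo:foo-1.0-py3-none-any.whl"},
    {"filename": "foo-1.0.tar.gz", "label": "@pypi_index//foo:foo-1.0.tar.gz"},
]
label = select_whl("foo==1.0 --hash=sha256:abc", index)
```

As soon as a platform-specific wheel for the same version appears in the index,
the function returns `None` and `whl_library` keeps resolving the requirement
through pip.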