Skip to content

Commit

Permalink
Standardise on pip_parse (#807)
Browse files Browse the repository at this point in the history
  • Loading branch information
groodt authored Sep 24, 2022
1 parent a364fca commit f0efec5
Show file tree
Hide file tree
Showing 23 changed files with 139 additions and 455 deletions.
81 changes: 28 additions & 53 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,7 @@

This repository is the home of the core Python rules -- `py_library`,
`py_binary`, `py_test`, and related symbols that provide the basis for Python
support in Bazel. It also contains packaging rules for integrating with PyPI
(`pip`). Documentation lives in the
support in Bazel. It also contains package installation rules for integrating with PyPI and other package indices. Documentation lives in the
[`docs/`](https://github.com/bazelbuild/rules_python/tree/main/docs)
directory and in the
[Bazel Build Encyclopedia](https://docs.bazel.build/versions/master/be/python.html).
Expand All @@ -24,7 +23,7 @@ Once they are fully migrated to rules_python, they may evolve at a different
rate, but this repository will still follow
[semantic versioning](https://semver.org).

The packaging rules (`pip_install`, etc.) are less stable. We may make breaking
The package installation rules (`pip_install`, `pip_parse` etc.) are less stable. We may make breaking
changes as they evolve.

This repository is maintained by the Bazel community. Neither Google, nor the
Expand Down Expand Up @@ -101,70 +100,27 @@ py_binary(
)
```

## Using the packaging rules
## Using the package installation rules

Usage of the packaging rules involves two main steps.

1. [Installing `pip` dependencies](#installing-pip-dependencies)
2. [Consuming `pip` dependencies](#consuming-pip-dependencies)
1. [Installing third_party packages](#installing-third_party-packages)
2. [Using third_party packages as dependencies](#using-third_party-packages-as-dependencies)

The packaging rules create two kinds of repositories: A central external repo that holds
The package installation rules create two kinds of repositories: A central external repo that holds
downloaded wheel files, and individual external repos for each wheel's extracted
contents. Users only need to interact with the central external repo; the wheel repos
are essentially an implementation detail. The central external repo provides a
`WORKSPACE` macro to create the wheel repos, as well as a function, `requirement()`, for use in
`BUILD` files that translates a pip package name into the label of a `py_library`
target in the appropriate wheel repo.

### Installing `pip` dependencies
### Installing third_party packages

To add pip dependencies to your `WORKSPACE`, load the `pip_install` function, and call it to create the
central external repo and individual wheel external repos.


```python
load("@rules_python//python:pip.bzl", "pip_install")

# Create a central external repo, @my_deps, that contains Bazel targets for all the
# third-party packages specified in the requirements.txt file.
pip_install(
name = "my_deps",
requirements = "//path/to:requirements.txt",
)
```

Note that since `pip_install` is a repository rule and therefore executes pip at WORKSPACE-evaluation time, Bazel has no
information about the Python toolchain and cannot enforce that the interpreter
used to invoke pip matches the interpreter used to run `py_binary` targets. By
default, `pip_install` uses the system command `"python3"`. This can be overridden by passing the
`python_interpreter` attribute or `python_interpreter_target` attribute to `pip_install`.

You can have multiple `pip_install`s in the same workspace. This will create multiple external repos that have no relation to
one another, and may result in downloading the same wheels multiple times.

As with any repository rule, if you would like to ensure that `pip_install` is
re-executed in order to pick up a non-hermetic change to your environment (e.g.,
updating your system `python` interpreter), you can force it to re-execute by running
`bazel sync --only [pip_install name]`.

### Fetch `pip` dependencies lazily

One pain point with `pip_install` is the need to download all dependencies resolved by
your requirements.txt before the bazel analysis phase can start. For large python monorepos
this can take a long time, especially on slow connections.

`pip_parse` provides a solution to this problem. If you can provide a lock
file of all your python dependencies `pip_parse` will translate each requirement into its own external repository.
Bazel will only fetch/build wheels for the requirements in the subgraph of your build target.

There are API differences between `pip_parse` and `pip_install`:
1. `pip_parse` requires a fully resolved lock file of your python dependencies. You can generate this by using the `compile_pip_requirements` rule,
running `pip-compile` directly, or using virtualenv and `pip freeze`. `pip_parse` uses a label argument called `requirements_lock` instead of
`requirements` to make this distinction clear.
2. `pip_parse` translates your requirements into a starlark macro called `install_deps`. You must call this macro in your WORKSPACE to
declare your dependencies.


```python
load("@rules_python//python:pip.bzl", "pip_parse")

Expand All @@ -174,14 +130,33 @@ pip_parse(
name = "my_deps",
requirements_lock = "//path/to:requirements_lock.txt",
)

# Load the starlark macro which will define your dependencies.
load("@my_deps//:requirements.bzl", "install_deps")
# Call it to define repos for your requirements.
install_deps()
```

### Consuming `pip` dependencies
Note that since `pip_parse` is a repository rule and therefore executes pip at WORKSPACE-evaluation time, Bazel has no
information about the Python toolchain and cannot enforce that the interpreter
used to invoke pip matches the interpreter used to run `py_binary` targets. By
default, `pip_parse` uses the system command `"python3"`. This can be overridden by passing the
`python_interpreter` attribute or `python_interpreter_target` attribute to `pip_parse`.

You can have multiple `pip_parse`s in the same workspace. This will create multiple external repos that have no relation to
one another, and may result in downloading the same wheels multiple times.

As with any repository rule, if you would like to ensure that `pip_parse` is
re-executed in order to pick up a non-hermetic change to your environment (e.g.,
updating your system `python` interpreter), you can force it to re-execute by running
`bazel sync --only [pip_parse name]`.

Note: The `pip_install` rule is deprecated. `pip_parse` offers identical functionality and both `pip_install`
and `pip_parse` now have the same implementation. The name `pip_install` may be removed in a future version of the rules.
The maintainers have taken all reasonable efforts to faciliate a smooth transition, but some users of `pip_install` will
need to replace their existing `requirements.txt` with a fully resolved set of dependencies using a tool such as
`pip-tools` or the `compile_pip_requirements` repository rule.

### Using third_party packages as dependencies

Each extracted wheel repo contains a `py_library` target representing
the wheel's contents. There are two ways to access this library. The
Expand Down
67 changes: 9 additions & 58 deletions docs/pip.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

12 changes: 5 additions & 7 deletions docs/pip_repository.md

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 0 additions & 5 deletions examples/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -38,11 +38,6 @@ bazel_integration_test(
timeout = "long",
)

bazel_integration_test(
name = "relative_requirements_example",
timeout = "long",
)

bazel_integration_test(
name = "bzlmod_example",
)
6 changes: 3 additions & 3 deletions examples/pip_install/BUILD
Original file line number Diff line number Diff line change
Expand Up @@ -88,9 +88,9 @@ py_test(
genquery(
name = "yamllint_lib_by_version",
expression = """
attr("tags", "\\bpypi_version=1.26.3\\b", "@pip//pypi__yamllint")
attr("tags", "\\bpypi_version=1.26.3\\b", "@pip_yamllint//:pkg")
intersect
attr("tags", "\\bpypi_name=yamllint\\b", "@pip//pypi__yamllint")
attr("tags", "\\bpypi_name=yamllint\\b", "@pip_yamllint//:pkg")
""",
scope = [requirement("yamllint")],
)
Expand All @@ -99,7 +99,7 @@ write_file(
name = "write_expected",
out = "expected",
content = [
"@pip//pypi__yamllint:pypi__yamllint",
"@pip_yamllint//:pkg",
"",
],
)
Expand Down
5 changes: 5 additions & 0 deletions examples/pip_install/WORKSPACE
Original file line number Diff line number Diff line change
Expand Up @@ -57,6 +57,11 @@ pip_install(
requirements = "//:requirements.txt",
)

load("@pip//:requirements.bzl", "install_deps")

# Initialize repositories for all packages in requirements.txt.
install_deps()

# You could optionally use an in-build, compiled python interpreter as a toolchain,
# and also use it to execute pip.
#
Expand Down
Loading

0 comments on commit f0efec5

Please sign in to comment.