Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolve Labels for toolchain roots and sysroots correctly under bzlmod #235

Merged

Conversation

rrbutani
Copy link
Collaborator

@rrbutani rrbutani commented Oct 22, 2023

Currently the repo rule and tag class accept string-form labels for toolchain root packages and sysroots. Under bzlmod this is problematic because users may pass us labels that point at repos that are not in this module's repo mapping. To support such labels, they need to be passed to us as actual Labels (not strings).

This necessitates some (breaking) changes to interface for the repo rule and tag class.


For toolchain_roots:

  • we want to allow 1:many mappings from system specification to toolchain root Label so a Label -> string attribute (i.e. label_keyed_string_dict) wouldn't work
  • we also still want to support absolute paths so even if we compromised on the 1:many mappings bit, flipping the map wouldn't work

Adding a level of indirection (toolchain_roots_label_map) was the least-worst solution I could come up with:

    toolchain_roots = {
        "": "previous_llvm_toolchain",
    },
    # Note: this map is "backwards" because there is no `string_keyed_label_dict`...
    toolchain_roots_label_map = {
        "@llvm_toolchain_llvm": "previous_llvm_toolchain",
    },

While this is a breaking change, it shouldn't be too bad though; there's no potential for silent errors and the error message provides a good hint about what change needs to be made I think.


I've applied a similar change to sysroot:

    sysroot = {
        "linux-x86_64": "chromium_x64_sysroot",
    },
    sysroot_label_map = {
        "@org_chromium_sysroot_linux_x64//:sysroot": "chromium_x64_sysroot",
    },

For sysroot I considered just flipping the map since I think forcing a 1:1 mapping would be more acceptable but:

  • this is a potentially more confusing breaking change
  • this would force us to drop support for absolute paths to a sysroot

I've also enabled bzlmod-enabled tests for the system paths, absolute paths, and cross tests in CI.

Fixes #234. cc: @steve-261370

@rrbutani rrbutani force-pushed the fix/labels-in-toolchain-roots branch 2 times, most recently from 3327e60 to 35c601b Compare October 22, 2023 20:05
@rrbutani rrbutani marked this pull request as draft October 22, 2023 20:29
@rrbutani rrbutani marked this pull request as ready for review October 22, 2023 20:44
@dzbarsky
Copy link
Contributor

Potentially-silly question - why do we need to support LLVM at absolute paths? Can't the user have a new_local_repository rule to bring the system LLVM into scope? I think that would simplify the approach here, no?

@siddharthab
Copy link
Contributor

Not silly at all. This was for avoiding the sandbox overhead when using llvm with an http_archive rule. So relative paths to the extracted repo become absolute paths, and the files don't have to be brought into the sandbox. When this project started, the sandboxing overhead in Bazel was significant. I don't know how it looks like now.

@dzbarsky
Copy link
Contributor

That makes sense. Nowadays --reuse_sandbox_directories (I believe it will be turned on by default in Bazel7) mitigates it - if you have a bunch of actions with the same inputs, you don't need to create the sandbox from scratch repeatedly. In particular, given how new bzlmod is, I'm not sure anyone will be trying to use bzlmod without a new Bazel. Should we try to remove the absolute paths and see if anyone complains?

@siddharthab
Copy link
Contributor

Might be OK to do it. Do you mean to remove it only for bzlmod or for both? I have not been keeping up with the ecosystem (my active involvement with Bazel was 4-5 years ago), so I don't have a strong opinion either way. You can decide between @rrbutani and yourself.

@rrbutani
Copy link
Collaborator Author

rrbutani commented Oct 28, 2023

I don't have a strong opinion about whether to support absolute paths; I don't personally use the feature and I agree that dropping support for it would be a nice simplification.

However, I think this is somewhat orthogonal to this PR; besides absolute paths the other reason this PR adds the (slightly weird) inverted toolchain root and sysroot label maps is to allow users to route multiple os/arch keys to the same Label; i.e.:

toolchain_roots = {
  "ubuntu-20.04-linux-x86_64": "@llvm_toolchain_root",
  "arch-linux-x86_64": "@llvm_toolchain_root",
},

If we were to make toolchain_roots a label_keyed_string_dict this isn't expressible directly.

Assuming we want the above to be expressible1, we have some options:

  1. Indirection: another attribute for the labels, as implemented in this PR.
  2. Use a label_keyed_string_dict for toolchain_roots and sysroot and introduce some kind of ad-hoc way to put multiple keys into one string; i.e.:
    sysroot = {
      "@linux_sysroot//:sysroot": json.encode(["arch-linux-aarch64", "pop-22.04-linux-aarch64"]),
    }
  • or maybe just comma separated strings
  • not the best UX (for the repo rule we can paper over this interface discrepancy with a macro that stringifies a list for you but for bzlmod I don't know of a workaround; we also can't use json.encode(...) in MODULE.bazel) but it works
  1. Use a label_keyed_string_dict, have users that want to map multiple things to the same label make their own aliases; i.e.:
     toolchain_roots = {
       "@llvm_toolchain_root//:alias_1": "ubuntu-20.04-linux-x86_64",
       "@llvm_toolchain_root//:alias_2": "arch-linux-x86_64",
     },
  • a mild annoyance but perhaps acceptable
  • I'm not sure we'd want to encourage users to exploit this but:
    • for repository rules, Labels passed in as attributes are not validated beyond their repo name (must be in the mapping) and their package path (directory must exist; IIRC it doesn't actually even need to contain a BUILD.bazel file but I may be misremembering)
    • this means that users can pass in Labels with fictitious names to get around not being able to have duplicate keys without running into any errors (note that for toolchain root labels we currently drop the name part of the label anyways; we just want a path to the repo + package)
    • maybe ^ makes this approach ("aliases") more acceptable?

I have no real opinion about which of these approaches to go with or whether we should support 1:many mappings at all. I went the indirection route because it preserves the existing functionality and because I think it's a less-worse breaking change (it'll at least error in a way that makes it obvious what attrs you need to change) but ultimately I was just trying to get the tests to pass under bzlmod.

Footnotes

  1. I don't feel strongly about this functionality either. If most users aren't specifying keys with distros in them (just os + arch) it seems conceivable that the mappings are actually 1:1 in practice.

@dzbarsky
Copy link
Contributor

Fair enough, I can send PRs removing the absolute paths separately. I'm not sure if people use distro keys, we do not. But we do use a universal sysroot for OSX somit covers two arches. I wouldn't mind getting by with the alias workaround though.

@rrbutani
Copy link
Collaborator Author

rrbutani commented Nov 5, 2023

@siddharthab @jsharpe: Do you have opinions about whether to keep the 1-to many functionality/whether the breaking change in this PR as implemented is acceptable?

@siddharthab
Copy link
Contributor

I have not had time to look into this properly. Honestly, I have not even looked into what bzlmod implies. Will do this tomorrow. Thanks for all your help @rrbutani, very appreciated.

@siddharthab
Copy link
Contributor

Will do this tomorrow.

Sorry, running behind with this, but still on my mind. Will try to get to this sometime this week.

@rrbutani
Copy link
Collaborator Author

rrbutani commented Nov 7, 2023

No rush; I was just going through my open PRs. It doesn't seem like this is affecting anyone's work flow (#234 was just about the test suite not passing).

@siddharthab
Copy link
Contributor

Thanks. I had a chance to look today. I prefer a comma-separated string list as the simplest and most concise option here. I do not anticipate commas being used as part of distro names. If commas can indeed be used, maybe some other separator, or perhaps your json encoding option. If we choose commas now and later realize it was a mistake, we can introduce an escape mechanism (small chance).

I think that Bazel never introduced more value types other than a string in label_keyed_string_dict, is because they expected a string to encode arbitrary information.

@rrbutani
Copy link
Collaborator Author

rrbutani commented Nov 9, 2023

I prefer a comma-separated string list as the simplest and most concise option here.

👍 Sounds good to me; I'll update this PR to match (probably not before this weekend though).

I think that Bazel never introduced more value types other than a string in label_keyed_string_dict, is because they expected a string to encode arbitrary information.

I think this has more to do with the pain involved with adding new attribute types to Bazel. If you're going to only offer one dict-label attribute type label_keyed_string_dict is definitely a reasonable choice but it'd be nice to have a more general mechanism (use cases for Dict[string, Label] seem to come up a bunch, for example) — I'm definitely envious of buck2's composable attribute types 😛.

@mering
Copy link
Contributor

mering commented Jan 3, 2024

Thanks for your great work so far! I tried using this branch for our migration to bzlmod and it seems to work fine so far.

Has there been any progress towards getting this merged in the past months? No hurries at all, I just would like to learn about your plans.

MODULE.bazel Outdated Show resolved Hide resolved
@rrbutani
Copy link
Collaborator Author

rrbutani commented Jan 3, 2024

Has there been any progress towards getting this merged in the past months? No hurries at all, I just would like to learn about your plans.

Apologies, I dropped the ball on this.

I still need to implement the interface changes that were requested. I should have time to do so this coming weekend.

@fmeum
Copy link
Member

fmeum commented Jan 11, 2024

Instead of going through the additional indirection, this could also be solved by adding a dedicated sysroot (and toolchain_root) tag. That could look something like this:

llvm = use_extension("@toolchains_llvm//toolchain/extensions:llvm.bzl", "llvm")
llvm.toolchain(llvm_version = "16.0.4")
llvm.sysroot(target = "linux-x86_64", sysroot_label = "@my_sysroot//:sysroot")
llvm.sysroot(target = "macos", sysroot_path = "/usr/lib/Xcode")

All these tags could have optional name attributes, just like toolchain.

toolchain/internal/repo.bzl Outdated Show resolved Hide resolved
@aran
Copy link

aran commented Feb 29, 2024

Is there any plan to get this landed and issuing a major version bump?

@siddharthab
Copy link
Contributor

I will wrap up this PR in the next few days.

@siddharthab siddharthab force-pushed the fix/labels-in-toolchain-roots branch from f5036c4 to 9127e1d Compare March 13, 2024 04:10
@siddharthab siddharthab merged commit 512a360 into bazel-contrib:master Mar 13, 2024
33 of 35 checks passed
@siddharthab
Copy link
Contributor

Thanks @rrbutani for starting work on this and all the discussion.
Thanks @fmeum for the ideas. I chose both of your suggestions.

@rrbutani
Copy link
Collaborator Author

@siddharthab sorry for dropping the ball on this; thanks for finishing it up

@mering mering mentioned this pull request Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bzlmod sysroot example test not working
7 participants