Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

7.1.0rc1 crashes with Unrecoverable error while evaluating node REPOSITORY_DIRECTORY #21483

Closed
criemen opened this issue Feb 23, 2024 · 11 comments
Assignees
Labels
P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug

Comments

@criemen
Copy link
Contributor

criemen commented Feb 23, 2024

Description of the bug:

Buildling our project (working fine with 7.0.2) on Ubuntu 22.04 crashes with

FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.RuntimeException: Unrecoverable error while evaluating node 'REPOSITORY_DIRECTORY:@@rules_nodejs~~node~nodejs' (requested by nodes 'PACKAGE_LOOKUP:@@rules_nodejs~~node~nodejs//')
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:550)
        at com.google.devtools.build.lib.concurrent.AbstractQueueVisitor$WrappedRunnable.run(AbstractQueueVisitor.java:414)
        at java.base/java.util.concurrent.ForkJoinTask$RunnableExecuteAction.exec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.scan(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(Unknown Source)
Caused by: java.lang.NullPointerException
        at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:903)
        at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.fetchRepository(RepositoryDelegatorFunction.java:439)
        at com.google.devtools.build.lib.rules.repository.RepositoryDelegatorFunction.compute(RepositoryDelegatorFunction.java:205)
        at com.google.devtools.build.skyframe.AbstractParallelEvaluator$Evaluate.run(AbstractParallelEvaluator.java:461)
        ... 7 more
BAZEL FAILED

The error is reproducible.

Which category does this issue belong to?

Core

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I've not worked on getting a small reproducing example yet (our repo is unfortunately closed source).
This looks somewhat similar other crashes, though, so I wonder if this is necessary right now.

Which operating system are you running Bazel on?

Ubuntu 22.04

What is the output of bazel info release?

release 7.1.0rc1

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

No response

What's the output of git remote get-url origin; git rev-parse HEAD ?

private repository

Is this a regression? If yes, please try to identify the Bazel commit where the bug was introduced.

No response

Have you found anything relevant by searching the web?

No response

Any other information, logs, or outputs that you want to share?

No response

@github-actions github-actions bot added the team-Core Skyframe, bazel query, BEP, options parsing, bazelrc label Feb 23, 2024
@criemen criemen changed the title 7.1.0rc1 crashes with 7.1.0rc1 crashes with Unrecoverable error while evaluating node REPOSITORY_DIRECTORY Feb 23, 2024
@meisterT
Copy link
Member

meisterT commented Feb 23, 2024

This was last touched in https://github.com/bazelbuild/bazel/pull/21182/files

handler.fetch is annotated @Nullable and we do not check for null here...

@meisterT meisterT added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. and removed team-Core Skyframe, bazel query, BEP, options parsing, bazelrc labels Feb 23, 2024
@meisterT
Copy link
Member

@bazel-io fork 7.1.0

@meteorcloudy
Copy link
Member

@criemen Is it possible to construct a minimal reproducible case? It'll make it much easier to diagnose this problem.

@criemen
Copy link
Contributor Author

criemen commented Feb 29, 2024

@meteorcloudy here you go: https://github.com/criemen/bazel-710rc1-repro

Luckily, it was fairly easy to hit.

@meteorcloudy
Copy link
Member

meteorcloudy commented Feb 29, 2024

bazelisk --bisect=7.0.0..release-7.1.0 build --nobuild //:typescript gives 18ba449

@Wyverald
Copy link
Member

Wyverald commented Feb 29, 2024

Same bug on master, culprit is b1294e3. Investigating.

@Wyverald
Copy link
Member

This part of the rules_nodejs code triggers the bug: https://github.com/bazelbuild/rules_nodejs/blob/943800d8d601872dae53f5c1e17414d316834cac/nodejs/private/nodejs_repo_host_os_alias.bzl#L34-L38

This is because we try to watch the symlink target, but the target in this case actually points into where another repo (in this case @@nodejs_darwin_arm64) would be fetched. The problem is that that repo isn't actually defined (IOW, it doesn't exist); in the Bzlmod world, that repo's name would actually be @@rules_nodejs~~nodejs~nodejs_darwin_arm64.

I think there are two things that need to be fixed:

  1. The parameter rctx.symlink(watch_target=) shouldn't exist. The symlink target's existence and its contents should not affect the creation of the symlink at all, so there's no reason to ever watch it.
  2. Watching a path that corresponds to a nonexistent/undefined repo should either result in an explicit error, or it should be the same as watching any nonexistent file. An explicit error makes sense because this file will technically never exist (all bets are off if you go creating things in $outputBase/external manually...).

@Wyverald
Copy link
Member

Number 1 is P1 (that should solve the immediate bug); number 2 is P2 or P3 since nobody should explicitly call rctx.watch() on such a path.

@criemen
Copy link
Contributor Author

criemen commented Mar 1, 2024

The problem is that that repo isn't actually defined (IOW, it doesn't exist); in the Bzlmod world, that repo's name would actually be @@rules_nodejs~~nodejs~nodejs_darwin_arm64.

Would you mind filing a rules_nodejs bug for refering to the wrong repository with bzlmod enabled?

Wyverald added a commit that referenced this issue Mar 4, 2024
Watching a path in an undefined repo (eg. `$output_base/external/i_am_undefined/blah`) currently causes an NPE because we expect it to need a skyframe restart when it actually doesn't. The underlying reason is rather convoluted, but the gist is that we request `RepositoryDirectoryValue.Key(@@i_am_undefined)` explicitly, when we don't actually have to -- we can simply use `FileValue.Key($output_base/external/i_am_undefined/blah` directly, and rely on the fact that `FileFunction` knows to request the corresponding `RepositoryDirectoryValue` and behave appropriately when the repo is undefined. (This is due to logic in `ExternalFilesHelper.maybeHandleExternalFile`.)

Additionally, this PR removes the `rctx.symlink(watch_target=)` parameter. The symlink target's existence and its contents should not affect the creation of the symlink at all, so there's no reason to ever watch it.

Fixes #21483.
Wyverald added a commit that referenced this issue Mar 4, 2024
Watching a path in an undefined repo (eg. `$output_base/external/i_am_undefined/blah`) currently causes an NPE because we expect it to need a skyframe restart when it actually doesn't. The underlying reason is rather convoluted, but the gist is that we request `RepositoryDirectoryValue.Key(@@i_am_undefined)` explicitly, when we don't actually have to -- we can simply use `FileValue.Key($output_base/external/i_am_undefined/blah` directly, and rely on the fact that `FileFunction` knows to request the corresponding `RepositoryDirectoryValue` and behave appropriately when the repo is undefined. (This is due to logic in `ExternalFilesHelper.maybeHandleExternalFile`.)

Additionally, this PR removes the `rctx.symlink(watch_target=)` parameter. The symlink target's existence and its contents should not affect the creation of the symlink at all, so there's no reason to ever watch it.

Fixes #21483.
Wyverald added a commit that referenced this issue Mar 4, 2024
Watching a path in an undefined repo (eg. `$output_base/external/i_am_undefined/blah`) currently causes an NPE because we expect it to need a skyframe restart when it actually doesn't. The underlying reason is rather convoluted, but the gist is that we request `RepositoryDirectoryValue.Key(@@i_am_undefined)` explicitly, when we don't actually have to -- we can simply use `FileValue.Key($output_base/external/i_am_undefined/blah` directly, and rely on the fact that `FileFunction` knows to request the corresponding `RepositoryDirectoryValue` and behave appropriately when the repo is undefined. (This is due to logic in `ExternalFilesHelper.maybeHandleExternalFile`.)

Additionally, this PR removes the `rctx.symlink(watch_target=)` parameter. The symlink target's existence and its contents should not affect the creation of the symlink at all, so there's no reason to ever watch it.

Fixes #21483.
@Wyverald
Copy link
Member

Wyverald commented Mar 4, 2024

re point 2: I ended up making it the same as watching any other nonexistent path (i.e. not throwing) because that repo can technically become defined at some point (see unit test in #21562).

The problem is that that repo isn't actually defined (IOW, it doesn't exist); in the Bzlmod world, that repo's name would actually be @@rules_nodejs~~nodejs~nodejs_darwin_arm64.

Would you mind filing a rules_nodejs bug for refering to the wrong repository with bzlmod enabled?

Done: bazel-contrib/rules_nodejs#3720

bazel-io pushed a commit to bazel-io/bazel that referenced this issue Mar 5, 2024
Watching a path in an undefined repo (eg. `$output_base/external/i_am_undefined/blah`) currently causes an NPE because we expect it to need a skyframe restart when it actually doesn't. The underlying reason is rather convoluted, but the gist is that we request `RepositoryDirectoryValue.Key(@@i_am_undefined)` explicitly, when we don't actually have to -- we can simply use `FileValue.Key($output_base/external/i_am_undefined/blah` directly, and rely on the fact that `FileFunction` knows to request the corresponding `RepositoryDirectoryValue` and behave appropriately when the repo is undefined. (This is due to logic in `ExternalFilesHelper.maybeHandleExternalFile`.)

Additionally, this PR removes the `rctx.symlink(watch_target=)` parameter. The symlink target's existence and its contents should not affect the creation of the symlink at all, so there's no reason to ever watch it.

Fixes bazelbuild#21483.

Closes bazelbuild#21562.

PiperOrigin-RevId: 612898526
Change-Id: I30deb9d9b2d19e0869517f4b6b126078446745b4
github-merge-queue bot pushed a commit that referenced this issue Mar 5, 2024
Watching a path in an undefined repo (eg.
`$output_base/external/i_am_undefined/blah`) currently causes an NPE
because we expect it to need a skyframe restart when it actually
doesn't. The underlying reason is rather convoluted, but the gist is
that we request `RepositoryDirectoryValue.Key(@@i_am_undefined)`
explicitly, when we don't actually have to -- we can simply use
`FileValue.Key($output_base/external/i_am_undefined/blah` directly, and
rely on the fact that `FileFunction` knows to request the corresponding
`RepositoryDirectoryValue` and behave appropriately when the repo is
undefined. (This is due to logic in
`ExternalFilesHelper.maybeHandleExternalFile`.)

Additionally, this PR removes the `rctx.symlink(watch_target=)`
parameter. The symlink target's existence and its contents should not
affect the creation of the symlink at all, so there's no reason to ever
watch it.

Fixes #21483.

Closes #21562.

Commit
aba1186

PiperOrigin-RevId: 612898526
Change-Id: I30deb9d9b2d19e0869517f4b6b126078446745b4

Co-authored-by: Xdng Yng <wyverald@gmail.com>
@iancha1992
Copy link
Member

A fix for this issue has been included in Bazel 7.1.0 RC2. Please test out the release candidate and report any issues as soon as possible.
If you're using Bazelisk, you can point to the latest RC by setting USE_BAZEL_VERSION=7.1.0rc2. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P1 I'll work on this now. (Assignee required) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants