Bazel 5.3 fails to run external tests #16003

Open
AustinSchuh opened this issue Jul 29, 2022 · 12 comments
Labels: P2, team-Rules-CPP, type: bug

Comments

@AustinSchuh
Contributor

AustinSchuh commented Jul 29, 2022

Description of the bug:

Bazel fails to run C++ tests in external repositories on remote execution.

Running locally passes, even with linux-sandbox. Bazel 5.0 worked. This has broken bazel > 5.0 for us, and is blocking all upgrades.

What's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

bazel test -c opt --config=engflow --config=build_without_the_bytes @aos//aos:condition_test

Run a C++ test from an external repository on remote execution. (I can't give you a remote execution cluster.)

Which operating system are you running Bazel on?

Debian Bullseye

What is the output of bazel info release?

release 5.3.0-202207291633+f440f8ec3f

If bazel info release returns development version or (@non-git), tell us how you built Bazel.

#!/bin/bash

# This script builds a Debian package from a Bazel source tree with the
# correct version.
# The only argument is the path to a Bazel source tree.

set -e
set -u

BAZEL_SOURCE="$1"

VERSION="5.3.0-$(date +%Y%m%d%H%M)+$(GIT_DIR="${BAZEL_SOURCE}/.git" git rev-parse --short HEAD)"
OUTPUT="bazel_${VERSION}"

(
  cd "${BAZEL_SOURCE}"
  bazel build -c opt //src:bazel --embed_label="${VERSION}" --stamp=yes
)

cp "${BAZEL_SOURCE}/bazel-bin/src/bazel" "${OUTPUT}"

echo "Output is at ${OUTPUT}"

What's the output of git remote get-url origin; git rev-parse master; git rev-parse HEAD ?

austin[504] cmbr (release-5.3.0) ~/local/bazel
$ git remote get-url origin; git rev-parse master; git rev-parse HEAD
https://github.com/bazelbuild/bazel
master
fatal: ambiguous argument 'master': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
f440f8ec3f63e5d663e1f9d9614f05a39422102a

Have you found anything relevant by searching the web?

Reverting 0080572 fixes it. #12821 suggested that I add linkstatic to every C++ test we want to run, which would mean carrying thousands of lines of diffs against upstream, since no reasonable upstream would accept that patch.

Any other information, logs, or outputs that you want to share?

bazel test -c opt --config=remote_execution --config=build_without_the_bytes @aos//aos:condition_test
INFO: Invocation ID: d91e1b30-a290-4282-99c6-029e4102ed35
INFO: Analyzed target @aos//aos:condition_test (86 packages loaded, 20904 targets configured).
INFO: Found 1 test target...
FAIL: @aos//aos:condition_test (see /home/austin/.cache/bazel/_bazel_austin/f5b123ff4a0d503fd09f4c4da36644a6/execroot/repo/bazel-out/k8-opt/testlogs/external/aos/aos/condition_test/test.log)
INFO: From Testing @aos//aos:condition_test:
==================== Test output for @aos//aos:condition_test:
/var/lib/worker/work/1/exec/bazel-out/k8-opt/bin/external/aos/aos/condition_test.runfiles/repo/../aos/aos/condition_test: error while loading shared libraries: libexternal_Saos_Saos_Slibcondition.so: cannot open shared object file: No such file or directory
================================================================================
Target @aos//aos:condition_test up-to-date:
  bazel-bin/external/aos/aos/condition_test
INFO: Elapsed time: 166.038s, Critical Path: 153.48s
INFO: 52 processes: 50 remote cache hit, 2 remote.
@aos//aos:condition_test                                                 FAILED in 127.2s
  /home/austin/.cache/bazel/_bazel_austin/f5b123ff4a0d503fd09f4c4da36644a6/execroot/repo/bazel-out/k8-opt/testlogs/external/aos/aos/condition_test/test.log

Executed 1 out of 1 test: 1 fails remotely.
INFO: Build completed, 1 test FAILED, 52 total actions

And to prove I'm not crazy:

bazel test -c opt -k @aos//aos:condition_test
INFO: Build options --experimental_inmemory_dotd_files, --experimental_inmemory_jdeps_files, --extra_execution_platforms, and 2 more have changed, discarding analysis cache.
INFO: Analyzed target @aos//aos:condition_test (0 packages loaded, 20897 targets configured).
INFO: Found 1 test target...
Target @aos//aos:condition_test up-to-date:
  bazel-bin/external/aos/aos/condition_test
INFO: Elapsed time: 12.246s, Critical Path: 10.91s
INFO: 120 processes: 42 internal, 78 linux-sandbox.
INFO: Build completed successfully, 120 total actions
@aos//aos:condition_test                                                 PASSED in 2.3s

Executed 1 out of 1 test: 1 test passes.
INFO: Build completed successfully, 120 total actions
@fmeum
Collaborator

fmeum commented Jul 30, 2022

Does #14600 fix this issue?

@AustinSchuh
Contributor Author

Aw, I was hopeful. I fixed some merge conflicts (looks like "../".replace() -> Strings.replace("../", ...)) and I still get the same failure.

==================== Test output for @aos//aos:condition_test:
/var/lib/worker/work/3/exec/bazel-out/k8-opt/bin/external/aos/aos/condition_test.runfiles/repo/../aos/aos/condition_test: error while loading shared libraries: libexternal_Saos_Saos_Slibcondition.so: cannot open shared object file: No such file or directory

@fmeum
Collaborator

fmeum commented Jul 30, 2022

Could you check whether #16008 fixes the issue? It includes an integration test that I distilled from your reproducer.

@sgowroji added the type: bug, untriaged, and team-Remote-Exec labels Aug 1, 2022
@AustinSchuh
Contributor Author

AustinSchuh commented Aug 1, 2022

It does! Great work, thanks for the prompt response and prompt resolution. I really appreciate it.

FYI, there's the same merge conflict with Strings.replace. Not a hard thing to fix though.

@coeuvre assigned oquenchil and unassigned coeuvre Aug 2, 2022
@coeuvre added the team-Rules-CPP and P2 labels and removed the team-Remote-Exec and untriaged labels Aug 2, 2022
@coeuvre
Member

coeuvre commented Aug 2, 2022

Assigning to @oquenchil since the linked fix is about cc rules.

fmeum added a commit to fmeum/bazel that referenced this issue Aug 10, 2022
The solib directory is located within the subdirectory of the runfiles
directory corresponding to the workspace. Thus, if a binary is contained
in an external repository, its $ORIGIN relative rpath has to first
ascend to the runfiles directory and then descend into the workspace
directory.

Fixes bazelbuild#16003

Closes bazelbuild#16008.

PiperOrigin-RevId: 466634083
Change-Id: I4ada28b459f23f68a2091dbaad9147cfec2fbe43
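
For illustration only, here is a minimal sketch of the path arithmetic the commit message above describes: ascend from the binary's directory to the runfiles root, then descend into the main workspace's solib directory. It is not Bazel's implementation. The function name origin_relative_rpath is hypothetical, and the solib directory name _solib_k8 is an assumption that does not appear in this thread. The workspace name repo and the runfiles path aos/aos/condition_test are taken from the failing log earlier in this issue. The sketch also assumes the binary's runfiles path always starts with a repository directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of Bazel: prints an $ORIGIN-relative rpath
# for a binary located at a path relative to the runfiles root.
#   $1: binary path relative to the runfiles root, e.g. aos/aos/condition_test
#   $2: name of the main workspace directory inside runfiles, e.g. repo
#   $3: solib directory name (assumed default: _solib_k8)
# The body runs in a subshell so the IFS change and variables stay local.
origin_relative_rpath() (
  binary_path="$1"
  workspace="$2"
  solib_dir="${3:-_solib_k8}"
  binary_dir="$(dirname "${binary_path}")"

  # Ascend: one "../" per component of the binary's directory, which
  # brings us back to the runfiles root.
  up=""
  IFS='/'
  for component in ${binary_dir}; do
    up="${up}../"
  done

  # Descend: into the main workspace directory, where the solib directory lives.
  printf '%s\n' "\$ORIGIN/${up}${workspace}/${solib_dir}"
)

# The external test from this issue: two levels up, then into repo/.
origin_relative_rpath "aos/aos/condition_test" "repo"
# prints: $ORIGIN/../../repo/_solib_k8
```

An rpath that only ascends from the binary's directory never reaches the repo directory when the binary lives under a different top-level runfiles directory such as aos, which is consistent with the "cannot open shared object file" error reported above.
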
ShreeM01 pushed a commit that referenced this issue Aug 10, 2022
@AustinSchuh
Contributor Author

Aw, @fmeum, looks like this fixed almost all of the tests except one, which expects $(location) and RPATH to agree on the path to the shared libraries. I filed #16108 for it since it smells different enough to be a new bug.

@Wyverald
Member

@bazel-io fork 5.3.1

@Wyverald
Member

The fix introduced a regression (see #16008 (comment)), which we will probably need a patch release for.

@philsc
Contributor

philsc commented Sep 16, 2022

Seeing a very similar problem when upgrading from 5.2.0 to 5.3.0:

error while loading shared libraries: libexternal_Sopenvkl_Slibispc_Uutil_Uispc.so: cannot open shared object file: No such file or directory

But trying out 5.3.1rc2, it looks like that particular problem is resolved for us.

@sgowroji
Member

Hello @AustinSchuh, are you still seeing this issue with release 5.3.1?

@AustinSchuh
Contributor Author

5.3.1 works for me. I'm hitting #16108 but that feels separate.

@ShreeM01
Contributor

Thanks for the update @AustinSchuh!
