Handling __FILE__ in code size tests in CircleCI #23195
Comments
This removes all absolute and relative file paths generated by the `__FILE__` macro from release builds, making release builds reproducible and the results of code size tests consistent. Without `-Wno-builtin-macro-redefined`, the build errors out:
```console
In file included from <built-in>:365:
<command line>:3:9: error: redefining builtin macro [-Werror,-Wbuiltin-macro-redefined]
    3 | #define __FILE__ ""
      |         ^
1 error generated.
```
A more detailed description of why this is necessary is in emscripten-core#23195.
I think the better approach is to try to make Alternatively, we should just have CI never use batch building?
Actually it's not just CI. Anyone who wants to run this test needs to have the same contents of libc++abi.a. Another approach then would be to somehow patch out or remove the usage of
I found some good info on this problem here: https://reproducible-builds.org/docs/build-path/ It sounds like
If batch builds are really the problem here, I'm kind of tempted to remove them, or make them opt-in only, since they cause me other headaches from time to time myself.
If we can switch batch builds to use absolute paths like the non-batch and Ninja builds, then we could use absolute paths everywhere, and the appropriate command-line flags to make those paths deterministic (or not, for local development debugging). The comment for why it uses relative paths is to avoid response files, but given that we need response file support anyway (because the command lines might be too long regardless), what is the real advantage of that?
Sure at the very least we could use absolute paths when I actually prefer absolute paths because one of the annoying things about the batch builds is that filenames in the error messages they produce are not open-able (because they are relative to some build directory, not the directory where I ran the command).
Actually I think if we just make the new values (the strings on the RHS of the '=') of lines 499 and 500 of emscripten/tools/system_libs.py (lines 495 to 501 in f615920) match, then the relative and absolute paths should end up matching in deterministic mode, and non-deterministic mode should be unchanged.
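To make the suggestion above concrete, here is a toy model of prefix remapping. It is only a sketch: `remap`, `remapped_pair`, and the checkout location `/home/user/emscripten` are hypothetical, and the "first matching prefix wins" rule is an assumption of this model, not a statement about how the compiler orders multiple mappings.

```python
# Toy model of compiler path-prefix remapping; not the real implementation.
def remap(path, prefix_map):
    """Replace the first matching old prefix with its new prefix."""
    for old, new in prefix_map:
        if path.startswith(old):
            return new + path[len(old):]
    return path


# Hypothetical checkout location and the two spellings of one source file.
ROOT = '/home/user/emscripten'
absolute = ROOT + '/system/lib/somefile.c'
relative = '../../system/lib/somefile.c'

# Mappings as described in the thread: the right-hand sides differ.
mismatched = [(ROOT, '/emsdk/emscripten'), ('../../', '')]
# Suggested change: both right-hand sides use the same fake root.
matched = [(ROOT, '/emsdk/emscripten'), ('../..', '/emsdk/emscripten')]


def remapped_pair(prefix_map):
    """Return the remapped (absolute, relative) spellings."""
    return remap(absolute, prefix_map), remap(relative, prefix_map)
```

With `mismatched`, the two spellings still remap to different strings; with `matched`, both collapse to one path under the fake root, which is what would make batch and Ninja builds agree.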
That would be a great solution! One downside is that the codesize tests will fail without
How about we use
Some of the CI uses batch building while others use Ninja, so removing batch building from CI does not fix the problem. The ones that use Ninja are the ones dependent on emscripten/.circleci/config.yml Lines 980 to 1021 in 56ee9dd
That's basically what Zig guys did (at least for the release mode): ziglang/zig@98a30ac
I think we are already using
We only need to fix the relative/absolute path thing, so I guess we don't need to remove the batch build itself unless it has other problems? But I'm not very familiar with that. It looks like the batch build started using relative paths in order not to switch to response files too often.
If we use absolute paths everywhere that's not gonna be deterministic because they are different in different CI bots (e.g. test-other and test-mac-arm64) and also different from our local build, no?
This can be a good solution, but this will still generate the "not openable" file paths @sbc100 mentioned. Would that be OK? Also you need to clear the cache and rebuild in deterministic mode every time you rebaseline code size tests.
BTW, some of these are never used. For example I just tried to make a test case using
But actually it ignores all its arguments in most configurations (including ours).
But for my local builds I want real debug paths in my debug info, but I still want to be able to pass all the code size tests, so I think I want just the
Alternatively we could just set
@sbc100 So where do you want the real local absolute paths to be? In debug info, or in macro
Also you added the option to opt out of the batch build due to the path problem: #20929
Oh I see what you mean. Yeah, adding
If I was going to care about real absolute paths it would only be in the debug info. However, given the number of times I actually use the debug info, I'm not sure it's really worth that much to me to have this.
I added that option because the batch builds were making it hard to read the error messages and build just one object at a time. It wasn't so much about the paths, although that is somewhat of an issue, yes.
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, the relative prefix leading to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefile.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 But this does not make relative paths and absolute paths the same, which can be a problem when data generated by the `__FILE__` macro is included in one of the code size tests. This problem is discussed in emscripten-core#23195. This PR makes the `__FILE__` macro produce the same data in all cases, so that it wouldn't change any results for code size tests. This is done by `-fmacro-prefix-map`. For the debug info, when `deterministic_paths` is set, this uses a fake path `/emsdk/emscripten` as a base emscripten path. When `deterministic_paths` is not set, this uses real local absolute paths in the debug info. This allows local developers to see their real paths in the debug info while continuing to use the same (fake) path `/emsdk/emscripten` we have used so far for the release binaries. Users can set their debug base path to whatever path they like, but given that we have used `/emsdk/emscripten` in release binaries for a while, it is possible that some users have set their configuration with this directory, so it would be better not to break them by changing it. This is basically implementing what's suggested in emscripten-core#23195 (comment) and emscripten-core#23195 (comment). This also turns `deterministic_paths` on for the Ninja path in embuilder for consistency with the non-Ninja path. Fixes #23195.
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, the relative prefix leading to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefile.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 But this does not make relative paths and absolute paths the same, which can be a problem when data generated by the `__FILE__` macro is included in one of the code size tests. This problem is discussed in emscripten-core#23195. This PR makes the `__FILE__` macro produce the same data in all cases by using the fake path `/emsdk/emscripten` as its base, so that it wouldn't change any results for code size tests. This is done by `-fmacro-prefix-map`. This differs from the current behavior because we don't handle relative and absolute paths differently. For the debug info, when `deterministic_paths` is set, this uses a fake path `/emsdk/emscripten` as a base emscripten path. When `deterministic_paths` is not set, this uses real local absolute paths in the debug info. This allows local developers to see their real paths in the debug info while continuing to use the same (fake) path `/emsdk/emscripten` we have used so far for the release binaries. Users can set their debug base path to whatever path they like, but given that we have used `/emsdk/emscripten` in release binaries for a while, it is possible that some users have set their configuration with this directory, so it would be better not to break them by changing it. This is done by `-ffile-prefix-map` as we have done so far, which is an alias for both `-fdebug-prefix-map` and `-fmacro-prefix-map`.
This is basically implementing what's suggested in emscripten-core#23195 (comment) and emscripten-core#23195 (comment). This also turns `deterministic_paths` on for the Ninja path in embuilder for consistency with the non-Ninja path. Fixes #23195.
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, the relative prefix leading to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefile.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths or relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set, and Ninja builds cannot use relative paths. This is also what was suggested in emscripten-core#23195 (comment) while discussing the `__FILE__` problem in #23195.
When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, the relative prefix leading to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefile.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths or relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set, and Ninja builds cannot use relative paths. This is also what was suggested in emscripten-core#23195 (comment) while discussing the `__FILE__` problem in #23195. This does not change any code size tests because no code size test happens to contain `__FILE__` at the moment. (One will be added in emscripten-core#22994)
#23222) When `deterministic_paths` is set, we are currently using `-ffile-prefix-map` to produce the same path in data and debug info. In the case of absolute paths, their emscripten path is replaced with a fake path `/emsdk/emscripten`, and in the case of relative paths, the relative prefix leading to the emscripten directory is removed, so `../../system/lib/somefile.c` becomes `system/lib/somefile.c`. https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L472-L477 https://github.com/emscripten-core/emscripten/blob/f66b5d706e174d9e5cc6122c06ea29dcd2735cd0/tools/system_libs.py#L495-L501 So this does not make relative paths and absolute paths the same, which can lead to different builds depending on whether the command line uses absolute paths or relative ones. Currently we use relative paths when `EMCC_BATCH_BUILD` is set, and Ninja builds cannot use relative paths. This is also what was suggested in #23195 (comment) while discussing the `__FILE__` problem in #23195. This does not change any code size tests because no code size test happens to contain `__FILE__` at the moment. (One will be added in #22994)
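One way to avoid the mismatch described in this commit message is to emit one mapping per spelling of the source root, both pointing at the same prefix. The following is a minimal sketch under stated assumptions: `prefix_map_flags` and its parameters are hypothetical names for illustration, not the code in `tools/system_libs.py`; only the `-ffile-prefix-map=<old>=<new>` flag shape and the `/emsdk/emscripten` prefix come from the text above.

```python
import posixpath

# Fake root named in the commit message above.
DETERMINISTIC_PREFIX = '/emsdk/emscripten'


def prefix_map_flags(source_dir, build_dir, prefix=DETERMINISTIC_PREFIX):
    """Return -ffile-prefix-map flags covering both spellings of source_dir.

    Hypothetical helper for illustration only.
    """
    # How the source root is spelled when referred to from the build directory.
    relative_source_dir = posixpath.relpath(source_dir, build_dir)
    return [
        f'-ffile-prefix-map={source_dir}={prefix}',
        f'-ffile-prefix-map={relative_source_dir}={prefix}',
    ]


# Example with a made-up checkout and build directory.
flags = prefix_map_flags('/home/user/emscripten',
                         '/home/user/emscripten/cache/build')
```

Because both right-hand sides are the same prefix, a file passed as an absolute path and the same file passed relative to the build directory would be expected to remap to the same string.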
This removes the `deterministic_paths` option by turning it on all the time, in order to produce reproducible builds, both for the `__FILE__` macro and for debug info paths. This is an alternative to emscripten-core#23212, which did not remove `deterministic_paths` but always turned on only `-fmacro-prefix-map`. This PR is what was suggested in emscripten-core#23195 (comment) and emscripten-core#23212 (comment). Fixes emscripten-core#23195.
This updates libcxx and libcxxabi to LLVM 19.1.4: https://github.com/llvm/llvm-project/releases/tag/llvmorg-19.1.4

The initial update was done using `update_libcxx.py` and `update_libcxxabi.py`, and subsequent fixes were made in individual commits. The commit history here is kind of messy because of CI testing, so not all individual commits are noteworthy.

Additional changes:
- Build libcxx and libcxxabi with C++23: 8b0bfdf
  https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/src/expected.cpp was added in llvm/llvm-project#87390, and this file assumes it is compiled as C++23. Apparently libc++ sources are always built with C++23, so they don't guard things against it in `src/`: llvm/llvm-project#87390 (comment). This commit also builds libc++abi with C++23 because there doesn't seem to be any downside to it.
- Exclude newly added `compiler_rt_shims.cpp`: 5bbcbf0
  We have excluded files in https://github.com/emscripten-core/emscripten/tree/main/system/lib/libcxx/src/support/win32. This is a new file added in this directory in llvm/llvm-project#83575.
- Disable time zone support: a5f2cbe
  We disabled C++20 time zone support in the LLVM 18 update (#21638): df9af64. The list of source files related to time zone support has changed in llvm/llvm-project#74928, so this commit reflects it.
- Re-add + update `__assertion_handler` from `default_assertion_handler.in`: 41f8037
  This file was added as a part of the LLVM 18 update (#21638) in 8d51927 and mistakenly deleted when I ran `update_libcxx.py`. This file was copied from https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxx/vendor/llvm/default_assertion_handler.in, so this also updates the file with the newest `default_assertion_handler.in`.
- `_LIBCPP_PSTL_CPU_BACKEND_SERIAL` -> `_LIBCPP_PSTL_BACKEND_SERIAL`: 4b969c3
  The name changed in this update, so this reflects it in our `__config_site`.
- Directly include `pthread.h` from `emscripten/val.h`: a5a76c3
  This was necessary due to file rearrangements.

---

Other things to note:
- `std::basic_string<unsigned char>` is not supported anymore
  The support for `std::basic_string<unsigned char>`, which was being used by embind, is removed in this version. #23070 removes the uses from embind.
- libcxxabi uses `__FILE__` in more places
  llvm/llvm-project#80689 started to use `_LIBCXXABI_ASSERT`, which [uses](https://github.com/llvm/llvm-project/blob/aadaa00de76ed0c4987b97450dd638f63a385bed/libcxxabi/src/abort_message.h#L22) `__FILE__`, in `private_typeinfo.cpp`. The `__FILE__` macro produces different paths depending on the local directory names and how the file is given to clang on the command line, and this file is included in the result of one of our code size tests, `other.test_minimal_runtime_code_size_hello_embind`, which caused the result of the test to vary depending on the CI bots and how the library is built (e.g., whether by embuilder, ninja, or neither). Even though this surfaced during this LLVM 19 update, the `__FILE__` macro could be a problem for producing reproducible builds anyway. We discussed this problem in #23195, and the fixes landed in #23222, #23225, and #23256.
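
To see why the spelling of the path alone can move a code size number, consider that `__FILE__` ends up as a string literal in the binary's data. The following is a rough back-of-the-envelope sketch: `embedded_size` is a hypothetical helper, both paths are made-up examples, and it ignores everything a real linker does (alignment, string merging, compression).

```python
def embedded_size(path):
    """Bytes a C string literal for `path` occupies, including the NUL."""
    return len(path.encode('utf-8')) + 1


# Hypothetical spellings of the same libc++abi source file.
relative = '../../system/lib/libcxxabi/src/private_typeinfo.cpp'
absolute = '/home/user/emscripten/system/lib/libcxxabi/src/private_typeinfo.cpp'

# Extra bytes the absolute spelling would contribute for one use of __FILE__.
size_delta = embedded_size(absolute) - embedded_size(relative)
```

Any nonzero `size_delta` is enough to make an exact-size expectation disagree between two otherwise identical builds.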
C++'s `__FILE__` macro expands to the file's path. Whether that's a relative path or an absolute one depends on what's given to the compiler. For example, `$ clang++ ../../test.cpp` will expand `test.cpp`'s `__FILE__` into `../../test.cpp`, whereas `$ clang++ ~/test.cpp` will expand it into `/whatever/absolute/path/test.cpp`.

In `system_libs.py`, we use both of them depending on the paths.
- Relative paths: the non-Ninja path (`build_objects`) and `batch_inputs` is true, which is the default: emscripten/tools/system_libs.py, lines 536 to 539 in f615920
- Absolute paths: the non-Ninja path (`build_objects`) and `batch_inputs` is false (Add switch to disable batching of inputs when building system libs #20929)
- Absolute paths: the Ninja path (`create_ninja_file`). I don't think we can use relative paths in Ninja files.

In CircleCI, tests that depend on `build-libs`, which is a part of `build-linux`, which uses Ninja, will be built with absolute paths when code includes `__FILE__`. This includes all core, other, and browser tests. Other tests (`test-mac-arm64`, `test-windows`, ...) don't use `build-libs`, so they use `build_objects` and thus will include relative paths.

So, this will produce different builds for different CircleCI tests. The `deterministic_paths` parameter, which uses `-ffile-prefix-map` in `system_libs.py`, makes all relative paths the same string and all absolute paths the same string, but does not make an absolute path and a relative path the same. So even if we make CircleCI always set `deterministic_paths`, the problem remains: emscripten/tools/system_libs.py, lines 495 to 501 in f615920

This problem was brought to attention because the new LLVM 19's libc++abi adds more usage of `__FILE__`, and one of them ended up in `other.test_minimal_runtime_code_size_hello_embind`, causing the code sizes to differ between different CircleCI tests. We can make this usage an empty string in release mode (which Zig did), but this does not fundamentally fix the problem that `__FILE__` can end up in other code size tests. Currently we seem to use `__FILE__` in several places in libraries:

Not sure what is the best way to proceed:
1. Make `__FILE__` an empty string in release mode via adding `-D__FILE__=""` in `system_libs.py`.
2. Make `__FILE__` an empty string when the environment variable `CIRCLECI` is set, which is set in all CircleCI tests. But then we have to make sure to clear the cache and set `CIRCLECI` when rebaselining code size tests on our local machine, which is a pain, like

I personally prefer 1, which can be as simple as #23196.
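
Option 1 could look roughly like the following. This is only a sketch: `release_only_flags` and its `is_debug` parameter are hypothetical names, not what #23196 actually does. The two flags themselves come from this thread: `-D__FILE__=""` from the option above, and `-Wno-builtin-macro-redefined` from the referenced commit, which reports that the build errors out without it.

```python
def release_only_flags(is_debug):
    """Extra compiler flags that blank out __FILE__ in release builds.

    Hypothetical helper; the name and the is_debug parameter are assumptions.
    """
    if is_debug:
        # Keep real paths so assertion messages stay useful while debugging.
        return []
    return [
        # Redefine the builtin macro to an empty string.
        '-D__FILE__=""',
        # Redefining a builtin macro triggers a warning that becomes an
        # error under -Werror, so silence that specific warning.
        '-Wno-builtin-macro-redefined',
    ]
```

A caller would append the result to the compile flags for a library, e.g. `cflags += release_only_flags(is_debug=False)`. The trade-off is that release-mode assertion messages lose their file names.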