Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nondeterminism encountered in metadata for doc_link_resolutions #124357

Closed
tmandry opened this issue Apr 25, 2024 · 8 comments
Closed

Nondeterminism encountered in metadata for doc_link_resolutions #124357

tmandry opened this issue Apr 25, 2024 · 8 comments
Labels
A-metadata Area: Crate metadata A-reproducibility Area: Reproducible / deterministic builds T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@tmandry
Copy link
Member

tmandry commented Apr 25, 2024

Fuchsia has a test that checks whether rustc output is deterministic, and we find that it fairly reliably reorders bits of the metadata output for doc_link_resolutions for one particular crate in our build.

I determined this by picking out some changes of d5bd169 to add RUSTC_FILE_ENCODER_PANIC_AT_OFFSET and examining the backtrace at an offset that differs in one failure in the hexdump below. I also grepped for the strings that appear in all of the differences.

  • Every string that appears in a diff also appears in a doc link of a doc comment of the form [`std::ops::Deref`].
  • Only the metadata differs, no object files differ.
  • The input files are the same between invocations in a single failure.
  • We haven't ruled out other sources of nondeterminism, like proc macros.
  • Across multiple failures, the difference has been showing up around the same content, but the diff is sometimes slightly different (another example). This could also be because of changes in the source between failures.
   4: <rustc_serialize::opaque::FileEncoder>::write_all
             at /code/rust/compiler/rustc_serialize/src/opaque.rs:162:13
   5: <rustc_serialize::opaque::FileEncoder as rustc_serialize::serialize::Encoder>::emit_raw_bytes
             at /code/rust/compiler/rustc_serialize/src/opaque.rs:279:9
   6: <rustc_metadata::rmeta::encoder::EncodeContext as rustc_serialize::serialize::Encoder>::emit_raw_bytes
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:79:25
   7: <rustc_metadata::rmeta::encoder::EncodeContext as rustc_serialize::serialize::Encoder>::emit_str
             at /code/rust/compiler/rustc_serialize/src/serialize.rs:69:9
   8: <rustc_metadata::rmeta::encoder::EncodeContext as rustc_span::SpanEncoder>::encode_symbol
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:211:26
   9: <rustc_span::symbol::Symbol as rustc_serialize::serialize::Encodable<rustc_metadata::rmeta::encoder::EncodeContext>>::encode
             at /code/rust/compiler/rustc_span/src/lib.rs:1185:9
  10: <(rustc_span::symbol::Symbol, rustc_hir::def::Namespace) as rustc_serialize::serialize::Encodable<rustc_metadata::rmeta::encoder::EncodeContext>>::encode
             at /code/rust/compiler/rustc_serialize/src/serialize.rs:424:19
  11: <std::collections::hash::map::HashMap<(rustc_span::symbol::Symbol, rustc_hir::def::Namespace), core::option::Option<rustc_hir::def::Res<rustc_ast::node_id::NodeId>>, core::hash::BuildHasherDefault<rustc_hash::FxHasher>> as rustc_serialize::serialize::Encodable<rustc_metadata::rmeta::encoder::EncodeContext>>::encode
             at /code/rust/compiler/rustc_serialize/src/serialize.rs:598:13
  12: <rustc_data_structures::unord::UnordMap<(rustc_span::symbol::Symbol, rustc_hir::def::Namespace), core::option::Option<rustc_hir::def::Res<rustc_ast::node_id::NodeId>>> as rustc_serialize::serialize::Encodable<rustc_metadata::rmeta::encoder::EncodeContext>>::encode
             at /code/rust/compiler/rustc_data_structures/src/unord.rs:420:39
  13: <rustc_metadata::rmeta::encoder::EncodeContext>::lazy::<rustc_data_structures::unord::UnordMap<(rustc_span::symbol::Symbol, rustc_hir::def::Namespace), core::option::Option<rustc_hir::def::Res<rustc_ast::node_id::NodeId>>>, &rustc_data_structures::unord::UnordMap<(rustc_span::symbol::Symbol, rustc_hir::def::Namespace), core::option::Option<rustc_hir::def::Res<rustc_ast::node_id::NodeId>>>>
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:466:9
  14: <rustc_metadata::rmeta::encoder::EncodeContext>::encode_def_ids
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:1516:13
  15: <rustc_metadata::rmeta::encoder::EncodeContext>::encode_crate_root::{closure#16}
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:639:33
  16: <rustc_metadata::rmeta::encoder::EncodeContext>::encode_crate_root
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:598:27
  17: rustc_metadata::rmeta::encoder::encode_metadata
             at /code/rust/compiler/rustc_metadata/src/rmeta/encoder.rs:2266:16
  18: rustc_metadata::fs::encode_and_write_metadata
             at /code/rust/compiler/rustc_metadata/src/fs.rs:65:13
  19: rustc_interface::passes::start_codegen
             at /code/rust/compiler/rustc_interface/src/passes.rs:929:44

Lines are from compiler commit 7f2fc33. Source code of the crate is available here.
Full backtrace

In the following diff, the first version (red) is more common.

--- obj/src/starnix/kernel/libstarnix_core.rlib.llvm-objdump.local	2024-04-22 16:14:34.523061172 +0000
+++ obj/src/starnix/kernel/libstarnix_core.rlib.llvm-objdump.remote	2024-04-22 16:14:51.439162227 +0000
@@ -355421,55 +355421,55 @@
  56c560 52d526d5 1ddc7edc 2ba5a001 e0a001a3  R.&...~.+.......
  56c570 a001e315 cb268525 a646930f d336a775  .....&.%.F...6.u
  56c580 c4558a13 b908ee24 c562c72a c4601502  .U.....$.b.*.`..
- 56c590 00000002 c4070100 01b7b904 01000012  ................
- 56c5a0 4d656d6f 72794d61 6e616765 723a3a6d  MemoryManager::m
- 56c5b0 6170c101 00020002 0002c001 0101000f  ap..............
- 56c5c0 010002ab c802000a 45727228 6572726e  ........Err(errn
- 56c5d0 6f29c100 00018daf 04000002 c4070200  o)..............
- 56c5e0 01b7b904 0200019f 8bdb0200 00020001  ................
+ 56c590 00000002 c4070100 01b7b904 0100000a  ................
+ 56c5a0 45727228 6572726e 6f29c101 00020002  Err(errno)......
+ 56c5b0 0002c001 0101000f 010002ab c802018d  ................
+ 56c5c0 af040000 02c40702 0000124d 656d6f72  ...........Memor
+ 56c5d0 794d616e 61676572 3a3a6d61 70c10000  yManager::map...
+ 56c5e0 01b7b904 0200019f 8bdb0202 00020001  ................
  56c5f0 0002c001 020002c4 07000001 8daf0401  ................
- 56c600 01000b00 8531018d af040200 01c78bdb  .....1..........
- 56c610 02010001 c78bdb02 020001b7 b9040001  ................
- 56c620 000100e1 31019f8b db020200 02c00100  ....1...........
- 56c630 01000402 aac80215 00277a78 3a3a7379  .........'zx::sy
- 56c640 733a3a7a 785f7468 72656164 5f737461  s::zx_thread_sta
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 25, 2024
@tmandry tmandry added A-metadata Area: Crate metadata T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. A-reproducibility Area: Reproducible / deterministic builds labels Apr 25, 2024
@tmandry
Copy link
Member Author

tmandry commented Apr 25, 2024

cc @petrochenkov, who authored #94857. Can you think of any obvious places to check for sources of nondeterminism?

@fangism
Copy link

fangism commented Apr 25, 2024

@fangism
Copy link

fangism commented Apr 25, 2024

Our determinism checker suddenly went stably green after #122053 was merged. https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/rust_toolchain.core.x64-determinism-build_only/b8749794715383439057/overview

It turns out there was another change in the Fuchsia source that landed in the same window between failing and passing runs that seems to have fixed the determinism issue (placing each rlib into its own exclusive directory in our build system). #122053 is not the critical change (false alarm).

@bjorn3
Copy link
Member

bjorn3 commented Apr 25, 2024

Just to check are you making sure that different crates with the same crate name have different -Cmetadata arguments passed in? And are you making sure that for all direct dependencies the rlib or rmeta path is explicitly passed in using --extern? If so there should be no non-determinism in resolving dependencies unlike what the comment on rust_one_rlib_per_dir suggests.

@tmandry
Copy link
Member Author

tmandry commented Apr 25, 2024

@bjorn3 We do all of those things, or at least attempt to. I can't tell what's going on, but I'd much rather be able to point to specific rlibs of transitive dependencies than rely on a selection algorithm that can change based on the state of my build directory (whether the build is buggy or not).

@tmandry
Copy link
Member Author

tmandry commented Apr 25, 2024

Since it looks like the nondeterminism was related to changes in inputs, I'm going to close this issue.

@tmandry tmandry closed this as completed Apr 25, 2024
@bjorn3
Copy link
Member

bjorn3 commented Apr 25, 2024

but I'd much rather be able to point to specific rlibs of transitive dependencies than rely on a selection algorithm that can change based on the state of my build directory

It shouldn't change no matter what the state of the build directory is. We record a crate hash for each dependency and only load transitive dependencies matching this crate hash. Each unique rlib should get a unique crate hash no matter how small the change is. If that is not the case, there is a bug somewhere in rustc. If you were building two crates with the same crate name and -Cmetadata it is possible that temporary artifacts get mixed up between them, but as you are passing unique -Cmetadata arguments, that shouldn't happen.

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Apr 27, 2024
@tmandry
Copy link
Member Author

tmandry commented May 7, 2024

It's possible the metadata keys aren't as unique as we thought. In any case it will be difficult to reproduce the issue outside of our build, and it's unlikely we will continue to investigate if we aren't seeing more determinism failures.

IMO the compiler forcing the use of search directories hinders debugging and reproducibility. I'd much rather point it to individual files and rely on such hashing mechanisms as a backup / error detection only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-metadata Area: Crate metadata A-reproducibility Area: Reproducible / deterministic builds T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

5 participants