Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong and missing information in PDB file #96475

Closed
MolecularMatters opened this issue Apr 27, 2022 · 6 comments · Fixed by #115704
Closed

Wrong and missing information in PDB file #96475

MolecularMatters opened this issue Apr 27, 2022 · 6 comments · Fixed by #115704
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. O-windows-msvc Toolchain: MSVC, Operating system: Windows T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@MolecularMatters
Copy link

MolecularMatters commented Apr 27, 2022

With both Rust 1.60 stable as well as the latest nightly (rustc 1.62.0-nightly (082e4ca 2022-04-26)), rustc currently does not emit correct S_OBJNAME records and LF_BUILDINFO records into the PDB on Windows.

Steps to reproduce:

  • cargo new repro
  • cd repro
  • cargo build
  • dump the PDB using either DIA2Dump or llvm pdbutil

S_OBJNAME:
The S_OBJNAME record is stored in the PDB for each module stream, and contains the full path to the compiled .o file. This is supported by LLVM at: https://github.com/llvm/llvm-project/blob/9592e88f59cfd399e9285cfec50a23675f43a43a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp#L783

At the moment, this record is empty for all .o files stored in a PDB.

LF_BUILDINFO:
Similarly, the LF_BUILDINFO record stores the following information for each module stream in the PDB:

  • the current working directory
  • path to the build tool, e.g. rustc
  • source file
  • PDB file (not applicable in this case)
  • command-line used to compile the corresponding .o

This is supported by LLVM at: https://github.com/llvm/llvm-project/blob/9592e88f59cfd399e9285cfec50a23675f43a43a/llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp#L920

At the moment, with rustc there are the following defects in the LF_BUILDINFO record:

  • path to the build tool is missing, i.e. uses an invalid type index, which makes DIA2Dump choke and llvm pdbutil crash
  • command-line is missing, again uses an invalid type index
  • source file is wrong, e.g. it is "src\main.rs@\2j9iw9kw1zbigor6" where it should be "src\main.rs"

Tools like Live++ (https://liveplusplus.tech) need this information in order to be able to recompile individual .o files.
The relevant Zulip discussion can be found here: https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Live.2B.2B.20for.20Rust

@bjorn3 bjorn3 added A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. O-windows-msvc Toolchain: MSVC, Operating system: Windows labels Apr 29, 2022
@nebulark
Copy link
Contributor

nebulark commented Jul 7, 2023

I've take a shot fixing this issue and have a working fix. Note that have very little knowlege in that code area, so this is probably not ideal:

The problem seems that Arg0 and CommandLineArgs of the MCTargetOptions cpp class are not set here:

extern "C" LLVMTargetMachineRef LLVMRustCreateTargetMachine(

My fix would now set thise member in the mentioned function. However, due to pointer lifetime requirements I had to add another function which saves the commandline args for the programs duration in the passwrapper.

I have a draft here:
https://github.com/nebulark/rust_custom/commit/8acc93124c6de80fea5c7efc648c5dd56775d74a
Patchfile in case the link dies:
draft-added-CL-and-CMD-to-pdb.patch

This should now fix the missing path to the build tool, as well as missing commandline.

I have no clue what to do about the 3rd point 'source file is wrong, e.g. it is "src\main.rs@\2j9iw9kw1zbigor6" where it should be "src\main.rs"'. As far as I understand there can only be a single file, which would be the cpp in C++. Due to n:m mapping of sourcfiles to cgus, the current value of root rs file + cgu name makes sense to me as it is.

However, I found a "workaround": Using Dia2Dump one can find all .rs files for a given cgu in the *** FILES section. So while the information might not be as nice, the info must be present in the PDB so this should be good enough?

@MolecularMatters
Copy link
Author

MolecularMatters commented Jul 10, 2023

I have no clue what to do about the 3rd point 'source file is wrong, e.g. it is "src\main.rs@\2j9iw9kw1zbigor6" where it should be "src\main.rs"'. As far as I understand there can only be a single file, which would be the cpp in C++. Due to n:m mapping of sourcfiles to cgus, the current value of root rs file + cgu name makes sense to me as it is.

You are correct, the compiland environment in the PDB only supports one file being specified as source file.
If we can guarantee that the source file specified in the PDB is always of the form "sourceFilename@cguName" then that should be fine.

By the way, how are these names of cgus generated? Is this a hash of the contents? Or just some unique identifier?

However, I found a "workaround": Using Dia2Dump one can find all .rs files for a given cgu in the *** FILES section. So while the information might not be as nice, the info must be present in the PDB so this should be good enough?

Yes, that should be enough.
We use the information stored in the module information stream and individual module streams (named *** FILES in Dia2Dump output) for building a dependency tree used by Live++.
If this information is present for all cgus associated with an .rs file, then we should be able to (re)build the full N:M dependency mapping, essentially giving us enough information to recompile individual .rs files.

I can give your changes a try as soon as they land in a nightly build.

@bjorn3
Copy link
Member

bjorn3 commented Jul 10, 2023

By the way, how are these names of cgus generated? Is this a hash of the contents? Or just some unique identifier?

The cgu name is a unique identifier. The exact format depends on if incremental mode is enabled or not and if the flag to use human readable cgu names is used or not. I'm not sure where the src\main.rs@\2j9iw9kw1zbigor6 string is generated though. It may be generated by LLVM rather than rustc. As such I don't have any clue how stable this format is.

Yes, that should be enough.
We use the information stored in the module information stream and individual module streams (named *** FILES in Dia2Dump output) for building a dependency tree used by Live++.
If this information is present for all cgus associated with an .rs file, then we should be able to (re)build the full N:M dependency mapping, essentially giving us enough information to recompile individual .rs files.

If you include_bytes!() a file I believe it doesn't end up in the debuginfo. And dependencies may not end up in there either. The .d files emitted by rustc when using --emit dep-info (as cargo uses by default for it's own recompilation logic) will contain all information necessary to determine if a crate should be recompiled if -Zbinary-dep-depinfo is used (without library dependencies are missing, but all source files of the crate itself are still mentioned) To actually recompile it you do need the full commandline invocation though. Cargo passes -Cmetadata with an unpredictable value which affects the symbol names of everything defined inside the crate.

bors added a commit to rust-lang-ci/rust that referenced this issue Sep 8, 2023
Add CL and CMD into to pdb debug info

Partial fix for rust-lang#96475

The Arg0 and CommandLineArgs of the MCTargetOptions cpp class are not set within https://github.com/rust-lang/rust/blob/bb548f964572f7fe652716f5897d9050a31c936e/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L378

This causes LLVM to not  neither output any compiler path (cl) nor the arguments that were used when invoking it (cmd) in the PDB file.

This fix adds the missing information to the target machine so LLVM can use it.
github-actions bot pushed a commit to rust-lang/miri that referenced this issue Sep 12, 2023
Add CL and CMD into to pdb debug info

Partial fix for rust-lang/rust#96475

The Arg0 and CommandLineArgs of the MCTargetOptions cpp class are not set within https://github.com/rust-lang/rust/blob/bb548f964572f7fe652716f5897d9050a31c936e/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp#L378

This causes LLVM to not  neither output any compiler path (cl) nor the arguments that were used when invoking it (cmd) in the PDB file.

This fix adds the missing information to the target machine so LLVM can use it.
@nagisa nagisa added the E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. label Sep 25, 2023
@bors bors closed this as completed in 6f13ea0 Sep 25, 2023
@capickett

This comment was marked as outdated.

@ChrisDenton
Copy link
Member

It is generally better to open a new issue rather than on an old closed one.

@capickett
Copy link
Contributor

Apologies for that. I've moved this to #128842.

workingjubilee added a commit to workingjubilee/rustc that referenced this issue Sep 11, 2024
Add test for S_OBJNAME & update test for LF_BUILDINFO cl and cmd

Update the unit test for checking cl and cmd in LF_BUILDINFO. With llvm-pdbutil we can now more specifically check if the string appears at the right location instead of just checking whether the string exists at all.
Context: rust-lang#96475
workingjubilee added a commit to workingjubilee/rustc that referenced this issue Sep 11, 2024
Add test for S_OBJNAME & update test for LF_BUILDINFO cl and cmd

Update the unit test for checking cl and cmd in LF_BUILDINFO. With llvm-pdbutil we can now more specifically check if the string appears at the right location instead of just checking whether the string exists at all.
Context: rust-lang#96475
Zalathar added a commit to Zalathar/rust that referenced this issue Sep 12, 2024
Add test for S_OBJNAME & update test for LF_BUILDINFO cl and cmd

Update the unit test for checking cl and cmd in LF_BUILDINFO. With llvm-pdbutil we can now more specifically check if the string appears at the right location instead of just checking whether the string exists at all.
Context: rust-lang#96475
Zalathar added a commit to Zalathar/rust that referenced this issue Sep 12, 2024
Add test for S_OBJNAME & update test for LF_BUILDINFO cl and cmd

Update the unit test for checking cl and cmd in LF_BUILDINFO. With llvm-pdbutil we can now more specifically check if the string appears at the right location instead of just checking whether the string exists at all.
Context: rust-lang#96475
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Sep 12, 2024
Rollup merge of rust-lang#130156 - nebulark:test_buildinfo, r=jieyouxu

Add test for S_OBJNAME & update test for LF_BUILDINFO cl and cmd

Update the unit test for checking cl and cmd in LF_BUILDINFO. With llvm-pdbutil we can now more specifically check if the string appears at the right location instead of just checking whether the string exists at all.
Context: rust-lang#96475
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.) E-needs-test Call for participation: An issue has been fixed and does not reproduce, but no test has been added. O-windows-msvc Toolchain: MSVC, Operating system: Windows T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
6 participants