Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support 5 segment source mappings #6795

Merged
merged 40 commits into from
Oct 1, 2024

Conversation

osa1
Copy link
Contributor

@osa1 osa1 commented Jul 31, 2024

@osa1 osa1 changed the title 5 segment source mappings Implement 5 segment source mappings Jul 31, 2024
@osa1 osa1 changed the title Implement 5 segment source mappings Support 5 segment source mappings Jul 31, 2024
@kripken
Copy link
Member

kripken commented Aug 7, 2024

What are your plans for using this in practice? I'm not aware of other toolchains doing so, so I'm curious.

@osa1
Copy link
Contributor Author

osa1 commented Aug 8, 2024

When I created this PR I was using the name segment in dart2wasm to name generated code with the source code function that they belong.

However as you note, browsers don't use the name segments in source maps to give names to stack frames, and I'm unable to find where the name segments are used (if they actually are).

So currently I don't have a use case and this isn't a blocker for me. We may still want to merge it for completeness, the source maps spec allows this and there may be some toolchain that makes use of it.

osa1 added a commit to osa1/sdk that referenced this pull request Aug 8, 2024
To be able to know when we are generating a source map, make `dart
compile wasm` aware of the `--no-source-maps` flag.

Note: wasm-opt currently cannot handle the mappings we generate. This
will be merged after WebAssembly/binaryen#6794
and WebAssembly/binaryen#6795.
@kripken
Copy link
Member

kripken commented Aug 8, 2024

I see, thanks for the update. In that case, I think I prefer not to merge this, if we can't test it against a known correct implementation, and don't have a current user. But we can leave the PR open, maybe there are other people that could benefit, and might find this and comment.

@osa1
Copy link
Contributor Author

osa1 commented Sep 16, 2024

@kripken we now have a use case for this, as described in dart-lang/sdk#56718 (comment).

The TL;DR is stack traces of crashes in optimized dart2wasm apps are logged in an external service which has access to the source map file.

The source map file is used to map code offsets in the stack trace to names.

Optimized dart2wasm apps don't have function names as names section can add ~70% overhead to binaries.

To support this use case we need to add names to mappings.

Would it be possible to merge this?

@kripken
Copy link
Member

kripken commented Sep 16, 2024

@osa1 If there is a use case then we should review and get this landed, I agree, but I don't think I understand the need there yet.

The standard solution I am aware of for that name problem is to store the name mapping on the side, which can be done by saving the name section in addition to the source map file. One way is to save the entire unstripped binary. Another approach is to save just the names themselves, for which e.g. wasm-opt --symbolmap can be used (that is what Emscripten does).

Is there a reason those existing solutions can't work for your use case?

@mkustermann
Copy link
Contributor

In the native world this is done via DWARF debugging information obtained when stripping the production binary (i.e. one obtains a single file that can be used to decode production stack traces). For wasm we don't use DWARF (as that disables certain binaryen optimizations) - so we instead rely on source maps for e.g. line number information.

If the method names came from a symbolmap file, we'd still need to have the source map file as well to get line numbers. It seems very inconvenient to have this metadata distributed across two different files - with different formats, where one needs different tools to decode different components of the frames and then merge this information.

@kripken
Copy link
Member

kripken commented Sep 16, 2024

@mkustermann Two files does add some complexity, that is true. But in my experience on the Emscripten side that has not been much of an issue in practice. And as for needing separate tools, we have a small script that can call out to the various tools,

https://github.com/emscripten-core/emscripten/blob/main/emsymbolizer.py

That script can then handle a symbol map, a source map, or DWARF.

To be clear, I'm not saying I'm opposed to this PR. If you feel strongly that it is helpful for you then I can be convinced. I just want to make sure you have considered what I think is the simpler option of using the wasm names section as a symbol map.

@osa1
Copy link
Contributor Author

osa1 commented Sep 17, 2024

the simpler option of using the wasm names section as a symbol map

Do you mean simple for binaryen devs or end users? I think the simplicity here is for binaryen but not necessarily for the users of binaryen.

dart2wasm needs to pass the right flags to wasm-opt to generate symbol maps, then Flutter needs to pass the right flags to dart2wasm. (Flutter doesn't give users a way to pass extra compiler flags, so it needs to be updated)

End users will have to deal with 3 files instead of 2, and if they already had the implementation of mapping stack traces to source locations using a separately distributed source map file (as in our use case linked above) they won't be able to reuse much of their code.

@mkustermann
Copy link
Contributor

mkustermann commented Sep 17, 2024

@kripken Maybe as some background. The request originated due to sentry.io (a very popular crash reporting framework) to extending their flutter support to flutter web apps compiled to wasm (see sentry's bug: dart-lang/sdk#56718). They probably have existing support for uploading source maps and things just work for dart2js code but not for dart2wasm code.

So the real complexity comes due to the fact that from the lowest level of the stack (the dart2wasm compiler), to flutter tooling, to sentry tooling to sentry API server uploading to sentry API server symbolization code to all need to deal with the extra wasm symbols map. So it may require e.g. sentry's crash reporting service to change their client tooling & server APIs to allow upload 3 files (wasm, source map, wasm symbolmap), change their server code that symbolizes stacks to use different tools to get different information (function names and line numbers) and then combine that.

And this isn't dart/flutter specific. Anyone using WasmGC & Binaryen will use source maps (because DWARF disables optimizations) and would then need to have to deal with this extra symbolmap file all the way up their stack.

It seems like the source map format supports the function names for that exact reason, so why not take advantage of it and avoid this complexity?

@kripken
Copy link
Member

kripken commented Sep 17, 2024

Fair enough, as I said, if you feel strongly then I don't object. I'll find time later today to review this in detail.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code lgtm % nits. Please add testing.

src/wasm/wasm-binary.cpp Outdated Show resolved Hide resolved
src/wasm/wasm-binary.cpp Outdated Show resolved Hide resolved
Co-authored-by: Alon Zakai <alonzakai@gmail.com>
@osa1 osa1 requested a review from kripken September 19, 2024 14:08
@osa1
Copy link
Contributor Author

osa1 commented Sep 19, 2024

@kripken thanks, ptal.

Copy link
Member

@kripken kripken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code lgtm, so all that is left is testing.

@osa1
Copy link
Contributor Author

osa1 commented Sep 26, 2024

I believe this should be almost ready now, but I'm not sure how to test changes in module-utils.cpp.

@kripken do we have any tests right now that checks source map info after doing inlining, and after merging modules? The changes in module-utils.cpp will be used when inlining and merging modules, but I can't find where debug info sanity is tested after inlining and merging modules.

@kripken
Copy link
Member

kripken commented Sep 26, 2024

It looks like the code above your changes to module-utils.cpp was added in f44912b (#6372), which includes tests. See specifically the test code in test/lit/merge/sourcemap.wat* for merging.

I don't see specific tests for inlining + source maps, but one could be added alongside e.g. test/lit/debug/source-map-smearing.wast, by adding --inlining. Though perhaps the test can be even simpler, without writing a binary in the middle? Just adding ;;@ annotations to an inlining test like test/lit/passes/inlining_all-features.wast should check that we move the annotations around properly.

@osa1 osa1 marked this pull request as ready for review September 27, 2024 09:45
@osa1
Copy link
Contributor Author

osa1 commented Sep 27, 2024

Thanks @kripken.

This should be ready for review now.

I've added a separate test for inlining instead of modifying one of the existing tests, as there weren't any existing test that checks specifically source maps AFAICS.

The renumbering in lines and columns in existing tests is to make sure each source map annotation has a unique line and column number, to make sure we don't accidentally use numbers of a wrong annotation.

The TODO in the code is copy/paste from a few lines above. When writing strings to the .map files we don't escape characters according to the JSON spec. It's an existing issue.

@osa1 osa1 marked this pull request as draft September 27, 2024 09:56
@osa1 osa1 marked this pull request as ready for review September 27, 2024 10:02
src/binaryen-c.cpp Outdated Show resolved Hide resolved
src/ir/module-utils.h Outdated Show resolved Hide resolved
src/parser/contexts.h Outdated Show resolved Hide resolved
src/ir/module-utils.cpp Outdated Show resolved Hide resolved
src/ir/module-utils.cpp Show resolved Hide resolved
src/passes/Print.cpp Outdated Show resolved Hide resolved
src/wasm.h Outdated Show resolved Hide resolved
@osa1
Copy link
Contributor Author

osa1 commented Oct 1, 2024

@kripken I've addressed the comments but I'm not sure how to update the C API with an optional uint32_t parameter, see my question above.

@osa1 osa1 requested a review from kripken October 1, 2024 20:06
@kripken kripken merged commit 347fc8a into WebAssembly:main Oct 1, 2024
13 checks passed
@osa1 osa1 deleted the 5_segment_source_mappings branch October 1, 2024 21:45
brendandahl added a commit to brendandahl/emscripten that referenced this pull request Oct 2, 2024
brendandahl added a commit to emscripten-core/emscripten that referenced this pull request Oct 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants