-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add suggested format for wasm frames in Error.stack #814
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that we want to mandate anything at all (even under "should"). Could you elaborate on the motivation?
`wasm-function[${i}]` where `i` is the index of the function in the module's | ||
[Code section](BinaryEncoding.md#code-section). | ||
* The filename *should* be that of the JS caller of the originating `Module` | ||
constructor call. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not the .wasm
file if there was one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only JS APIs we have atm take an ArrayBuffer
of bytes so I think this is the best we can do. With ES6 module integration, we could have the same URLs as JS <script>s. With Streams integration, if you WebAssembly.compile(stream)
, maybe we could use the URL that was fetched to produce stream
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, right.
*should* be 1. | ||
* The function name *should* be generated from the template string literal | ||
`wasm-function[${i}]` where `i` is the index of the function in the module's | ||
[Code section](BinaryEncoding.md#code-section). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wasm-function
is the symbol section?
"or $i
if no symbol section is present"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite: I'm saying that if you have a stack frame which you'd print as foo@url:line:col (if it was JS) and if you have no Names section (so you don't have any "foo"), that a function name would be generated, with strawman "wasm-function[0]" (where "0" is the index of the function).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah gotcha. Why not have wasm-function
at all then, and not just $i
?
Also: I'm not sure I understand where you mention the Names section. Could you say:
The function name should be generated from a module's Names section if present, otherwise the string literal
$i
should be used wherei
is the index of the function in the module's Code section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The intention was just to be able to syntactically distinguish the wasm functions from the normal JS functions since you'll see them mixed into the same callstack. I did indeed forget to explicitly mention the Names section here; I'll add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, update looks good.
Interesting. Regardling line/column numbers: it's possible to get some small compilation time boost if you allow yourself to fudge this a little. Example: [1] int tmp = *p; In this case, on x86, B3's instruction selector would emit something like: addl $1, (p) When this traps, B3 would have a choice of telling you that you trapped at [1], [2], or [3]. I believe we would choose either [2] or [3], depending on reasons. This saves some small amount of time and space, since it means that we don't have to think very hard about the "trap origin" of instructions. Of course none of this matters if you write the code the way you're most likely to write it: *p++ // all on one line What do you think of the following options for dealing with this: Honestly I don't know which is better. What do you think? |
@jfbastien The motivation is just that we're going to all be generating something for @pizlonator So I can definitely see the resource-saving value in allowing imprecise traps; previous discussions made me think everyone was interested in precise offsets. However, 99.9% of the time, the extra precision is probably wasted and so we could say I feel like from your examples that we might be talking about different things, though. In this PR, I'm referring to the simple binary offset of a wasm instruction in the Regardless, your example points to a good question (if we reimagine lines 1-3 as the wasm instructions The more immediate overhead I think is the extra few bytes per possibly-trapping-instruction necessary (for a thunk or metadata, depending on impl) to note the bytecode offset. |
@lukewagner First of all, after thinking about it more, this is one of those things that is more a problem for converting existing compilers to have good trapping support rather than a problem for clean room implementations. B3 made a choice for how to track origin data (what line, bytecode offset, file, etc code comes from) and that choice happens to not allow for precise tracking of traps. That's more my problem than anybody else's. But then again maybe other implementers will also benefit from this wiggle room.
It's a bit weird for the spec to mandate precision without mandating a specific error.stack format. I think that it's healthy for the spec to allow implementers to do best-effort here. The outcome for compilers that have the same situation as B3 is that there will be rare corner cases in which the reported trapping instruction is wrong. What I was trying to say about line numbers is that although the wasm instruction offset will be wrong, it will probably point to the same C++ line number. So, if this does get used for some telemetry, then my bug won't hurt anyone. But I think that the important question is just: since we can't mandate a specific format for error.stack, what does it even mean for the contents to be precise? |
@pizlonator Yep, good points. Given all these arguments and further discussion with the group, I think it probably does make sense to at least switch to best-effort and save precision for some explicit stack inspection future feature. |
So that leaves bikeshedding of |
I'd rather spell out WebAssembly, since that's what the JavaScript object does. |
That could work for me; this particular choice in FF's impl predates |
We use: "at func_name ([func_index]+bytecode_offset)" On Tue, Oct 4, 2016 at 7:09 PM, Luke Wagner notifications@github.com
|
@pizlonator Presumably B3's instruction selector doesn't cover more than one possibly-trapping instruction, so would it be possible to start with a range of bytecode offsets and then refine that with a post-processing of the bytecode to find the one possibly-trapping bytecode? |
@titzer If there is no Names section I'm guessing func_name is simply absent? Is "<WASM>" a literal string? Judging by the "+", is bytecode_offset relative to the start of the function? |
@titzer B3's instruction selector definitely covers more than one trapping instruction, like in the store(add(load(p), v), p) case. You could probably mostly do such a post-processing though. I just fear that it would only barely work, because you'd have to account for all of the tricks that the instruction selector and other parts of the compiler do. If we cared about this being more than best effort then we would add deliberate support for this. I just think that if you want this level of precision then we should give it to you as API rather than a promise that the error.stack string contains the number. Also, if error.stack was spec'd by TC39, then my position would be different. As it stands it's a very weird place to promise to squirrel away precise information. -Filip
|
AFAICT the current design doesn't provide much visibility into the reason for a trap, and so it's pretty forgiving about reordering/merging possibly-trapping instructions. The spec tests check that there is a trap, and the reason for the trap, but there aren't any tests of code that would trap for different reasons depending on execution order. Once signal handling is added, allowing possibly-trapping instructions to be reordered would result in non-determinism. It seems practical to prohibit that; that wouldn't prohibit this specific load-add-store optimization, since the runtime could still invoke the signal handler as-if the load and store happened in two steps. Are there any optimizations that this would prevent (assuming the WASM bytecode is already optimized)? If reordering possibly-trapping instructions is prohibited, then it does seem practical to give a precise offset to the trapping instruction. However, the call stack will also include return addresses. Is it practical to give precise offsets to the instruction to be returned to? I think it can be done for no cost (in the same sense as "zero-cost exception handling") if you resolve the offset for a trapping PC vs a return PC differently, but I'm not sure. IMO it does seem worthwhile to work toward deterministic trap and return addresses for signal handling. But for this PR I would definitely find it useful to have a standard format for stack traces, even if they contained imprecise bytecode offsets. |
@AndrewScheidecker in JS you can still observe the content of the ArrayBuffer, so precision matters in that you have to preserve some of the store ordering. That means that you can't e.g. vectorize a loop to execute it 4-wide when that would reorder stores. Whether wasm is the right place for this type of optimization is another question... |
@jfbastien I wouldn't say "can't". Precision doesn't preclude optimizations, it just makes them more expensive to run. In the case of vectorization, you could do it if you version the loop - something that a vectorizer will probably do anyway. |
Ok, I think the resolution here is |
This PR adds some should clauses for wasm frames in
Error.stack
strings. Based on previous conversations, I think we already agreed for wasm to use bytecode offset as line number. Happy to bikeshed the default auto-generated function name.