Add suggested format for wasm frames in Error.stack #814

lukewagner · 2016-10-03T18:23:10Z

This PR adds some should clauses for wasm frames in Error.stack strings. Based on previous conversations, I think we already agreed for wasm to use bytecode offset as line number. Happy to bikeshed the default auto-generated function name.

jfbastien

I'm not sure that we want to mandate anything at all (even under "should"). Could you elaborate on the motivation?

jfbastien · 2016-10-03T20:30:46Z

JS.md

+  `wasm-function[${i}]` where `i` is the index of the function in the module's
+  [Code section](BinaryEncoding.md#code-section).
+* The filename *should* be that of the JS caller of the originating `Module`
+  constructor call.


Not the .wasm file if there was one?

The only JS APIs we have atm take an ArrayBuffer of bytes so I think this is the best we can do. With ES6 module integration, we could have the same URLs as JS <script>s. With Streams integration, if you WebAssembly.compile(stream), maybe we could use the URL that was fetched to produce stream.

Hmm, right.

jfbastien · 2016-10-03T20:31:34Z

JS.md

+  *should* be 1.
+* The function name *should* be generated from the template string literal 
+  `wasm-function[${i}]` where `i` is the index of the function in the module's
+  [Code section](BinaryEncoding.md#code-section).


wasm-function is the symbol section?

"or $i if no symbol section is present"?

Not quite: I'm saying that if you have a stack frame which you'd print as foo@url:line:col (if it was JS) and if you have no Names section (so you don't have any "foo"), that a function name would be generated, with strawman "wasm-function[0]" (where "0" is the index of the function).

Ah gotcha. Why not have wasm-function at all then, and not just $i?

Also: I'm not sure I understand where you mention the Names section. Could you say:

The function name should be generated from a module's Names section if present, otherwise the string literal $i should be used where i is the index of the function in the module's Code section.

The intention was just to be able to syntactically distinguish the wasm functions from the normal JS functions since you'll see them mixed into the same callstack. I did indeed forget to explicitly mention the Names section here; I'll add that.

Cool, update looks good.

pizlonator · 2016-10-03T21:03:45Z

Interesting. Regardling line/column numbers: it's possible to get some small compilation time boost if you allow yourself to fudge this a little. Example:

[1] int tmp = *p;
[2] tmp++;
[3] *p = tmp;

In this case, on x86, B3's instruction selector would emit something like:

addl $1, (p)

When this traps, B3 would have a choice of telling you that you trapped at [1], [2], or [3]. I believe we would choose either [2] or [3], depending on reasons. This saves some small amount of time and space, since it means that we don't have to think very hard about the "trap origin" of instructions. Of course none of this matters if you write the code the way you're most likely to write it:

*p++ // all on one line

What do you think of the following options for dealing with this:
A) Note in the spec that implementations may fudge the "line number" somehow. This amounts to a blurb saying that the line number is provided on a best-effort basis.
B) Promise that if provided, the "line number" is precise.

Honestly I don't know which is better. What do you think?

lukewagner · 2016-10-03T21:45:24Z

@jfbastien The motivation is just that we're going to all be generating something for Error.stack so it'd be nice for developers if we converge as much as possible to help out anyone who wants to, say, parse Error.stack "crash reports" that they send back to the server.

@pizlonator So I can definitely see the resource-saving value in allowing imprecise traps; previous discussions made me think everyone was interested in precise offsets. However, 99.9% of the time, the extra precision is probably wasted and so we could say Error.stack is imprecise and wait for some later proper stack inspection API and this would allow devs to opt-in to the extra code/metadata cost. Curious to hear other thoughts on this.

I feel like from your examples that we might be talking about different things, though. In this PR, I'm referring to the simple binary offset of a wasm instruction in the .wasm without any form of source notes to map back to the original source language. Thus there wouldn't be any provided line numbers (mentioned by (B)).

Regardless, your example points to a good question (if we reimagine lines 1-3 as the wasm instructions i32.load; i32.add; i32.store). In this particular case, though, I would assume that, since the initial i32.load dominates the i32.store, that when these three IR nodes were folded into one that the i32.load's trap information would get used for the resulting addl. But this wouldn't work if there wasn't a clear dominance relationship, e.g., if there is redundant code on both arms of an if. I'd expect the wasm producer to have eliminated most of this but in some cases (ISA-specific, bounds checks), it can't.

The more immediate overhead I think is the extra few bytes per possibly-trapping-instruction necessary (for a thunk or metadata, depending on impl) to note the bytecode offset.

pizlonator · 2016-10-04T16:44:12Z

@lukewagner First of all, after thinking about it more, this is one of those things that is more a problem for converting existing compilers to have good trapping support rather than a problem for clean room implementations. B3 made a choice for how to track origin data (what line, bytecode offset, file, etc code comes from) and that choice happens to not allow for precise tracking of traps. That's more my problem than anybody else's.

But then again maybe other implementers will also benefit from this wiggle room.

So I can definitely see the resource-saving value in allowing imprecise traps; previous discussions made me think everyone was interested in precise offsets. However, 99.9% of the time, the extra precision is probably wasted and so we could say Error.stack is imprecise and wait for some later proper stack inspection API and this would allow devs to opt-in to the extra code/metadata cost. Curious to hear other thoughts on this.

It's a bit weird for the spec to mandate precision without mandating a specific error.stack format. I think that it's healthy for the spec to allow implementers to do best-effort here. The outcome for compilers that have the same situation as B3 is that there will be rare corner cases in which the reported trapping instruction is wrong.

What I was trying to say about line numbers is that although the wasm instruction offset will be wrong, it will probably point to the same C++ line number. So, if this does get used for some telemetry, then my bug won't hurt anyone.

But I think that the important question is just: since we can't mandate a specific format for error.stack, what does it even mean for the contents to be precise?

lukewagner · 2016-10-04T17:01:14Z

@pizlonator Yep, good points. Given all these arguments and further discussion with the group, I think it probably does make sense to at least switch to best-effort and save precision for some explicit stack inspection future feature.

lukewagner · 2016-10-04T17:05:08Z

So that leaves bikeshedding of wasm-function[i] vs. i vs. other alternatives. Any other comments on this? @titzer what does v8 use?

jfbastien · 2016-10-04T17:06:25Z

I'd rather spell out WebAssembly, since that's what the JavaScript object does.

lukewagner · 2016-10-04T17:09:29Z

That could work for me; this particular choice in FF's impl predates Wasm->WebAssembly. That being said, wasm doesn't suffer the same Wasm vs. WASM vs. WAsm uglyfest.

titzer · 2016-10-04T18:19:00Z

We use:

"at func_name ([func_index]+bytecode_offset)"

On Tue, Oct 4, 2016 at 7:09 PM, Luke Wagner notifications@github.com
wrote:

That could work for me; this particular choice in FF's impl predates Wasm
->WebAssembly. That being said, wasm doesn't suffer the same Wasm vs. WASM
vs. WAsm uglyfest.

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#814 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/ALnq1FYmxkSntc-pbBwadCtaq224Qbd5ks5qwohfgaJpZM4KM4fI
.

titzer · 2016-10-04T18:20:32Z

@pizlonator Presumably B3's instruction selector doesn't cover more than one possibly-trapping instruction, so would it be possible to start with a range of bytecode offsets and then refine that with a post-processing of the bytecode to find the one possibly-trapping bytecode?

lukewagner · 2016-10-04T18:36:02Z

@titzer If there is no Names section I'm guessing func_name is simply absent? Is "<WASM>" a literal string? Judging by the "+", is bytecode_offset relative to the start of the function?

pizlonator · 2016-10-04T19:02:21Z

@titzer B3's instruction selector definitely covers more than one trapping instruction, like in the store(add(load(p), v), p) case. You could probably mostly do such a post-processing though. I just fear that it would only barely work, because you'd have to account for all of the tricks that the instruction selector and other parts of the compiler do. If we cared about this being more than best effort then we would add deliberate support for this.

I just think that if you want this level of precision then we should give it to you as API rather than a promise that the error.stack string contains the number.

Also, if error.stack was spec'd by TC39, then my position would be different. As it stands it's a very weird place to promise to squirrel away precise information.

-Filip

On Oct 4, 2016, at 11:20, titzer notifications@github.com wrote:

@pizlonator Presumably B3's instruction selector doesn't cover more than one possibly-trapping instruction, so would it be possible to start with a range of bytecode offsets and then refine that with a post-processing of the bytecode to find the one possibly-trapping bytecode?

—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.

AndrewScheidecker · 2016-10-04T20:05:26Z

AFAICT the current design doesn't provide much visibility into the reason for a trap, and so it's pretty forgiving about reordering/merging possibly-trapping instructions. The spec tests check that there is a trap, and the reason for the trap, but there aren't any tests of code that would trap for different reasons depending on execution order.

Once signal handling is added, allowing possibly-trapping instructions to be reordered would result in non-determinism. It seems practical to prohibit that; that wouldn't prohibit this specific load-add-store optimization, since the runtime could still invoke the signal handler as-if the load and store happened in two steps. Are there any optimizations that this would prevent (assuming the WASM bytecode is already optimized)?

If reordering possibly-trapping instructions is prohibited, then it does seem practical to give a precise offset to the trapping instruction. However, the call stack will also include return addresses. Is it practical to give precise offsets to the instruction to be returned to? I think it can be done for no cost (in the same sense as "zero-cost exception handling") if you resolve the offset for a trapping PC vs a return PC differently, but I'm not sure.

IMO it does seem worthwhile to work toward deterministic trap and return addresses for signal handling. But for this PR I would definitely find it useful to have a standard format for stack traces, even if they contained imprecise bytecode offsets.

jfbastien · 2016-10-04T21:07:45Z

@AndrewScheidecker in JS you can still observe the content of the ArrayBuffer, so precision matters in that you have to preserve some of the store ordering. That means that you can't e.g. vectorize a loop to execute it 4-wide when that would reorder stores. Whether wasm is the right place for this type of optimization is another question...

pizlonator · 2016-10-04T21:46:13Z

@jfbastien I wouldn't say "can't". Precision doesn't preclude optimizations, it just makes them more expensive to run. In the case of vectorization, you could do it if you version the loop - something that a vectorizer will probably do anyway.

lukewagner · 2016-12-20T23:47:37Z

Ok, I think the resolution here is Error.stack isn't the right place for a wasm engine to reliably expose bytecode offsets (that'd be some new API which could be more carefully defined) nor is wasm in a position to attempt to standardize JS's stack string format.

Add Error.stack SHOULD clause

19b8573

jfbastien reviewed Oct 3, 2016

View reviewed changes

Mention Names section for function names

85ea06d

Change offset to best-effort

de99b42

lukewagner mentioned this pull request Dec 1, 2016

Export JSFunctions should have string names #883

Closed

lukewagner closed this Dec 20, 2016

lukewagner deleted the error-stack branch December 20, 2016 23:47

dschuff mentioned this pull request Feb 15, 2017

Display in browsers and tools for addresses of functions and instructions #990

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add suggested format for wasm frames in Error.stack #814

Add suggested format for wasm frames in Error.stack #814

lukewagner commented Oct 3, 2016

jfbastien left a comment

jfbastien Oct 3, 2016

lukewagner Oct 3, 2016

jfbastien Oct 3, 2016

jfbastien Oct 3, 2016

lukewagner Oct 3, 2016

jfbastien Oct 3, 2016

lukewagner Oct 4, 2016

jfbastien Oct 4, 2016

pizlonator commented Oct 3, 2016

lukewagner commented Oct 3, 2016 •

edited

Loading

pizlonator commented Oct 4, 2016

lukewagner commented Oct 4, 2016

lukewagner commented Oct 4, 2016 •

edited

Loading

jfbastien commented Oct 4, 2016

lukewagner commented Oct 4, 2016

titzer commented Oct 4, 2016

titzer commented Oct 4, 2016

lukewagner commented Oct 4, 2016 •

edited

Loading

pizlonator commented Oct 4, 2016

AndrewScheidecker commented Oct 4, 2016

jfbastien commented Oct 4, 2016

pizlonator commented Oct 4, 2016

lukewagner commented Dec 20, 2016

Add suggested format for wasm frames in Error.stack #814

Add suggested format for wasm frames in Error.stack #814

Conversation

lukewagner commented Oct 3, 2016

jfbastien left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pizlonator commented Oct 3, 2016

lukewagner commented Oct 3, 2016 • edited Loading

pizlonator commented Oct 4, 2016

lukewagner commented Oct 4, 2016

lukewagner commented Oct 4, 2016 • edited Loading

jfbastien commented Oct 4, 2016

lukewagner commented Oct 4, 2016

titzer commented Oct 4, 2016

titzer commented Oct 4, 2016

lukewagner commented Oct 4, 2016 • edited Loading

pizlonator commented Oct 4, 2016

AndrewScheidecker commented Oct 4, 2016

jfbastien commented Oct 4, 2016

pizlonator commented Oct 4, 2016

lukewagner commented Dec 20, 2016

lukewagner commented Oct 3, 2016 •

edited

Loading

lukewagner commented Oct 4, 2016 •

edited

Loading

lukewagner commented Oct 4, 2016 •

edited

Loading