-
Notifications
You must be signed in to change notification settings - Fork 752
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
asm2wasm debuginfo #895
asm2wasm debuginfo #895
Conversation
Some error on the bots and locally |
d46b672
to
23c5752
Compare
ignore --debuginfo if not emitting text, as wasm binaries don't support that yet
23c5752
to
55b7289
Compare
…binary, all we can do is the Names section
ff420e1
to
3e75a78
Compare
This should be good now, if anyone wants to review. |
src/asm2wasm.h
Outdated
if (out + 100 >= end) { | ||
Fatal() << "error in handling debug info"; | ||
} | ||
if (input[0] == '/' && input[1] == '/' && input[2] == '@' && input[3] == 'l' && input[4] == 'i' && input[5] == 'n' && input[6] == 'e') { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about startsWith(input, "//@line")
? Providing startsWith
in a utility header.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixing. I worry it will be less efficient, but I guess a modern compiler should be able to do this well...
src/asm2wasm.h
Outdated
*out++ = *input++; | ||
*out++ = *input++; | ||
*out++ = *input++; | ||
*out++ = *input++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memcpy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure.
src/asm2wasm.h
Outdated
// before it - this is guaranteed to be correct without opts, | ||
// and is usually decently accurate with them. | ||
auto size = strlen(input); | ||
auto upperBound = Index(size * 1.25) + 100; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Magic numbers are magic :)
Does this 100 match the 100 a few lines down?
Would be good to replace these with named constants for clarity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good point, fixing.
src/asm2wasm.h
Outdated
@@ -565,6 +649,16 @@ class Asm2WasmBuilder { | |||
} | |||
|
|||
Function* processFunction(Ref ast); | |||
|
|||
public: | |||
CallImport* isDebugInfo(Expression* curr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is___
makes me thing "boolean". Better name would be just debugInfo
, which can still be used in a boolean context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, yeah. Changing to checkDebugInfo
which is consistent with other stuff in wasm.h.
@@ -19,6 +19,7 @@ | |||
|
|||
#include <atomic> | |||
#include <cassert> | |||
#include <cstdlib> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's needed for abort
, this just wasn't noticed before because other locations happened to have the include anyhow, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense
Thanks for the feedback, added a commit with fixes. |
@@ -19,6 +19,7 @@ | |||
|
|||
#include <atomic> | |||
#include <cassert> | |||
#include <cstdlib> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, makes sense
src/asm2wasm.h
Outdated
std::string DEBUGINFO_INTRINSIC = EMSCRIPTEN_DEBUGINFO.str; | ||
auto DEBUGINFO_INTRINSIC_SIZE = DEBUGINFO_INTRINSIC.size(); | ||
bool seenUseAsm = false; | ||
auto startsWith = [&](const char *s) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would make this a top-level function as opposed to a lambda
- when reading this code it can be assumed that
startsWith
exists elsewhere, so it doesn't interrupt the flow of reading this function - the way its currently implemented a better name would be
inputAtTheCurrentPositionStartsWith("foo")
. I feelstartsWith(input, "foo")
is more explicit at the call site and requires less context for the reader to keep in their head
Both should be inlineable by the compiler so it should be equivalent to the unrolled version, either way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair points, but which toplevel? Of the object? Or the entire file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my head it's a top-level function that requires no state, so of the entire file. In theory it could be put in a utils header and shared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, refactored to top level in the file
src/asm2wasm.h
Outdated
out += line.size(); | ||
*out++ = ')'; | ||
*out++ = ';'; | ||
} else if (!seenUseAsm && (startsWith("asm'") || startsWith("asm\""))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could theoretically be optimized by the compiler to be equivalent to what came before, but if (!seenUseAsm && startsWith(input, "asm") && (input[3] == '\'' || input[3] == '"'))
is a possible rewrite that should almost-certainly be the same, while being I think equally legible. (I feel ambivalent about this either way though)
// before it - this is guaranteed to be correct without opts, | ||
// and is usually decently accurate with them. | ||
const auto SCALE_FACTOR = 1.25; // an upper bound on how much more space we need as a multiple of the original | ||
const auto ADD_FACTOR = 100; // an upper bound on how much we write for each debug info element itself |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These comments are pretty helpful, thanks!
src/asm2wasm.h
Outdated
static bool startsWith(const char* string, const char *prefix) { | ||
while (1) { | ||
if (*prefix == 0) return true; | ||
if (*string++ != *prefix++) return false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should also probably return false if *string == 0
for completeness, e.g. startsWith("foo", "foobar")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, fixing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Modulo nits, lgtm!
src/asm2wasm.h
Outdated
} else if (!seenUseAsm && (startsWith(input, "asm'") || startsWith(input, "asm\""))) { | ||
// end of "use asm" or "almost asm" | ||
seenUseAsm = true; | ||
memcpy(out, input, 5); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably worth pulling the 5
s into a constant?
Actually yeah it's not obvious why 5 and not 4, presumably consuming asm\'\n
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, it's actually the semicolon at the end, "use asm";
. I'll clarify.
strcpy(out, line.c_str()); | ||
out += line.size(); | ||
*out++ = ')'; | ||
*out++ = ';'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing to do here just a thought/comment/observation:
This interleaving section of strcpy
and *out++ = char
is pretty verbose and I wish there was a cleaner/more compact way to do it.
Something like outStream << DEBUGINFO_INTRINSIC << '(' << fileIndex << ',' << line << ");"
, though that would require a very different model of how this is done and probably isn't worth it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, might be a better way. The code is not expected to change much if at all, though, so I think it's maintainable as it is. If it were evolving code definitely it would need to be refactored.
Users have complained that debugging wasm is hard, and it'll be a while til we have proper source maps or such. Meanwhile, this pr might help a little, it makes asm2wasm use debug info when present, and emit it as comments in the wast. For example, hello world might look like this:
Technically, the debug info is translated into intrinsic calls. This is for efficiency, so that they incur no overhead when not using it (so we don't make each node have a debug info field or anything like that). Then we compile (and possibly optimize), and then we translate those intrinsics into annotations on the nodes, which are a map on the side per function of
node => annotation
. Then the printer just prints the annotation with the node, if there is one.