Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

asm2wasm debuginfo #895

Merged
merged 7 commits into from
Feb 7, 2017
Merged

asm2wasm debuginfo #895

merged 7 commits into from
Feb 7, 2017

Conversation

kripken
Copy link
Member

@kripken kripken commented Feb 3, 2017

Users have complained that debugging wasm is hard, and it'll be a while til we have proper source maps or such. Meanwhile, this pr might help a little, it makes asm2wasm use debug info when present, and emit it as comments in the wast. For example, hello world might look like this:

  (func $_main (result i32)
    ;; tests/hello_world.c:4
    (drop
      (call $_printf
        (i32.const 1144)
        (i32.const 1160)
      )
    )
    ;; tests/hello_world.c:5
    (return
      (i32.const 0)
    )
  )

Technically, the debug info is translated into intrinsic calls. This is for efficiency, so that they incur no overhead when not using it (so we don't make each node have a debug info field or anything like that). Then we compile (and possibly optimize), and then we translate those intrinsics into annotations on the nodes, which are a map on the side per function of node => annotation. Then the printer just prints the annotation with the node, if there is one.

@kripken
Copy link
Member Author

kripken commented Feb 3, 2017

Some error on the bots and locally other.test_binaryen_names fails - posting to remind myself to look at this tomorrow.

@kripken kripken force-pushed the asm2wasm-debuginfo branch 3 times, most recently from d46b672 to 23c5752 Compare February 3, 2017 18:46
ignore --debuginfo if not emitting text, as wasm binaries don't support that yet
@kripken kripken force-pushed the asm2wasm-debuginfo branch from 23c5752 to 55b7289 Compare February 3, 2017 19:51
@kripken kripken force-pushed the asm2wasm-debuginfo branch from ff420e1 to 3e75a78 Compare February 3, 2017 21:07
@kripken
Copy link
Member Author

kripken commented Feb 3, 2017

This should be good now, if anyone wants to review.

src/asm2wasm.h Outdated
if (out + 100 >= end) {
Fatal() << "error in handling debug info";
}
if (input[0] == '/' && input[1] == '/' && input[2] == '@' && input[3] == 'l' && input[4] == 'i' && input[5] == 'n' && input[6] == 'e') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about startsWith(input, "//@line") ? Providing startsWith in a utility header.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixing. I worry it will be less efficient, but I guess a modern compiler should be able to do this well...

src/asm2wasm.h Outdated
*out++ = *input++;
*out++ = *input++;
*out++ = *input++;
*out++ = *input++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memcpy?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

src/asm2wasm.h Outdated
// before it - this is guaranteed to be correct without opts,
// and is usually decently accurate with them.
auto size = strlen(input);
auto upperBound = Index(size * 1.25) + 100;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Magic numbers are magic :)

Does this 100 match the 100 a few lines down?
Would be good to replace these with named constants for clarity.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, good point, fixing.

src/asm2wasm.h Outdated
@@ -565,6 +649,16 @@ class Asm2WasmBuilder {
}

Function* processFunction(Ref ast);

public:
CallImport* isDebugInfo(Expression* curr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is___ makes me thing "boolean". Better name would be just debugInfo, which can still be used in a boolean context.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, yeah. Changing to checkDebugInfo which is consistent with other stuff in wasm.h.

@@ -19,6 +19,7 @@

#include <atomic>
#include <cassert>
#include <cstdlib>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unused

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's needed for abort, this just wasn't noticed before because other locations happened to have the include anyhow, I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense

@kripken
Copy link
Member Author

kripken commented Feb 6, 2017

Thanks for the feedback, added a commit with fixes.

@@ -19,6 +19,7 @@

#include <atomic>
#include <cassert>
#include <cstdlib>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, makes sense

src/asm2wasm.h Outdated
std::string DEBUGINFO_INTRINSIC = EMSCRIPTEN_DEBUGINFO.str;
auto DEBUGINFO_INTRINSIC_SIZE = DEBUGINFO_INTRINSIC.size();
bool seenUseAsm = false;
auto startsWith = [&](const char *s) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make this a top-level function as opposed to a lambda

  1. when reading this code it can be assumed that startsWith exists elsewhere, so it doesn't interrupt the flow of reading this function
  2. the way its currently implemented a better name would be inputAtTheCurrentPositionStartsWith("foo"). I feel startsWith(input, "foo") is more explicit at the call site and requires less context for the reader to keep in their head

Both should be inlineable by the compiler so it should be equivalent to the unrolled version, either way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair points, but which toplevel? Of the object? Or the entire file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my head it's a top-level function that requires no state, so of the entire file. In theory it could be put in a utils header and shared.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, refactored to top level in the file

src/asm2wasm.h Outdated
out += line.size();
*out++ = ')';
*out++ = ';';
} else if (!seenUseAsm && (startsWith("asm'") || startsWith("asm\""))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could theoretically be optimized by the compiler to be equivalent to what came before, but if (!seenUseAsm && startsWith(input, "asm") && (input[3] == '\'' || input[3] == '"')) is a possible rewrite that should almost-certainly be the same, while being I think equally legible. (I feel ambivalent about this either way though)

// before it - this is guaranteed to be correct without opts,
// and is usually decently accurate with them.
const auto SCALE_FACTOR = 1.25; // an upper bound on how much more space we need as a multiple of the original
const auto ADD_FACTOR = 100; // an upper bound on how much we write for each debug info element itself
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments are pretty helpful, thanks!

src/asm2wasm.h Outdated
static bool startsWith(const char* string, const char *prefix) {
while (1) {
if (*prefix == 0) return true;
if (*string++ != *prefix++) return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also probably return false if *string == 0 for completeness, e.g. startsWith("foo", "foobar")

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixing.

Copy link
Contributor

@jgravelle-google jgravelle-google left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modulo nits, lgtm!

src/asm2wasm.h Outdated
} else if (!seenUseAsm && (startsWith(input, "asm'") || startsWith(input, "asm\""))) {
// end of "use asm" or "almost asm"
seenUseAsm = true;
memcpy(out, input, 5);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably worth pulling the 5s into a constant?
Actually yeah it's not obvious why 5 and not 4, presumably consuming asm\'\n?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, it's actually the semicolon at the end, "use asm";. I'll clarify.

strcpy(out, line.c_str());
out += line.size();
*out++ = ')';
*out++ = ';';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing to do here just a thought/comment/observation:
This interleaving section of strcpy and *out++ = char is pretty verbose and I wish there was a cleaner/more compact way to do it.
Something like outStream << DEBUGINFO_INTRINSIC << '(' << fileIndex << ',' << line << ");", though that would require a very different model of how this is done and probably isn't worth it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, might be a better way. The code is not expected to change much if at all, though, so I think it's maintainable as it is. If it were evolving code definitely it would need to be refactored.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants