Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Go-specific hacks to fix demo.go #197

Merged
merged 4 commits into from
Apr 26, 2022
Merged

Conversation

cgaebel
Copy link
Contributor

@cgaebel cgaebel commented Apr 26, 2022

See the comment in trace_writer.ml for more details about what this entails. The gist of what I've done here is review every call to gogo within demo.go and make sure the stack returns to the right spot.

This definitely fixes demo.go, but I'm not sure if it's sufficient for full-fledged go programs. @prattmic, could you please try out the artifact built by this CI run and let me know if it works on more a non-trivial go program you have lying around?

Refs #100, but does not fix it.

Here's a copy of the trace file generated by this PR for the go demo.

And here's a screenshot of it:

Screen Shot 2022-04-25 at 10 18 46 PM

@cgaebel cgaebel requested a review from Xyene April 26, 2022 02:25
@cgaebel cgaebel added this to the v1.1.0 milestone Apr 26, 2022
I want to trace its full execution, instead of attaching to a subset of
it. The constant loop is not very interesting compared to a hello world,
because its control flow is too simple.
This one function broke syntax highlighting in vscode, so I'm
quarantining it.
This is to keep them separate and distinct from other language-specific
hacks we must add.

Also, tweak end-of-thread to mark the last decode error time (misnamed,
should really be called something like "where to mark inferred start
times as starting from") and clear out the current callstack. We'll need
this in upcoming golang hacks.
See the comment in trace_writer.ml for more details about what this
entails. The gist of what I've done here is review every call to
`gogo` within demo.go and make sure the stack returns to the right
spot.

Fixes janestreet#100
Copy link
Member

@Xyene Xyene left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

@prattmic
Copy link

Oh wow, thanks for working on this! This is much better, but I think things are getting a bit confused. I took a look at a trace of gofmt, the autoformatter, which I'd call a medium size program. You can build it yourself with the proper dwarf flags:

$ go build -ldflags=-compressdwarf=false -o /tmp/gofmt cmd/gofmt

And point at a directory with a bunch of Go files. I used:

$ _build/default/bin/magic_trace_bin.exe run -multi-thread /tmp/gofmt -- $(go env GOROOT)/src/runtime/*.go

The trace generally looks much better than before this PR, but I still see some mixups.

  1. Extra frames above new goroutine:

image

Here in "main 4", we switch from a goroutine running go/printer code to a GC worker running runtime.gcBgMarkWorker [1]. Notice that go/printer.(*printer).block appears as a caller of runtime.gcBgMarkWorker, which is not the case. runtime.gcBgMarkWorker should be the top frame.

Interestingly, sometimes these extra frames are dropped later. e.g., here we lose the go/printer.(*printer).block frame after a call to runtime.systemstack.abi0.

image

Here is a different case that looks much worse:

image

I think this is actually the same problem as above though. gofmt starts one main.processFile goroutine per input file, so I think we are just seeing extra frames added each time we switch between those goroutines.

These are just my initial impressions, I still want to take a closer look and perhaps hack on the heuristics to see what I can do. One case I want to check out is asynchronous preemption, which I worry may be broken given that I see the code looking for specific callees after gogo.

[1] N.B. Though this is part of the GC, it is just a normal goroutine, nothing special.

@cgaebel
Copy link
Contributor Author

cgaebel commented Apr 26, 2022

Thanks for trying this out, and the tip to try out gofmt. Despite being imperfect, this PR is a clear improvement on the status quo, so I'm going to merge it as-is.

Next, I'll let you take a whack at improving the heuristic. Hopefully this PR can act as a guide to what you'd need to change and where. Let me know if you give up and I'll come back to this.

I recommend making reduced perf scripts tests as you work. See test/perf_script.mli for how to do so, and the test I added in this feature as an example of one. They drastically speed up iteration speed when trying out different hacks. And you get a free regression test out of it!

Depending on what you learn as you hack on this, you may learn that a heuristic is impossible. For example, we know that's true for certain classes of exception in OCaml. For OCaml, we're going to have the compiler generate a guide of where exceptions jump to in a binary's elf notes. I would not be against adding a similar level of integration with the go compiler.

@cgaebel cgaebel merged commit 285788a into janestreet:master Apr 26, 2022
@cgaebel
Copy link
Contributor Author

cgaebel commented Apr 26, 2022

Oh, I just thought of another hack that might work:

When gogo jumps somewhere else, close stack frames until either hitting the bottom of the stack or the function you're jumping into.

@prattmic

@prattmic
Copy link

Next, I'll let you take a whack at improving the heuristic. Hopefully this PR can act as a guide to what you'd need to change and where. Let me know if you give up and I'll come back to this.

Sounds good. I'm about to go on vacation for a few weeks, but hope to get back to this at some point.

@cgaebel
Copy link
Contributor Author

cgaebel commented Apr 26, 2022

Ok. I'll probably get around to trying the alternative hack in that time. I'll let you know how that goes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants