Experiment: Create timing report. #7311

ehuss · 2019-08-30T00:09:41Z

This is just an experiment, so I'm not sure if we'll want to merge it.

This adds an HTML report which gets saved to disk when the build is finished. It is primarily geared for identifying slow dependencies, and for visualizing how pipelining affects the build.

Here's an example: https://ehuss.github.io/cargo-timing.html
You can mouse over the blocks to highlight the reverse-dependencies that are released when a unit finishes. syn is a really good example.

It does a few other things, like displaying a message after each unit is finished. See the docs for more information.

rust-highfive · 2019-08-30T00:09:51Z

r? @nrc

(rust_highfive has picked a reviewer for you, use r? to override)

alexcrichton

Wanted to say first off that this is awesome to play with. I'm looking forward to iterating on this to a point where users can easily know what the bottleneck of their compilation graph is and easily diagnose performance issues.

Some miscellaneous thoughts reading your generated HTML report:

- For Cargo, and I suspect other projects, this is a huge amount of information to take in in terms of dependencies. I wonder if we could perhaps hide most units by default? For example, if things took less than 100ms they don't necessarily need to be shown (although having them optionally shown would still be good). I find the huge waterfall sort of hard to take in because there's lots of noise to sift through to find what you want, so either hiding short units by default or making the cutoff dynamically tunable would be great.
- Could the output of rustc --version be included in the summary?
- How come only some of the dependency edges showed up when you hover over things? In your example graph, regex-syntax doesn't appear to unlock anything. Ah, I guess it's because something else was the last thing regex was waiting on.
- Printing out a "critical path" would be pretty neat for easy diagnosis.
- Have you checked around to see if there's any sort of open formats for this sort of data? We've had success taking perf output and throwing it into Firefox's flame graphs on perf-html.io for example, and I think timing in the compiler has tried to use Chrome's viewer for rendering this data. Basically I'm curious if we could also massage this into an output that's usable in a viewer maintained by others which may be a bit more powerful for exploring the data after you capture a run. I'm largely just curious if we can ping the compiler team to see if this sort of data can be imported into the Chrome viewer.
- We should perhaps enable rmeta notifications for everything in -Z timings so we can track codegen time for all units, not just those pipelined? (Or maybe document a disclaimer somewhere.)
- It might be good to have a blurb in the unstable documentation about how to read the generated HTML from Cargo.
- Wow, libgit2-sys and cc take way too long to compile. I'm going to try to fix that.
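
As a rough illustration of the "hide units under a threshold" idea above, here is a minimal sketch. It is not taken from this PR: `UnitTime` and `visible_units` are hypothetical names, and the durations in `main` are made up.

```rust
use std::time::Duration;

/// Hypothetical record of one compiled unit; not Cargo's internal type.
#[derive(Debug, Clone, PartialEq)]
struct UnitTime {
    name: String,
    duration: Duration,
}

/// Returns the units whose duration is at least `min`, slowest first.
fn visible_units(units: &[UnitTime], min: Duration) -> Vec<UnitTime> {
    let mut shown: Vec<UnitTime> = units
        .iter()
        .filter(|u| u.duration >= min)
        .cloned()
        .collect();
    // Sort descending by duration so the slowest units lead the report.
    shown.sort_by(|a, b| b.duration.cmp(&a.duration));
    shown
}

fn main() {
    // Made-up timings, for illustration only.
    let units = vec![
        UnitTime { name: "syn".into(), duration: Duration::from_millis(9_500) },
        UnitTime { name: "cfg-if".into(), duration: Duration::from_millis(40) },
        UnitTime { name: "libgit2-sys".into(), duration: Duration::from_millis(31_000) },
    ];
    // Hide everything that took less than 100ms.
    for u in visible_units(&units, Duration::from_millis(100)) {
        println!("{:>8.1}s  {}", u.duration.as_secs_f64(), u.name);
    }
}
```

Keeping the cutoff as a parameter rather than a constant matches the "dynamically tunable" suggestion: a viewer could re-run the filter whenever the user moves a slider.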

alexcrichton · 2019-09-03T20:44:18Z

src/cargo/core/compiler/job_queue.rs

                        cx.bcx
                            .config
                            .shell()
                            .verbose(|c| c.status("Running", &cmd))?;
+                        self.timings.unit_start(id, self.active[&id]);


Technically for the highest fidelity of timing information we should push this unit_start to just before the process is actually spawned on the worker thread itself (using internal synchronization in self.timings). It's probably not worth doing that just yet though since we're already getting what appears to be pretty accurate timing information.

src/doc/src/reference/unstable.md

src/cargo/core/compiler/timings.rs

bors · 2019-09-03T22:02:47Z

☔ The latest upstream changes (presumably #7216) made this pull request unmergeable. Please resolve the merge conflicts.

alexcrichton · 2019-09-05T21:34:21Z

> Have you checked around to see if there's any sort of open formats for this sort of data?

FWIW I was curious so I was poking around with this today. I was thinking of about:tracing and Chrome's tracing format, but I got some pretty disappointing graphs out of it and it didn't really give me (by default at least) any tools to work with. I wasn't able to really learn much from this output.

I also read over Firefox's format and didn't dive too deep into it but figured it wouldn't work.

All in all I don't think we'll get much use out of trying to use an open format.

One idea I had though was inspired by this post where it'd be ideal if we could flatten this whole graph to see a more linear sequence of events rather than a huge waterfall. We could ideally say what core each process started running on and associate that with rows, so everything would be a bit more compact and you might be able to more visually see idle parts of the build.

If you want to get crazy it'd be pretty sweet to overlay "what's currently building" along with CPU usage over time so we could get an idea of when the CPU is idle, what's building, and what it's bottlenecked on.

To reiterate though there's tons of directions we could take it, and landing anything is better than pontificating about how we could improve it, so I'd lean on the side of landing what we have vs wondering how to implement more fancy features.

ehuss · 2019-09-09T16:32:58Z

> How come only some of the dependency edges showed up when you hover over things?

Right, for the regex-syntax case, regex was still waiting on aho-corasick. I didn't want to show all edges because the number of lines would be overwhelming. It's already a bit of a mess.

> Printing out a "critical path" would be pretty neat for easy diagnosis.

What does this mean? Can you give an example of what it would say? It seems like there are many, intertwined critical paths, so I'm not sure what it would mean. Hide the units without edges?

Also, some of the critical paths are just a happenstance of timing. If cargo randomly picks a different unit from the waiting set, the critical path may end up looking very different.

> Have you checked around to see if there's any sort of open formats for this sort of data?

I looked around a little, and I didn't see anything that would be usable. If anyone has any ideas, I'd like to try them. I also considered using one of the javascript graphing libraries, but most of them didn't fit, and I wanted to avoid heavy dependencies. One thought I had was to add the JSON output and then allow other tools to process and display it. I could also move the actual HTML part of this PR to a separate repo, but I figured its utility would be severely reduced.

> flatten this whole graph

My original version was exactly this (it had one row per cpu). It had some drawbacks. You can't visualize how the dependency edges work (particularly for pipelining). It also ends up just being a solid block of boxes until the end when there is 1 or 2 left. I had mouseover tooltips to help show more information, but it just didn't seem that useful.
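
For readers unfamiliar with the flattened layout being discussed, the sketch below shows one common way to pack timed units into rows: sort by start time and place each unit on the first row that is free. This is an assumption about how such a view could be built, not the code from the earlier prototype; `Span` and `assign_rows` are hypothetical names, and rows here are display lanes rather than actual CPU cores.

```rust
/// Hypothetical unit interval, in seconds since the build started.
#[derive(Debug, Clone)]
struct Span {
    name: &'static str,
    start: f64,
    end: f64,
}

/// Assigns each span to the first row that is free at its start time.
/// Returns (row index, span) pairs in start order.
fn assign_rows(mut spans: Vec<Span>) -> Vec<(usize, Span)> {
    spans.sort_by(|a, b| a.start.total_cmp(&b.start));
    // row_free_at[i] is the time at which row i becomes available again.
    let mut row_free_at: Vec<f64> = Vec::new();
    let mut placed = Vec::new();
    for span in spans {
        let row = match row_free_at.iter().position(|&free| free <= span.start) {
            Some(r) => r,
            None => {
                // No existing row is free, so open a new one.
                row_free_at.push(0.0);
                row_free_at.len() - 1
            }
        };
        row_free_at[row] = span.end;
        placed.push((row, span));
    }
    placed
}

fn main() {
    // Made-up intervals, for illustration only.
    let spans = vec![
        Span { name: "a", start: 0.0, end: 3.0 },
        Span { name: "b", start: 0.0, end: 1.0 },
        Span { name: "c", start: 1.0, end: 2.0 },
        Span { name: "d", start: 3.0, end: 4.0 },
    ];
    for (row, span) in assign_rows(spans) {
        println!("row {}: {} [{}s, {}s)", row, span.name, span.start, span.end);
    }
}
```

The layout is compact, but as noted above it carries no information about which unit unblocked which, so dependency edges cannot be read from it.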

The alternative I intended was that the graph at the bottom tells you whether or not the build is maximizing concurrency (via the green line). But it is a bit difficult to visualize. I also had a mouseover tooltip on the graph line to show which units were currently active, but I didn't finish it and took it out to get earlier feedback on whether this is generally useful.

I'm happy to continue discussing ideas for what to display and more interactive features. I just wanted to see if this is something we'd like to add at all. I'd like to focus on actionable information, that is, what decisions and changes a user would make after reading this information. Seeing the graphs can be "nifty", but if it doesn't help make changes, there isn't much value. And the user does not have very many options (removing dependencies, splitting libraries into smaller pieces, removing features, etc.).

alexcrichton · 2019-09-09T19:59:13Z

> I just wanted to see if this is something we'd like to add at all. I'd like to focus on actionable information, that is, what decisions and changes a user would make after reading this information.

I personally very much agree with this, and I should probably frame my comments more in terms of how I think this sort of information will be interpreted. When I personally analyze graphs like this I'm looking for a few things:

- Where is there idle CPU parallelism to be taken advantage of? For example, is a dependency compiled very late in the dependency graph when it actually should be built much earlier? For example, im-rc is built super late in Cargo's graph, but that's only because its build script depends on rustc_version, which depends on serde, which has its own big dependency tree. By removing the dependency on rustc_version we could shift most of this bottlenecking build time from the end of the build to idle CPU in the middle.

  This is what I was getting at with a more flat view where you might be able to more visually see what's happening for parallelism during the build. We ideally want "entirely full until the end" but most projects don't end up with that sort of graph right now.
- What dependencies take a particularly long time in rustc? These are good candidates for splitting up, trying to move more to monomorphization, etc. In general long-compiling crates should have as few dependencies as possible and be scheduled as early in the build as possible. Or in other words, a build should just minimize the number of "big crates" it has.
- Why did it take so long to get my final crate to start compiling? What was the blocker to getting it running, aka what dependency has the chain that took the longest to compile? This is sort of what I was getting at with the critical path. I definitely agree that the critical path isn't the same between builds and is pretty nondeterministic, but I suspect many crate graphs have a pretty deterministic critical path, and raising awareness about that seems pretty useful. For example, Cargo's critical path largely included cc-rs -> libgit2-sys -> git2 -> git2-curl -> cargo, and that's because libgit2-sys took so darned long to compile since it wasn't parallel.
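
To make the "critical path" notion concrete, here is a minimal sketch that computes the longest dependency chain by build time. It assumes unlimited parallelism and ignores pipelining and scheduling order, so it is a static approximation rather than what Cargo actually observed. `Unit`, `finish_time`, `critical_path`, and `sample_graph` are hypothetical names, and the durations are made up; only the chain of crate names comes from the comment above.

```rust
use std::collections::HashMap;

/// Hypothetical unit description: build time in seconds plus direct dependencies.
struct Unit {
    secs: f64,
    deps: Vec<&'static str>,
}

/// Earliest possible finish time of `name`, assuming unlimited parallelism.
/// Memoizes results in `finish` and records the slowest dependency in `via`.
/// Assumes the graph is acyclic; a cycle would recurse forever.
fn finish_time(
    name: &'static str,
    graph: &HashMap<&'static str, Unit>,
    finish: &mut HashMap<&'static str, f64>,
    via: &mut HashMap<&'static str, &'static str>,
) -> f64 {
    if let Some(&t) = finish.get(name) {
        return t;
    }
    let unit = &graph[name];
    let mut ready = 0.0;
    for &dep in &unit.deps {
        let t = finish_time(dep, graph, finish, via);
        if t > ready {
            ready = t;
            via.insert(name, dep);
        }
    }
    let done = ready + unit.secs;
    finish.insert(name, done);
    done
}

/// Returns the chain of units (first to last) that determines when `root` finishes,
/// together with that finish time.
fn critical_path(
    root: &'static str,
    graph: &HashMap<&'static str, Unit>,
) -> (Vec<&'static str>, f64) {
    let mut finish = HashMap::new();
    let mut via = HashMap::new();
    let total = finish_time(root, graph, &mut finish, &mut via);
    let mut path = vec![root];
    let mut cur = root;
    while let Some(&prev) = via.get(cur) {
        path.push(prev);
        cur = prev;
    }
    path.reverse();
    (path, total)
}

/// Made-up durations; the crate names follow the example chain mentioned above.
fn sample_graph() -> HashMap<&'static str, Unit> {
    let mut g = HashMap::new();
    g.insert("cc", Unit { secs: 5.0, deps: vec![] });
    g.insert("libgit2-sys", Unit { secs: 30.0, deps: vec!["cc"] });
    g.insert("git2", Unit { secs: 8.0, deps: vec!["libgit2-sys"] });
    g.insert("git2-curl", Unit { secs: 2.0, deps: vec!["git2"] });
    g.insert("serde", Unit { secs: 10.0, deps: vec![] });
    g.insert("cargo", Unit { secs: 40.0, deps: vec!["git2-curl", "serde"] });
    g
}

fn main() {
    let (path, total) = critical_path("cargo", &sample_graph());
    println!("{} ({}s)", path.join(" -> "), total);
}
```

Because the result depends only on durations and edges, it stays stable between runs even when the scheduler happens to pick units in a different order, which is the property argued for above.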

I think that the current implementation you've got here is good enough for most of these metrics above. The reason I was curious about open source tools is that exploring the current graph is pretty difficult. For example my browser (Firefox, maybe it's buggy?) gets super janky when I'm scrolling around the giant SVG. Additionally it's difficult to see what unlocked a dependency (the line drawn into it) as well as what it unlocked because the units it connects to may be either above or below the fold. These aren't really things I expect us to fix, but it'd be cool if there was an open viewer for this that already solved these problems, but I wasn't quite able to find one!

FWIW I think it's also worth pointing out that the green line is an approximation of parallelism but it doesn't take into account internal parallelism in rustc (because it can't, really). That may not matter a huge amount for most projects though.
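
A line like the green one can be derived from unit start and end times with a simple sweep, sketched below. This is an illustration under assumptions, not the PR's implementation: `concurrency` is a hypothetical function, the intervals are made up, and, as noted above, counting active units says nothing about parallelism inside each rustc process.

```rust
/// Returns one (time, active unit count) sample per distinct event time.
/// `intervals` holds hypothetical (start, end) times in seconds.
fn concurrency(intervals: &[(f64, f64)]) -> Vec<(f64, i32)> {
    let mut events: Vec<(f64, i32)> = Vec::new();
    for &(start, end) in intervals {
        events.push((start, 1));
        events.push((end, -1));
    }
    // Sort by time; at equal times, ends (-1) are processed before starts (+1).
    events.sort_by(|a, b| a.0.total_cmp(&b.0).then(a.1.cmp(&b.1)));

    let mut active = 0;
    let mut steps: Vec<(f64, i32)> = Vec::new();
    for (time, delta) in events {
        active += delta;
        // Merge events that share a timestamp into a single sample.
        if let Some(last) = steps.last_mut() {
            if last.0 == time {
                last.1 = active;
                continue;
            }
        }
        steps.push((time, active));
    }
    steps
}

fn main() {
    // Made-up intervals, for illustration only.
    let steps = concurrency(&[(0.0, 3.0), (0.0, 1.0), (1.0, 2.0), (3.0, 4.0)]);
    for (time, active) in &steps {
        println!("t={}s active={}", time, active);
    }
    let peak = steps.iter().map(|s| s.1).max().unwrap_or(0);
    println!("peak concurrency: {}", peak);
}
```

Comparing the peak and the shape of this step function against the job limit shows where the build leaves slots idle.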

Overall I'd be fine merging this basically as-is with a few small improvements in the inline comments above. It's pretty low-risk in terms of maintenance and if we ever want to remove it it's not that hard to remove.

bors · 2019-09-11T00:43:32Z

☔ The latest upstream changes (presumably #7351) made this pull request unmergeable. Please resolve the merge conflicts.

ehuss · 2019-09-17T18:01:06Z

I switched to using canvas which should be faster. Let me know if it has any problems. It also allows interactive controls, and I added a few as examples.

alexcrichton

Just a few stylistic nits, but otherwise I think this is basically ready to merge.

One thing I'd be curious to do is to overlay the "# Units" graph with CPU usage of the host system during that time (graphs like this) so we could get an idea of what the CPU usage is as well, knowing that if the number of active units doesn't shoot up you might still actually be using all the parallelism you've got locally (or not, depends). I wouldn't mind trying to whip that up after this lands though

src/cargo/core/compiler/job_queue.rs

src/cargo/core/compiler/timings.rs

src/cargo/core/compiler/mod.rs

src/doc/src/reference/unstable.md

alexcrichton · 2019-09-17T19:35:43Z

Also I can confirm the canvas is indeed speedy, and the touch of coloring build script executions a different color was nice!

src/cargo/core/compiler/timings.rs

ehuss · 2019-09-17T20:28:43Z

Updated with review comments. Thanks!

> CPU usage of the host system

I'd love to see that. I was reluctant to add a dependency for this, since I suspect getting it to compile and work on all host platforms will be tricky.

I also experimented with changing jobserver to report what it thinks the current concurrency is, which could be more useful with parallel-rustc. That's something we could also consider for later.

alexcrichton · 2019-09-17T20:33:12Z

@bors: r+

bors · 2019-09-17T20:33:13Z

📌 Commit 8be10f7 has been approved by alexcrichton

bors · 2019-09-17T20:33:21Z

⌛ Testing commit 8be10f7 with merge d764fff...

bors · 2019-09-17T20:58:05Z

☀️ Test successful - checks-azure
Approved by: alexcrichton
Pushing d764fff to master...

csmoe · 2019-09-21T05:45:24Z

@ehuss the JS renderer doesn't seem happy with a huge project (~550 deps)

est31 · 2019-09-21T11:10:30Z

@csmoe this sounds similar to #7388 and #7399. Can you check whether it works with #7397 applied?

csmoe · 2019-09-21T11:15:48Z

@est31 thank you, rebuilding :)

luser · 2019-10-23T13:42:28Z

FYI, I know I had mentioned this before, but I have an old branch lying around where I hacked cargo to output profiling info in Chrome's tracing format:
luser@a241a1f

There are a zillion formats for profiling/tracing info like this unfortunately. The nice thing about supporting someone else's format is that you can use other tools to explore it like Chrome's chrome://tracing, Firefox's profiler.firefox.com, or SpeedScope. Speedscope's docs mention a whole pile of formats it supports. That list includes flamescope which is a Rust crate that can output Speedscope's preferred format, so it ought to be possible to refactor that crate such that you could use it in cargo to generate that output.
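
For context, Chrome's tracing viewer accepts a JSON array of events, where a "complete" event has `"ph": "X"` plus `ts` and `dur` in microseconds. The sketch below renders unit timings in that shape. It is not the code from the branch mentioned above: `TraceUnit`, `json_escape`, and `to_chrome_trace` are hypothetical names, the sample data is made up, and mapping display rows to `tid` is an assumption.

```rust
/// Hypothetical unit timing: name, display row, start and duration in microseconds.
struct TraceUnit {
    name: String,
    row: u32,
    start_us: u64,
    dur_us: u64,
}

/// Escapes the characters that may not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders units as a JSON array of "complete" (`ph: "X"`) trace events.
fn to_chrome_trace(units: &[TraceUnit]) -> String {
    let events: Vec<String> = units
        .iter()
        .map(|u| {
            format!(
                "{{\"name\":\"{}\",\"ph\":\"X\",\"ts\":{},\"dur\":{},\"pid\":1,\"tid\":{}}}",
                json_escape(&u.name),
                u.start_us,
                u.dur_us,
                u.row
            )
        })
        .collect();
    format!("[{}]", events.join(","))
}

fn main() {
    // Made-up timings, for illustration only.
    let units = vec![
        TraceUnit { name: "syn".into(), row: 0, start_us: 0, dur_us: 9_500_000 },
        TraceUnit { name: "cc".into(), row: 1, start_us: 0, dur_us: 5_000_000 },
    ];
    println!("{}", to_chrome_trace(&units));
}
```

The string is built by hand only to keep the example dependency-free; a real implementation would more likely use a JSON serializer.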

rust-highfive assigned nrc Aug 30, 2019

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 30, 2019

alexcrichton reviewed Sep 3, 2019

ehuss force-pushed the pipeline-timing branch from a4a551e to cf211e0 Compare September 3, 2019 22:24

ehuss added 11 commits September 14, 2019 09:23

Experiment: Create timing report.

0664484

Add rustc info to timing output.

aca3274

Move hardlink_or_copy to a common location so it can be reused.

674150e

Make the cargo-timing.html filename unique per run.

6f353b5

Add some asserts.

06ed7a4

Remove format!

da07061

Make timings optional.

8bfae2d

Always emit rmeta when timings are enable to visualize codegen everyw…

911a9b0

…here.

Give build scripts a different color.

095f154

Switch rendering to canvas.

77a47b3

Also add some more features.

Update docs.

0df0595

ehuss force-pushed the pipeline-timing branch from cf211e0 to 0df0595 Compare September 17, 2019 17:56

alexcrichton reviewed Sep 17, 2019

ehuss added 4 commits September 17, 2019 12:48

Remove Option from Timings.

6c6aa97

Style update.

aae1416

Move timings check to rmeta_required.

e913efe

Set min slider to step 0.1 seconds.

8be10f7

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 17, 2019

bors merged commit 8be10f7 into rust-lang:master Sep 17, 2019

This was referenced Sep 17, 2019

Added ability to crosscompile doctests #6892

Merged

Support for named profiles (RFC 2678) #6989

Merged

ehuss mentioned this pull request Sep 21, 2019

Tracking issue for -Ztimings #7405

Closed

ehuss added this to the 1.39.0 milestone Feb 6, 2022

ehuss mentioned this pull request Jul 13, 2022

Tracking Issue for JSON timings #10857

Open

1 task
