Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Include terminators in instance size estimate #86777

Merged
merged 1 commit into from
Jul 2, 2021

Conversation

tmiasko
Copy link
Contributor

@tmiasko tmiasko commented Jul 1, 2021

For example, drop glue generated for struct below, doesn't have any
statements, only terminators. Previously it received an estimate of 0,
the new estimate is 13 (6+5 drop terminators, +1 resume, +1 return).

struct S {
    a: String,
    b: String,
    c: String,
    d: String,
    e: String,
    f: String,
}

Originally reported in #69382 (comment)

For example, drop glue generated for struct below, doesn't have any
statements, only terminators. Previously it received an estimate of 0,
the new estimate is 13 (6+5 drop terminators, +1 resume, +1 return).

struct S {
    a: String,
    b: String,
    c: String,
    d: String,
    e: String,
    f: String,
}

Originally reported in rust-lang#69382 (comment)
@rust-highfive
Copy link
Collaborator

r? @estebank

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jul 1, 2021
@estebank
Copy link
Contributor

estebank commented Jul 1, 2021

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Jul 1, 2021

📌 Commit 5621987 has been approved by estebank

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jul 1, 2021
JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jul 1, 2021

Unverified

This commit is not signed, but one or more authors requires that any commit attributed to them is signed.
…ebank

Include terminators in instance size estimate

For example, drop glue generated for struct below, doesn't have any
statements, only terminators. Previously it received an estimate of 0,
the new estimate is 13 (6+5 drop terminators, +1 resume, +1 return).

```rust
struct S {
    a: String,
    b: String,
    c: String,
    d: String,
    e: String,
    f: String,
}
```

Originally reported in rust-lang#69382 (comment)
@wesleywiser
Copy link
Member

We should probably not roll this up as this can affect CGU partitioning and show up on perf.rlo.

@bors rollup=never

@bors
Copy link
Contributor

bors commented Jul 2, 2021

⌛ Testing commit 5621987 with merge 7a9ff74...

@bors
Copy link
Contributor

bors commented Jul 2, 2021

☀️ Test successful - checks-actions
Approved by: estebank
Pushing 7a9ff74 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Jul 2, 2021
@bors bors merged commit 7a9ff74 into rust-lang:master Jul 2, 2021
@rustbot rustbot added this to the 1.55.0 milestone Jul 2, 2021
@tmiasko tmiasko deleted the estimate-terminators branch July 2, 2021 06:33
@rylev
Copy link
Member

rylev commented Jul 6, 2021

This change led to moderate performance regressions along with some performance gains.

For changes that impact CGU partitioning, we should probably be in the habit of running performance tests. This seemed to impact the deeply-nested-async benchmark which has the tendency to be more sensitive to changes like this.

As part of the performance triage process, I'm marking this as a performance regression. If you believe this performance regression to be justifiable or once you have an issue or PR that addresses this regression, please mark this PR with the label perf-regression-triaged.

@rustbot label +perf-regression

@the8472
Copy link
Member

the8472 commented Aug 5, 2021

The biggest impact is the regex-debug, full, wall-time measurement regressing by 20%. That has been persistent for a few weeks now, it's not spurious.

@Mark-Simulacrum
Copy link
Member

Agreed that the regression seems genuine. That said, it's a really weird regression, as instruction counts and cycles:u went down.

Not really sure how to investigate it further, given that... it also seems like the only benchmark that did that after this PR.

Going to think a little more on this.

@the8472
Copy link
Member

the8472 commented Sep 17, 2021

That said, it's a really weird regression, as instruction counts and cycles:u went down.

Maybe -j1 isn't making things entirely sequential? So if there was some parallel work going on that now becomes more sequential due to CGUs being shuffled around then this would show up as trading a decrease in cycles (less concurrency overhead) for increased wall time.

Maybe perf should be changed to run the benchmarks confined to a single core and unwanted threading would show up as more context switches.

@Mark-Simulacrum
Copy link
Member

There's no setting of -j1 for benchmarks on perf today, LLVM is definitely parallel on most of them.

@the8472
Copy link
Member

the8472 commented Sep 17, 2021

I assumed it would use -j1 to reduce noise. Well, that makes it even easier then, CGU partitioning likely got worse in that case.

@Mark-Simulacrum
Copy link
Member

Mark-Simulacrum commented Sep 18, 2021

You're correct. It took me way too long to get to this point -- we should figure out a way to draw this directly on perf.r-l.o and with a nicer comparison view -- but, see the optimize + codegen threads below. These aren't entirely to scale -- the bottom one takes 2x as long, roughly -- but the important point is that the long tail CGU there is scheduled way later than before, so even though the module sizes themselves haven't changed that much from run to run (in fact, on average they're more evenly distributed), that ends up really hurting overall end-to-end compilation time.

Regardless, this is just a scheduling issue, and is not directly actionable at this time. Better estimates -- like this PR -- are probably still good, even if they hurt us temporarily, particularly when it's limited to just one benchmark regressing in practice (with new significance algorithm we're much better at hiding insignificant noise - https://perf.rust-lang.org/compare.html?calcNewSig=true&start=56dee7c49ecdec4c2c9eccc6ff966cf58847bda6&end=7a9ff746fe20a38a3adc0ac65e1789f6e4b099ad&stat=wall-time)

@rustbot label +perf-regression-triaged

image

Edit: for context, these graphs are taken from the crox downloads of self-profile data and then screenshotted for combining in gimp.

@rustbot rustbot added the perf-regression-triaged The performance regression has been triaged. label Sep 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants