Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline SyntaxContext in both encoded span representation. #98840

Merged
merged 2 commits into from
Sep 22, 2022

Conversation

cjgillot
Copy link
Contributor

@cjgillot cjgillot commented Jul 3, 2022

The current interned representation for spans does not use the ctxt_or_zero: u16 field. This PR proposes to use this field to store the SyntaxContext of the interned span instead. When ctxt_or_zero and the interned span's ctxt don't match, the inlined one takes precedence.

This allows to implement Span::ctxt and Span::with_ctxt with much less probability to access the interner. Those functions are used a lot for hygiene, so this may be worth it.

@rustbot rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Jul 3, 2022
@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 3, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 3, 2022
@bors
Copy link
Contributor

bors commented Jul 3, 2022

⌛ Trying commit 849d460578d5fd1c1161e9da8f07b6cb5f45b571 with merge f14fc23e45f90f90d8af18adab0747a52ec58aaa...

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Contributor

bors commented Jul 3, 2022

💔 Test failed - checks-actions

@bors bors added the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Jul 3, 2022
@cjgillot cjgillot force-pushed the span-inline-ctxt branch from 849d460 to ded911f Compare July 3, 2022 12:30
@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 3, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@bors
Copy link
Contributor

bors commented Jul 3, 2022

⌛ Trying commit ded911fc98d555f8c701e650dce721aa3c4edb48 with merge 6bd01fd732fde597d8ea22f2010c2e51a4613133...

@bors
Copy link
Contributor

bors commented Jul 3, 2022

☀️ Try build successful - checks-actions
Build commit: 6bd01fd732fde597d8ea22f2010c2e51a4613133 (6bd01fd732fde597d8ea22f2010c2e51a4613133)

@rust-timer
Copy link
Collaborator

Queued 6bd01fd732fde597d8ea22f2010c2e51a4613133 with parent b04bfb4, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (6bd01fd732fde597d8ea22f2010c2e51a4613133): comparison url.

Instruction count

  • Primary benchmarks: 🎉 relevant improvements found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
0.5% 0.5% 1
Improvements 🎉
(primary)
-0.4% -1.3% 69
Improvements 🎉
(secondary)
-0.6% -2.4% 41
All 😿🎉 (primary) -0.4% -1.3% 69

Max RSS (memory usage)

Results
  • Primary benchmarks: 🎉 relevant improvement found
  • Secondary benchmarks: 😿 relevant regression found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
3.8% 3.8% 1
Improvements 🎉
(primary)
-1.1% -1.1% 1
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) -1.1% -1.1% 1

Cycles

Results
  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-3.6% -4.0% 2
All 😿🎉 (primary) N/A N/A 0

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-perf Status: Waiting on a perf run to be completed. labels Jul 3, 2022
@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 3, 2022

r? @nnethercote

if ctxt2 < MAX_CTXT {
// Inline format or interned format with inline ctxt.
self.ctxt_or_max = ctxt2 as u16;
return self;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just self should suffice, no need for a return.

} else {
// Interned format.
let index =
with_span_interner(|interner| interner.intern(&SpanData { lo, hi, ctxt, parent }));
Span { base_or_index: index, len_or_tag: LEN_TAG, ctxt_or_zero: 0 }
let ctxt_or_max = if ctxt2 < MAX_CTXT { ctxt2 } else { MAX_CTXT } as u16;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_CTXT is now a misleading name -- it's actually one past the max context, because it's a special out-of-band value. Perhaps rename it INTERNED_CTXT? And likewise rename ctxt_or_max as ctxt_or_interned?

///
/// Interned format:
/// - `span.base_or_index == index` (indexes into the interner table)
/// - `span.len_or_tag == LEN_TAG` (high bit set, all other bits are zero)
/// - `span.ctxt == 0`
/// - `span.ctxt == span_data.ctxt` (must be < `MAX_CTXT`) or `MAX_CTXT` otherwise
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is now a bit unclear. Please expand this comment more by explaining that there are now effectively three cases: inline, interned-but-with-inline-ctxt, and interned.

Copy link
Contributor

@nnethercote nnethercote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You said "When ctxt_or_zero and the interned span's ctxt don't match, the inlined one takes precedence." Does this happen? If so, can it be avoided?

The perf results here are very good. Surprisingly so, I thought very little spans were being interned nowadays, mostly for the rare ones where the length exceeds 32 KiB. Do you know which of the fields is mostly often causing the interning?

@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 4, 2022

The second commit breaks the PartialEq and Hash impls for Span. I'll remove it from this PR.

@cjgillot cjgillot force-pushed the span-inline-ctxt branch from ded911f to 92c9b5a Compare July 4, 2022 07:40
@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 4, 2022

@bors try @rust-timer queue

@rust-timer
Copy link
Collaborator

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

@bors
Copy link
Contributor

bors commented Jul 4, 2022

⌛ Trying commit 92c9b5a02be525f6bc741f2d5e20305bd8319ea5 with merge a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1...

@cjgillot
Copy link
Contributor Author

cjgillot commented Jul 4, 2022

You said "When ctxt_or_zero and the interned span's ctxt don't match, the inlined one takes precedence." Does this happen? If so, can it be avoided?

This was added on purpose to implement with_ctxt. I removed it because it's buggy.
Now, this should not happen.

The perf results here are very good. Surprisingly so, I thought very little spans were being interned nowadays, mostly for the rare ones where the length exceeds 32 KiB. Do you know which of the fields is mostly often causing the interning?

I tried a while back to lengthen the len_or_tag field: #84290. This did not lead to measurable results. Since base_or_index is u32, the only compressed field is ctxt.

Before any conjecture, I'll wait on this new perf run on a not-obviously-buggy implementation.

@bors
Copy link
Contributor

bors commented Jul 4, 2022

☀️ Try build successful - checks-actions
Build commit: a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1 (a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1)

@rust-timer
Copy link
Collaborator

Queued a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1 with parent a5c6a48, future comparison URL.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1): comparison url.

Instruction count

  • Primary benchmarks: no relevant changes found
  • Secondary benchmarks: 🎉 relevant improvements found
mean1 max count2
Regressions 😿
(primary)
N/A N/A 0
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
-0.4% -1.1% 16
All 😿🎉 (primary) N/A N/A 0

Max RSS (memory usage)

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: 😿 relevant regressions found
mean1 max count2
Regressions 😿
(primary)
2.5% 2.5% 1
Regressions 😿
(secondary)
2.6% 2.8% 4
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) 2.5% 2.5% 1

Cycles

Results
  • Primary benchmarks: 😿 relevant regression found
  • Secondary benchmarks: no relevant changes found
mean1 max count2
Regressions 😿
(primary)
2.3% 2.3% 1
Regressions 😿
(secondary)
N/A N/A 0
Improvements 🎉
(primary)
N/A N/A 0
Improvements 🎉
(secondary)
N/A N/A 0
All 😿🎉 (primary) 2.3% 2.3% 1

If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf.

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Jul 4, 2022
@nnethercote
Copy link
Contributor

The new perf results are a smaller improvement than the old ones.

I'm still puzzled by the goodness of the original perf results. Two of the benchmarks that improved the most are keccak and cranelift-codegen-0.82.1. I did some ad hoc profiling, here is the distribution of calls to Span::new for cranelift-codegen, i.e. measuring creation of spans:

2448625 counts
(  1)  2447415 (100.0%,100.0%): i
(  2)       52 ( 0.0%,100.0%): I 1030124 1142673 0 None
(  3)       13 ( 0.0%,100.0%): I 18039207 18226193 0 None
(  4)        6 ( 0.0%,100.0%): I 2728213 2827861 0 None
(  5)        5 ( 0.0%,100.0%): I 2614338 2728211 0 None
(  6)        4 ( 0.0%,100.0%): I 4819422 4859743 0 None
(  7)        3 ( 0.0%,100.0%): I 1034177 1142673 0 None
(  8)        3 ( 0.0%,100.0%): I 1034480 1142641 0 None
(  9)        3 ( 0.0%,100.0%): I 1189728 1323468 0 None
( 10)        3 ( 0.0%,100.0%): I 1189864 1323468 0 None
( 11)        3 ( 0.0%,100.0%): I 1190422 1323454 0 None
( 12)        3 ( 0.0%,100.0%): I 1514964 1777150 0 None
( 13)        3 ( 0.0%,100.0%): I 1639111 1730662 0 None
( 14)        2 ( 0.0%,100.0%): I 1032858 1142673 0 None
( 15)        2 ( 0.0%,100.0%): I 1032984 1142673 0 None
...

i means "inline" and I means "interned". I.e. there are very few (1210) interned spans, and they're all interned due to len exceeding 15 bits.

Here is the distribution for Span::data_untracked, i.e. measuring use of spans:

7498928 counts
(  1)  7496902 (100.0%,100.0%): j
(  2)      879 ( 0.0%,100.0%): J 18337367 18412187 0 None
(  3)       51 ( 0.0%,100.0%): J 1030124 1142673 0 None
(  4)       39 ( 0.0%,100.0%): J 1189683 1323468 0 None
(  5)       36 ( 0.0%,100.0%): J 1190216 1323468 0 None
(  6)       25 ( 0.0%,100.0%): J 1515023 1777150 0 None
(  7)       23 ( 0.0%,100.0%): J 1503281 1777169 0 None
(  8)       23 ( 0.0%,100.0%): J 1639182 1730662 0 None
(  9)       21 ( 0.0%,100.0%): J 1032978 1142673 0 None
( 10)       19 ( 0.0%,100.0%): J 1514964 1777150 0 None
( 11)       19 ( 0.0%,100.0%): J 1515033 1777150 0 None
( 12)       19 ( 0.0%,100.0%): J 1639111 1730662 0 None
( 13)       16 ( 0.0%,100.0%): J 1034480 1142641 0 None
( 14)       16 ( 0.0%,100.0%): J 1190422 1323454 0 None
( 15)       12 ( 0.0%,100.0%): J 18039207 18226193 0 None
...

j means "inline" and J means "interned". Again, the number of accesses to interned spans (2026) is tiny.

For keccak, it's even more extreme. Span::new:

558110 counts
(  1)   558109 (100.0%,100.0%): i
(  2)        1 ( 0.0%,100.0%): I 547987 647635 0 None

And Span::data_untracked:

2589252 counts
(  1)  2589252 (100.0%,100.0%): j

I.e. there is a single non-interned span created and it is never accessed after creation.

I know that keccak is sensitive to minor changes in code generation for hot code relating to obligation predicates, so the benefits we saw there must be due to that somehow, rather than the span accesses themselves.

@apiraino
Copy link
Contributor

apiraino commented Aug 4, 2022

Hmm this PR is both waiting for review and on the author :) I'll switch back to the author since the perf. run received comments.

@rustbot ready

@apiraino apiraino removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 4, 2022
@cjgillot
Copy link
Contributor Author

@nnethercote the commit on which the first perf was run was buggy: we could get the wrong ctxt in some cases.

One complement to the explanation could be the effect on hygiene: Ident comparison only checks the ctxt, which would get a stronger effect than just span comparison.

@cjgillot cjgillot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Aug 10, 2022
Copy link
Contributor

@nnethercote nnethercote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the slow review response here. This is looking good, just a couple of comment fixes as mentioned above.

compiler/rustc_span/src/span_encoding.rs Outdated Show resolved Hide resolved
compiler/rustc_span/src/span_encoding.rs Outdated Show resolved Hide resolved
compiler/rustc_span/src/span_encoding.rs Outdated Show resolved Hide resolved
@wesleywiser
Copy link
Member

@nnethercote's review feedback has been addressed so I'm going to

@bors r+

@bors
Copy link
Contributor

bors commented Sep 22, 2022

📌 Commit 3e67bde has been approved by wesleywiser

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Sep 22, 2022
@bors
Copy link
Contributor

bors commented Sep 22, 2022

⌛ Testing commit 3e67bde with merge e7119a0...

@bors
Copy link
Contributor

bors commented Sep 22, 2022

☀️ Test successful - checks-actions
Approved by: wesleywiser
Pushing e7119a0 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 22, 2022
@bors bors merged commit e7119a0 into rust-lang:master Sep 22, 2022
@rustbot rustbot added this to the 1.66.0 milestone Sep 22, 2022
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (e7119a0): comparison URL.

Overall result: ❌✅ regressions and improvements - no action needed

@rustbot label: -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
- - 0
Regressions ❌
(secondary)
1.1% [0.6%, 1.6%] 7
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-0.7% [-0.7%, -0.7%] 1
All ❌✅ (primary) - - 0

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
2.0% [0.6%, 3.5%] 5
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
-2.6% [-3.3%, -2.0%] 2
All ❌✅ (primary) 2.0% [0.6%, 3.5%] 5

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean1 range count2
Regressions ❌
(primary)
2.9% [2.9%, 2.9%] 1
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-2.2% [-2.2%, -2.2%] 1
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 0.4% [-2.2%, 2.9%] 2

Footnotes

  1. the arithmetic mean of the percent change 2 3

  2. number of relevant changes 2 3

@cjgillot cjgillot deleted the span-inline-ctxt branch September 23, 2022 17:29
@nnethercote
Copy link
Contributor

Hmm, the post-merge results indicate this was a slight pessimisation :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants