-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Inline SyntaxContext in both encoded span representation. #98840
Conversation
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 849d460578d5fd1c1161e9da8f07b6cb5f45b571 with merge f14fc23e45f90f90d8af18adab0747a52ec58aaa... |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
💔 Test failed - checks-actions |
849d460
to
ded911f
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit ded911fc98d555f8c701e650dce721aa3c4edb48 with merge 6bd01fd732fde597d8ea22f2010c2e51a4613133... |
☀️ Try build successful - checks-actions |
Queued 6bd01fd732fde597d8ea22f2010c2e51a4613133 with parent b04bfb4, future comparison URL. |
Finished benchmarking commit (6bd01fd732fde597d8ea22f2010c2e51a4613133): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
r? @nnethercote |
if ctxt2 < MAX_CTXT { | ||
// Inline format or interned format with inline ctxt. | ||
self.ctxt_or_max = ctxt2 as u16; | ||
return self; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just self
should suffice, no need for a return
.
} else { | ||
// Interned format. | ||
let index = | ||
with_span_interner(|interner| interner.intern(&SpanData { lo, hi, ctxt, parent })); | ||
Span { base_or_index: index, len_or_tag: LEN_TAG, ctxt_or_zero: 0 } | ||
let ctxt_or_max = if ctxt2 < MAX_CTXT { ctxt2 } else { MAX_CTXT } as u16; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MAX_CTXT
is now a misleading name -- it's actually one past the max context, because it's a special out-of-band value. Perhaps rename it INTERNED_CTXT
? And likewise rename ctxt_or_max
as ctxt_or_interned
?
/// | ||
/// Interned format: | ||
/// - `span.base_or_index == index` (indexes into the interner table) | ||
/// - `span.len_or_tag == LEN_TAG` (high bit set, all other bits are zero) | ||
/// - `span.ctxt == 0` | ||
/// - `span.ctxt == span_data.ctxt` (must be < `MAX_CTXT`) or `MAX_CTXT` otherwise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is now a bit unclear. Please expand this comment more by explaining that there are now effectively three cases: inline, interned-but-with-inline-ctxt, and interned.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You said "When ctxt_or_zero and the interned span's ctxt don't match, the inlined one takes precedence." Does this happen? If so, can it be avoided?
The perf results here are very good. Surprisingly so, I thought very little spans were being interned nowadays, mostly for the rare ones where the length exceeds 32 KiB. Do you know which of the fields is mostly often causing the interning?
The second commit breaks the |
ded911f
to
92c9b5a
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
⌛ Trying commit 92c9b5a02be525f6bc741f2d5e20305bd8319ea5 with merge a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1... |
This was added on purpose to implement
I tried a while back to lengthen the Before any conjecture, I'll wait on this new perf run on a not-obviously-buggy implementation. |
☀️ Try build successful - checks-actions |
Queued a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1 with parent a5c6a48, future comparison URL. |
Finished benchmarking commit (a623c2ebc5aef6e9c9fce6aa7ef64118aefd23d1): comparison url. Instruction count
Max RSS (memory usage)Results
CyclesResults
If you disagree with this performance assessment, please file an issue in rust-lang/rustc-perf. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf. @bors rollup=never Footnotes |
The new perf results are a smaller improvement than the old ones. I'm still puzzled by the goodness of the original perf results. Two of the benchmarks that improved the most are
Here is the distribution for
For
And
I.e. there is a single non-interned span created and it is never accessed after creation. I know that |
Hmm this PR is both waiting for review and on the author :) I'll switch back to the author since the perf. run received comments. @rustbot ready |
@nnethercote the commit on which the first perf was run was buggy: we could get the wrong One complement to the explanation could be the effect on hygiene: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the slow review response here. This is looking good, just a couple of comment fixes as mentioned above.
92c9b5a
to
3e67bde
Compare
@nnethercote's review feedback has been addressed so I'm going to @bors r+ |
☀️ Test successful - checks-actions |
Finished benchmarking commit (e7119a0): comparison URL. Overall result: ❌✅ regressions and improvements - no action needed@rustbot label: -perf-regression Instruction countThis is a highly reliable metric that was used to determine the overall result at the top of this comment.
Max RSS (memory usage)ResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
CyclesResultsThis is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.
Footnotes |
Hmm, the post-merge results indicate this was a slight pessimisation :( |
The current interned representation for spans does not use the
ctxt_or_zero: u16
field. This PR proposes to use this field to store theSyntaxContext
of the interned span instead. Whenctxt_or_zero
and the interned span'sctxt
don't match, the inlined one takes precedence.This allows to implement
Span::ctxt
andSpan::with_ctxt
with much less probability to access the interner. Those functions are used a lot for hygiene, so this may be worth it.