Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use MixedBitSets in const qualif #134438

Merged
merged 2 commits into from
Dec 20, 2024
Merged

Conversation

lqd
Copy link
Member

@lqd lqd commented Dec 17, 2024

These analyses' domains should be very homogeneous, having compressed bitmaps on huge cfgs should make a difference (and doesn’t have an impact on the smaller / regular cfgs in our benchmarks).

This is a >40% walltime reduction on this stress test extracted from a real world ICU case, and a 10x or so max-rss reduction.

cc @oli-obk @RalfJung

Should help with (or fix) issue #134404.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Dec 17, 2024
@lqd
Copy link
Member Author

lqd commented Dec 17, 2024

@bors try @rust-timer queue

@rust-timer

This comment has been minimized.

@rustbot rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 17, 2024
@bors
Copy link
Contributor

bors commented Dec 17, 2024

⌛ Trying commit f2a1b09 with merge 8280a60...

bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 17, 2024
Try using compressible bitsets in const qualif

These analyses domains should be very homogeneous, having compressed bitmaps on huge cfgs should make a difference (and hopefully have little impact on the smaller ones).

r? ghost
@bors
Copy link
Contributor

bors commented Dec 17, 2024

☀️ Try build successful - checks-actions
Build commit: 8280a60 (8280a60cd16db5b22b55e7c7d6b9f2d8a3960a20)

@rust-timer

This comment has been minimized.

@compiler-errors
Copy link
Member

r=me if it ends up working out well

@lqd
Copy link
Member Author

lqd commented Dec 17, 2024

It is working well on manish's real world stress test, hopefully benchmarks are not negatively impacted. I think they shouldn't, but we'll see ^^.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (8280a60): comparison URL.

Overall result: no relevant changes - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: -S-waiting-on-perf -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (primary 1.5%, secondary 2.4%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.5% [1.5%, 1.5%] 1
Regressions ❌
(secondary)
2.4% [2.2%, 2.7%] 4
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.5% [1.5%, 1.5%] 1

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 770.505s -> 772.124s (0.21%)
Artifact size: 330.30 MiB -> 330.34 MiB (0.01%)

@rustbot rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Dec 17, 2024
@lqd lqd changed the title Try using compressible bitsets in const qualif Use MixedBitSets in const qualif Dec 17, 2024
@lqd lqd marked this pull request as ready for review December 17, 2024 22:47
@rustbot
Copy link
Collaborator

rustbot commented Dec 17, 2024

Some changes occurred to the CTFE machinery

cc @rust-lang/wg-const-eval

@lqd
Copy link
Member Author

lqd commented Dec 17, 2024

r? compiler-errors @bors r=compiler-errors

@bors
Copy link
Contributor

bors commented Dec 17, 2024

📌 Commit f2a1b09 has been approved by compiler-errors

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 17, 2024
@RalfJung
Copy link
Member

Given that this perf diff isn't reflected by our benchmarks at all, it seems likely to regress again in the future. I'd recommend adding the stress test to the benchmark suite.

@@ -248,10 +248,10 @@ where
#[derive(Debug, PartialEq, Eq)]
pub(super) struct State {
/// Describes whether a local contains qualif.
pub qualif: BitSet<Local>,
pub qualif: MixedBitSet<Local>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth a comment explaining the choice of representation?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that. In general using a BitSet on large domain sizes can create issues like this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no way a contributor could know this. BitSet is the most canonically named type so it's the one developers will reach for by default.

Copy link
Member Author

@lqd lqd Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't disagree. Thinking about the domain representation and whether it's dense or sparse is super common in dataflow so "no way to know this" is probably too broad of a brush stroke though.

We could add a comment to BitSet, or the bit_set module, describing and linking to the other bitsets, cc @nnethercote on that.

In this case I've added the requested comment to the analysis state, if you're happy with it, feel free to r you and michael.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be worth renaming BitSet to DenseBitSet perhaps? That way there's not an obvious one to reach for.

But that should be follow-up for sure :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'll add it to my todo list :)

@lqd
Copy link
Member Author

lqd commented Dec 18, 2024

I'd recommend adding the stress test to the benchmark suite.

Sorry I don't really have the time to do that, but also I'm not sure this stress test is that important to track over time? It was easy to analyze and fix, and probably no one else has seen that issue.

250K locals on a 750K-node CFG is bound to cause slowness elsewhere (and is still slow with this PR) so it's not uninteresting as a benchmark. I think when people encounter such issues of slowness triggered by the function size (and we have some of these in our stress tests), they just split their code into more functions or constants.

@RalfJung
Copy link
Member

RalfJung commented Dec 18, 2024

It doesn't have to be you adding the benchmark, maybe @Manishearth can look into this. :) But if we don't yet have a benchmark covering such extreme MIR bodies, it seems like we should? (Might also be interesting to have this as a const fn since functions get more processing than consts.)

@rust-lang/wg-compiler-performance probably know best whether it'd be a good idea to add such a benchmark or not.

@lqd lqd force-pushed the const-qualif-bitsets branch from f2a1b09 to db7d6a9 Compare December 18, 2024 07:21
@lqd
Copy link
Member Author

lqd commented Dec 18, 2024

I am in wg-compiler-performance 😅 . We likely don't have a benchmark one for such consts per se, but I believe we have others with big functions.

@RalfJung
Copy link
Member

RalfJung commented Dec 18, 2024

Comment looks good, thanks. :)
@bors r=compiler-errors

We likely don't have a benchmark one for such consts per se, but I believe we have others with big functions.

So maybe we just need to make one of those existing big functions a const fn? Or is it better to add a new benchmark?

@bors
Copy link
Contributor

bors commented Dec 18, 2024

📌 Commit db7d6a9 has been approved by compiler-errors

It is now in the queue for this repository.

@lqd
Copy link
Member Author

lqd commented Dec 18, 2024

Maybe yeah, but we'd need to profile to make sure.

To trigger the most const checks, some of these may need to be called in const contexts as well, I assume. In that case, maybe a dedicated benchmark could be better suited (esp since some of the big functions in rustc-perf may actually be closures) if we wanted to add one. Though now that I think about it we have similar-sounding consts and arrays in real world benchmarks from unicode crates. I don't know if they have a similar profile as this stress test, but on smaller cfgs, and it'd probably be the first thing to check.

(It's also possible that some other real world crates will exercise these code paths more, and we'll update the benchmarks in a few months, after the edition.)

@RalfJung
Copy link
Member

RalfJung commented Dec 18, 2024 via email

@Manishearth
Copy link
Member

Manishearth commented Dec 18, 2024

@RalfJung if you point me to where the benchmark should go I can contribute it eventually. One note is that that generated code uses Unicode-3 licensed data, I could replace it with some random nonsense instead if needed.

Also do let me know if you'd prefer the benchmark to use macros or not. I can simplify further so it benchmarks the things you care about. The benchmark uses a macro because that's what we do (because our data crates are pure macro crates and they get invoked by our code crates).

@nnethercote
Copy link
Contributor

We are due to do a rustc-perf benchmark upgrade in early 2025, it would make sense to add any new const benchmark as part of that. I have started an update doc here, and linked to this PR as one to possibly add.

@bors
Copy link
Contributor

bors commented Dec 20, 2024

⌛ Testing commit db7d6a9 with merge 5dfe648...

@bors
Copy link
Contributor

bors commented Dec 20, 2024

☀️ Test successful - checks-actions
Approved by: compiler-errors
Pushing 5dfe648 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Dec 20, 2024
@bors bors merged commit 5dfe648 into rust-lang:master Dec 20, 2024
7 checks passed
@rustbot rustbot added this to the 1.85.0 milestone Dec 20, 2024
@lqd lqd deleted the const-qualif-bitsets branch December 20, 2024 07:31
@rust-timer
Copy link
Collaborator

Finished benchmarking commit (5dfe648): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results (primary 1.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
1.1% [1.0%, 1.3%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 1.1% [1.0%, 1.3%] 2

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 768.017s -> 769.305s (0.17%)
Artifact size: 330.27 MiB -> 330.31 MiB (0.01%)

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Jan 11, 2025
Rename `BitSet` to `DenseBitSet`

r? `@Mark-Simulacrum` as you requested this in rust-lang#134438 (comment) after such a confusion.

This PR renames `BitSet` to `DenseBitSet` to make it less obvious as the go-to solution for bitmap needs, as well as make its representation (and positives/negatives) clearer. It also expands the comments there to hopefully make it clearer when it's not a good fit, with some alternative bitsets types.

(This migrates the subtrees cg_gcc and clippy to use the new name in separate commits, for easier review by their respective owners, but they can obvs be squashed)
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jan 11, 2025
Rollup merge of rust-lang#135205 - lqd:bitsets, r=Mark-Simulacrum

Rename `BitSet` to `DenseBitSet`

r? `@Mark-Simulacrum` as you requested this in rust-lang#134438 (comment) after such a confusion.

This PR renames `BitSet` to `DenseBitSet` to make it less obvious as the go-to solution for bitmap needs, as well as make its representation (and positives/negatives) clearer. It also expands the comments there to hopefully make it clearer when it's not a good fit, with some alternative bitsets types.

(This migrates the subtrees cg_gcc and clippy to use the new name in separate commits, for easier review by their respective owners, but they can obvs be squashed)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants