make `mk_attr_id` part of `ParseSess` #101313

SparrowLii · 2022-09-02T08:38:44Z

The current `mk_attr_id` uses a single shared `AtomicU32`, which is not very efficient and adds a lot of contention in a parallel environment.
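For readers unfamiliar with the existing approach, here is a minimal sketch of a shared atomic ID counter of the kind described above. The name `SharedIdCounter` is hypothetical and the memory ordering is an assumption; this is not the rustc source.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical shared counter: every thread bumps the same atomic.
pub struct SharedIdCounter(AtomicU32);

impl SharedIdCounter {
    pub const fn new() -> Self {
        SharedIdCounter(AtomicU32::new(0))
    }

    /// Hands out the next ID. `fetch_add` returns the previous value,
    /// so the first call yields 0.
    pub fn mk_id(&self) -> u32 {
        // Relaxed is assumed to be enough here because only uniqueness
        // matters, not ordering relative to other memory operations.
        let id = self.0.fetch_add(1, Ordering::Relaxed);
        assert!(id != u32::MAX, "ID counter overflowed");
        id
    }
}
```

All threads write to the same memory location, which is the source of the contention mentioned above.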

Following the task list in #48685, this PR makes `mk_attr_id` a method of the `AttrIdGenerator` struct and adds a new field, `attr_id_generator`, to `ParseSess`.

`AttrIdGenerator` uses `WorkerLocal`, which has two advantages:
1. `Cell` is more efficient than `AtomicU32` and does not add any lock contention.
2. The index of the worker thread is placed in the first few bits of each generated `AttrId`, so the `AttrId`s generated in different threads are easily guaranteed to be unique.
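The idea can be sketched without rustc's `WorkerLocal` type. In the sketch below, `PerWorkerIds` is a hypothetical stand-in: a plain `Vec` indexed by an explicit worker index plays the role of the per-worker storage, and the worker index is passed in by hand rather than looked up from the thread pool.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a per-worker ID generator.
/// Each worker owns one `Cell<u32>` and never touches the others.
pub struct PerWorkerIds {
    counters: Vec<Cell<u32>>,
}

impl PerWorkerIds {
    pub fn new(workers: usize) -> Self {
        // Worker `i` starts counting at `i` with its bits reversed, so the
        // index ends up in the most significant bits of every ID it makes.
        let counters = (0..workers)
            .map(|i| Cell::new((i as u32).reverse_bits()))
            .collect();
        PerWorkerIds { counters }
    }

    /// Returns the next ID for `worker` and bumps only that worker's counter.
    pub fn mk_id(&self, worker: usize) -> u32 {
        let cell = &self.counters[worker];
        let id = cell.get();
        cell.set(id + 1);
        id
    }
}
```

With four workers, worker 0 hands out 0, 1, 2, ... while worker 1 hands out `0x8000_0000`, `0x8000_0001`, ..., so the two sequences cannot meet until one of them has used up its whole range.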

cc @cjgillot

rust-highfive · 2022-09-02T08:38:48Z

r? @fee1-dead

(rust-highfive has picked a reviewer for you, use r? to override)

fee1-dead · 2022-09-02T12:49:21Z

compiler/rustc_ast/src/attr/mod.rs

+        // starting value of AttrId in each worker thread.
+        // The `index` is the index of the worker thread.
+        // This ensures that the AttrId generated in each thread is unique.
+        AttrIdGenerator(WorkerLocal::new(|index| Cell::new((index as u32).reverse_bits())))


I am not familiar with parallel compiler code, but is there a cap as to how many threads can be used for parallel rustc? If there are too many then the actual usable bits would be decreased.

Yes, there is a cap. AFAIK, the number of threads is usually no more than 64. Besides, as the number of threads increases, the number of AttrIds that each thread needs to assign decreases correspondingly, so I don't think this is a problem.
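To put rough numbers on the trade-off discussed here, the helper functions below (hypothetical names, not part of the PR) compute how many high bits the worker index occupies and how many IDs each worker is guaranteed before it runs into another worker's starting value, assuming a 32-bit ID and the bit-reversed starting values shown in the diff.

```rust
/// Number of high bits needed to distinguish `workers` worker indices
/// (indices run from 0 to `workers - 1`).
pub fn index_bits(workers: u32) -> u32 {
    assert!(workers >= 1);
    32 - (workers - 1).leading_zeros()
}

/// Guaranteed minimum number of IDs available to each worker
/// when the remaining low bits are used as a counter.
pub fn ids_per_worker(workers: u32) -> u64 {
    1u64 << (32 - index_bits(workers))
}
```

Under these assumptions, 64 workers use 6 bits for the index, which leaves 26 bits, or 67,108,864 IDs, per worker.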

cjgillot · 2022-09-05T09:35:04Z

I understand that the main motivation is performance in parallel environment. Do you have a measurement of the perf improvement?

SparrowLii · 2022-09-05T10:02:08Z

I understand that the main motivation is performance in parallel environment. Do you have a measurement of the perf improvement?

I don't know of a good benchmark for measuring the efficiency of parallel compilation yet; finding one is part of my follow-up plan. For now, I think we only need to guarantee that there is no negative impact on efficiency in non-parallel mode, so we can run rustc perf directly.

SparrowLii · 2022-09-05T10:03:57Z

There are already some issues about benchmarks for parallel compilation, such as #59667 and #92596. I think we will address them in subsequent work.

cjgillot · 2022-09-05T10:05:22Z

@bors try @rust-timer queue

rust-timer · 2022-09-05T10:05:24Z

Awaiting bors try build completion.

@rustbot label: +S-waiting-on-perf

bors · 2022-09-05T10:05:42Z

⌛ Trying commit f9234777e44bde1b27786e9f4300b0ee3323c5f9 with merge 7e04d9038009c47e6b24a62aab7fa9d31c71706a...

bors · 2022-09-05T11:31:42Z

☀️ Try build successful - checks-actions
Build commit: 7e04d9038009c47e6b24a62aab7fa9d31c71706a (7e04d9038009c47e6b24a62aab7fa9d31c71706a)

rust-timer · 2022-09-05T11:31:43Z

Queued 7e04d9038009c47e6b24a62aab7fa9d31c71706a with parent 5b4bd15, future comparison URL.

rust-timer · 2022-09-05T19:30:11Z

Finished benchmarking commit (7e04d9038009c47e6b24a62aab7fa9d31c71706a): comparison URL.

Overall result: ✅ improvements - no action needed

Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. While you can manually mark this PR as fit for rollup, we strongly recommend not doing so since this PR may lead to changes in compiler perf.

@bors rollup=never
@rustbot label: +S-waiting-on-review -S-waiting-on-perf -perf-regression

Instruction count

This is a highly reliable metric that was used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.7%	[-1.2%, -0.3%]	13
Improvements ✅ (secondary)	-1.2%	[-1.7%, -0.8%]	8
All ❌✅ (primary)	-0.7%	[-1.2%, -0.3%]	13

Max RSS (memory usage)

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	1.6%	[0.8%, 2.3%]	2
Regressions ❌ (secondary)	2.3%	[1.4%, 3.7%]	3
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.3%	[-2.3%, -2.3%]	1
All ❌✅ (primary)	1.6%	[0.8%, 2.3%]	2

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.8%	[-2.8%, -2.8%]	1
All ❌✅ (primary)	-	-	0

¹ the arithmetic mean of the percent change
² number of relevant changes

cjgillot · 2022-09-12T20:22:24Z

From the perf report, this PR looks like a great idea to improve perf.
However, I'm a bit afraid of Heisenbugs in parallel-compiler due to silent collisions of AttrId because they overflow the allocated 27 bits or so.
Could you add a debug-assertion that this does not happen?
Then r=me
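One way such a check could look is sketched below. This is an illustration of the kind of guard being requested, not the assertion that was actually added to the PR; `WorkerCounter`, its fields, and the `index_bits` parameter are all hypothetical.

```rust
use std::cell::Cell;

/// Hypothetical per-worker counter that knows where its own range ends.
pub struct WorkerCounter {
    /// Next value to hand out, kept as u64 so the end of the top
    /// worker's range (2^32) can be represented without wrapping.
    next: Cell<u64>,
    /// Exclusive upper bound of this worker's range.
    limit: u64,
}

impl WorkerCounter {
    /// `index_bits` is the number of high bits reserved for the worker index.
    pub fn new(index: u32, index_bits: u32) -> Self {
        assert!(index_bits < 32 && (index as u64) < (1u64 << index_bits));
        let start = index.reverse_bits() as u64;
        let range = 1u64 << (32 - index_bits);
        WorkerCounter { next: Cell::new(start), limit: start + range }
    }

    pub fn mk_id(&self) -> u32 {
        let next = self.next.get();
        // Debug builds catch the overflow. In release builds the counter
        // would silently run into another worker's range.
        debug_assert!(next < self.limit, "ID counter left this worker's range");
        self.next.set(next + 1);
        next as u32
    }

    /// Number of IDs this worker can still hand out.
    pub fn remaining(&self) -> u64 {
        self.limit.saturating_sub(self.next.get())
    }
}
```

Because the check is a `debug_assert!`, it costs nothing in release builds, at the price of only catching the collision in builds with debug assertions enabled.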

SparrowLii · 2022-09-13T08:44:13Z

Sure, I added the corresponding modifications.

SparrowLii · 2022-09-13T08:44:54Z

@bors r=cjgillot

bors · 2022-09-13T08:44:55Z

@SparrowLii: 🔑 Insufficient privileges: Not in reviewers

SparrowLii · 2022-09-13T08:46:15Z

@cjgillot It looks like I don't have the privilege to r=

bors · 2022-09-13T16:15:41Z

☔ The latest upstream changes (presumably #101757) made this pull request unmergeable. Please resolve the merge conflicts.

cjgillot · 2022-09-14T17:14:08Z

@bors r+

bors · 2022-09-14T17:14:09Z

📌 Commit bfc4f2e has been approved by cjgillot

It is now in the queue for this repository.

bors · 2022-09-14T20:52:22Z

⌛ Testing commit bfc4f2e with merge 750bd1a...

bors · 2022-09-14T23:36:46Z

☀️ Test successful - checks-actions
Approved by: cjgillot
Pushing 750bd1a to master...

rust-timer · 2022-09-15T00:53:16Z

Finished benchmarking commit (750bd1a): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

This benchmark run did not return any relevant results for this metric.

Cycles

Results

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean¹	range	count²
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.1%	[2.1%, 2.1%]	1
Improvements ✅ (primary)	-4.5%	[-4.9%, -4.1%]	2
Improvements ✅ (secondary)	-3.2%	[-3.3%, -3.1%]	2
All ❌✅ (primary)	-4.5%	[-4.9%, -4.1%]	2

¹ the arithmetic mean of the percent change
² number of relevant changes

Zoxc · 2023-01-27T01:07:06Z

It's possible that WorkerLocal is slower than fetch_add since the latter is quite fast already. I don't think mk_attr_id is hot enough for it to matter, but to actually measure the overhead I'd recommend measuring check builds using a single thread on a CPU with its frequency locked using my benchmark tool.

SparrowLii · 2023-01-28T01:09:08Z

Thanks, your suggestion is very valuable! In fact, we don't currently have good tools for measuring the performance of the compiler in parallel environments. I will try this tool!

Zoxc · 2023-02-07T23:18:41Z

This happened to break the parallel compiler due to WorkerLocal being created outside the Rayon thread pool.

SparrowLii · 2023-02-08T00:41:41Z

You mean `WorkerLocal` cannot get the index correctly? If so, then `WorkerLocal` really shouldn't be used. It might be reasonable to use `thread_local`.

Or we should add this `WorkerLocal` to `TyCtxt` so that it doesn't live outside the Rayon thread pool.

Zoxc · 2023-02-08T00:56:21Z

I think it just ends up spawning the global Rayon thread pool. I kind of have a workaround in #107782, but it's not particularly clean.

rust-highfive assigned fee1-dead Sep 2, 2022

rustbot added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Sep 2, 2022

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 2, 2022

cjgillot self-assigned this Sep 2, 2022

fee1-dead reviewed Sep 2, 2022

View reviewed changes

rustbot added the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2022

rustbot removed the S-waiting-on-perf Status: Waiting on a perf run to be completed. label Sep 5, 2022

SparrowLii added 2 commits September 14, 2022 08:49

make mk_attr_id part of ParseSess

1a3ecbd

add debug assertion for max attr_id

bfc4f2e

SparrowLii force-pushed the mk_attr_id branch from 87cd6ff to bfc4f2e Compare September 14, 2022 01:01

bors removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Sep 14, 2022

bors added the S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. label Sep 14, 2022

bors added the merged-by-bors This PR was explicitly merged by bors. label Sep 14, 2022

bors merged commit 750bd1a into rust-lang:master Sep 14, 2022

rustbot added this to the 1.65.0 milestone Sep 14, 2022

bors mentioned this pull request Sep 15, 2022

Shrink ast::Expr harder #101562

Merged

SparrowLii mentioned this pull request Nov 10, 2022

Reboot Parallel Rustc WG Proposal rust-lang/compiler-team#567

Closed

3 tasks

Zoxc mentioned this pull request Feb 8, 2023

Move the WorkerLocal type from the rustc-rayon fork into rustc_data_structures #107782

Merged
