This repository has been archived by the owner on Jan 23, 2023. It is now read-only.
[WIP] Threadpool exploration RFC #18158
Closed
An issue with this is going to be that when an idle thread pool gets a burst of work items in quick succession, only one thread would be requested, and only that one thread would then be available to process all of that work. That would be fine (maybe even better) if the work items are very short, but they may not be, and it may be beneficial to have multiple threads processing them. I have some thoughts on how to approach this and other perf issues here, but I'd have to prototype them to validate them. I think it will need some design changes to solve these problems.
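To illustrate the concern, here is a minimal Python sketch (a hypothetical model, not the CoreCLR code): if a burst of queued items triggers only a single thread request, that one worker drains the whole burst serially.

```python
import queue
import threading

# Hypothetical model of the concern: a burst of 4 items arrives while the
# pool is idle, but only ONE thread request is issued for the burst.
work = queue.Queue()
for i in range(4):
    work.put(i)

processed_by = []

def worker(name):
    # The single requested thread processes every queued item itself;
    # long-running items would serialize behind each other here.
    while True:
        try:
            work.get_nowait()
        except queue.Empty:
            return
        processed_by.append(name)

# Only one thread was requested for the entire burst.
t = threading.Thread(target=worker, args=("worker-0",))
t.start()
t.join()

print(processed_by)  # all 4 items handled by the one thread
```

With per-item (or per-batch) requests, additional workers could have been woken to process the items in parallel.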
Except... one of the things we were trying to fix in the reverted PR (https://github.com/dotnet/coreclr/issues/14017) was the over-requesting of threads, because the very first thing Dispatch does when it dequeues a work item, before running it, is to request another thread (as does Enqueue - so double requesting):
coreclr/src/System.Private.CoreLib/src/System/Threading/ThreadPool.cs
Lines 561 to 564 in 208ea16
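A simplified model of the double-requesting pattern being described (assumed semantics, not the actual `ThreadPool.cs` code): `Enqueue` requests a thread, and `Dispatch` requests another as soon as it dequeues an item, before running it.

```python
# Simplified, single-threaded model of the double-requesting pattern:
# both Enqueue and Dispatch call ensure_thread_requested.
thread_requests = 0
work_queue = []

def ensure_thread_requested():
    # The real code maintains this count with a CompareExchange loop.
    global thread_requests
    thread_requests += 1

def enqueue(item):
    work_queue.append(item)
    ensure_thread_requested()      # request 1: issued on enqueue

def dispatch():
    if work_queue:
        item = work_queue.pop(0)
        ensure_thread_requested()  # request 2: issued before running the item
        item()                     # run the work item

enqueue(lambda: None)
dispatch()
print(thread_requests)  # 2 thread requests for a single work item
```

So a single work item can produce two outstanding thread requests, which is the over-requesting the reverted PR tried to eliminate.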
Rather than trying to fix that, which was problematic and made it difficult to get around the races, the idea is to work with it. So it will always have a request active while it is successfully dequeuing (except in the gap between wake-up and dequeue), though only 1 rather than the CPU count?
I believe the over-requesting is currently necessary because the thread request count tracking is fragile; it mainly serves to reduce the chance of failure to near-zero (though theoretically not zero). @sdmaclea tried to limit the requesting to the right amount, but it ended up having races - I don't think there was anything wrong with that solution; it didn't work because of existing issues.
Agreed, but this takes a different approach by instead working with the over-requesting?
So with this change, while there are any items in the queue there will always[1] be one thread request active (plus one when there are no more items but one has just been dequeued).
Whereas currently it will very quickly grow to ProcessorCount thread requests active (as the number never decreases while items are being dequeued), as well as having a hot CompareExchange loop to decrement and increment the number.
[1] There will be a slight gap between a thread being activated and dequeuing an item, where the count will drop to 0.
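The current ratcheting behavior described above can be sketched as follows (a simplified, single-threaded stand-in for the CompareExchange loop, with an assumed ProcessorCount of 4 - not the actual implementation): every enqueue and every dequeue issues a request, so the count saturates at ProcessorCount while items keep flowing.

```python
PROCESSOR_COUNT = 4  # assumed CPU count for the sketch
num_outstanding_requests = 0

def ensure_thread_requested():
    # Simplified stand-in for the hot CompareExchange loop: ratchet the
    # request count up, but never above ProcessorCount. Nothing in the
    # dequeue path decrements it while items keep arriving, so it only grows.
    global num_outstanding_requests
    if num_outstanding_requests < PROCESSOR_COUNT:
        num_outstanding_requests += 1

# Every enqueue AND every dequeue calls this, so with any sustained flow
# of work items the count quickly saturates at ProcessorCount.
for _ in range(10):
    ensure_thread_requested()

print(num_outstanding_requests)  # capped at PROCESSOR_COUNT == 4
```

Under the proposed scheme, by contrast, the count would stay at 1 whenever the queue is non-empty, dropping briefly to 0 only between a thread waking and its first dequeue.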