-
-
Notifications
You must be signed in to change notification settings - Fork 964
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Dispatcher] improve api, reduce overhead, improve performances for items > 1k #2083
Conversation
Any benchmark? |
I'll send it over later this week |
…cted, something to investigate again when JIT gets better
Here's a benchmark running on latest changes:
Difference between N.B.: Also included the changes from PR #1190 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work
{ | ||
using (Profile(action)) | ||
var batch = (BatchState<TJob>)obj; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have any guarantees this would only be called with objects of the correct type? The cast like this throws when type doesn't match and can be more costly compared to obj as/is BatchState<TJob>
if the type is sealed or JIT knows it doesn't have any subclasses.
Not sure that it would make much perf difference here anyways.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, it's guaranteed;
ThreadPool.Instance.QueueUnsafeWorkItem(batch, &TypeAdapter<TJob>, batchCount - 1);
This call schedules the function void TypeAdapter<TJob>(object)
to run on other threads with batch
passed as the obj
parameter. Both TypeAdapter<TJob>
and BatchState<TJob>
's generic types match as they both come from the ForBatched<TJob>
generic.
It would be a fairly serious bug if there wasn't any guarantee.
But as you said, as
is faster than direct cast, it won't move the needle that much compared to how long those jobs tend to take, but anything helps. I'll change it to as
shortly.
private class BatchState | ||
/// <summary> | ||
/// An implementation of a job running in batches. | ||
/// This object is shared across all threads scheduled for the job; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to mention here as well it's not shared if it's a struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, forgot about that one when changing things over, good catch !
{ | ||
// For net6+ this should not happen, logging instead of throwing as this is just a performance regression | ||
if(Environment.Version.Major >= 6) | ||
Console.Out?.WriteLine($"{typeof(ThreadPool).FullName}: Falling back to suboptimal semaphore"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we log errors to Stride's logging system instead of Console? A lot of the time on Windows you're not going to have a console attached.
|
||
public DotnetLifoSemaphore(int spinCount) | ||
{ | ||
Type lifoType = Type.GetType("System.Threading.LowLevelLifoSemaphore"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to add a comment here that we use reflection to access an internal type
private sealed class DotnetLifoSemaphore : ISemaphore | ||
{ | ||
private readonly IDisposable semaphore; | ||
private readonly Func<int, bool, bool> wait; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious, can C# 9.0 function pointers (and GetFunctionPointer() be useful in this scenario?
(not sure it would make an actual perf difference though, but curious as if usable enough for this use case, as this would mean I can probably use it in some other places)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my limited testing, yes, although I do not know the implications this has for JIT and such; I do remember that the address static function pointers lay on when taking its address is not fixed. If the method is 'moved' after JIT took care of the method, then we might have to retrieve the function pointer from its runtime method handle on every call to make sure we run the optimal version ...
#1190 can probably be closed now? |
Yep, thanks for the reminder @Kryptos-FR |
PR Details
Threadpool now accepts generic work items, allowing the dispatcher and other callers to compose work items out of structures, in turn enabling the JIT to inline the composed call tree into just one function call from the thread's scope up to the actual work the caller dispatched.
Arguments can now be passed by ref to the dispatcher, which is also passed by ref to threads resolving a lot of issues that would require painful workarounds and allocations, like allowing jobs to read and write to a shared structure, interlocked operations per dispatch, and outputting from a job.
Added an interface for processing items in batches, allowing for improved throughput for jobs that have thousands of items to process.
I only changed a couple of engine calls to be batched, others may benefit in more demanding games.
Types of changes
Checklist