-
Notifications
You must be signed in to change notification settings - Fork 292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Rework ExecuteReaderAsync to minimize allocations #528
perf: Rework ExecuteReaderAsync to minimize allocations #528
Conversation
f697060
to
5c66f1b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask you to describe how do you get this report and what do you mean of the DataAccessPerformance
project?
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/AAsyncCallContext.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/AAsyncCallContext.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/AAsyncCallContext.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/AAsyncCallContext.cs
Outdated
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/ProviderBase/DbConnectionFactory.cs
Show resolved
Hide resolved
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/ProviderBase/DbConnectionFactory.cs
Outdated
Show resolved
Hide resolved
The DataAccessPerformance project emulates the TechEmpower fortunes benchmark and I was pointed to it as a good performance based benchmarking solution. If you look at my history of pull requests here and in corefx you'll see I use it a lot and have been gradually improving performance and memory usage using it. The screenshot is from dotMemory which is a memory profiler. I highlighted the items that I've affected in red using paint |
|
||
private Task<DbConnectionInternal> CreateReplaceConnectionContinuation(Task<DbConnectionInternal> task, DbConnection owningConnection, TaskCompletionSource<DbConnectionInternal> retry, DbConnectionOptions userOptions, DbConnectionInternal oldConnection, DbConnectionPoolGroup poolGroup, CancellationTokenSource cancellationTokenSource) | ||
{ | ||
return task.ContinueWith( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Small comment @Wraith2: ContinueWith generally performs much worse than a simple async method with await (not to mention that's it's more brittle wrt exception handling etc.). I can post some benchmarking if you want.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though this may be totally justified if TaskContinuationOptions.LongRunning is important here - just making the general remark.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Language supported async await would be easier to understand as well.
Whoever wrote the original code had some reason to use imperative constructs instead of language support but I've no idea what it was.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Language supported async await would be easier to understand as well.
Absolutely.
Whoever wrote the original code had some reason to use imperative constructs instead of language support but I've no idea what it was.
I suspect a lot of the code was simply written before async/await was introduced (in .NET 4.5, relatively "late"). I'd definitely consider replacing the callback approach as you work through the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine with me but it's down to the MS team to make that sort of decision. I don't want to make working on this library any harder than it already is so I try to replicate the existing style wherever it's sensible to do so in my changes.
There are several library wide things that would be nice to do.
- Merge the netfx and netcore codebases into a single solution with documented reasons for any differences
- Change how async methods are implemented with snapshots to reduce performance issues.
- Change from imperative async to language async.
We've got 1 in progress but the others really need 1 to be done and judged stable so we don't have to duplicate a lot of work. 1 will aoso bring perf improvements done in netcore to netfx.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I vote for prioritizing 1 - it is currently a quite confusing codebase, which makes community contributions harder than the should be.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
working on it bit by bit. #625 The easy part is the identical files. The complex part is the files which have opposing or complex changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure thing, thing kind of cleanup can be done separately of course. I'd suggest that if ContinueWith continuations are being rewritten for unrelated reasons, it may be a good opportunity to just do that with async/await, but of course that's up to the team and you.
I also totally agree that merging the netfx/netcore codebases is a high-priority task.
@@ -1864,14 +1873,22 @@ private static void ChangePassword(string connectionString, SqlConnectionString | |||
// this only happens once per connection | |||
// SxS: using named file mapping APIs | |||
|
|||
internal void RegisterForConnectionCloseNotification<T>(ref Task<T> outerTask, object value, int tag) | |||
internal Task<T> RegisterForConnectionCloseNotification<T>(Task<T> outerTask, object value, int tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a candidate for rewriting with simple async/await
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@David-Engel @cheenamalhotra I can give this a try in the PR if you like or we can leave it for a later date, what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there more similar cases that we would like to address?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original code tooks a ref Task and added a continuation to it then returned the new wrapper task in the same variable. It doesn't do any awaiting and if the caller was rewritten to use language async then this function would be totally rewritten or removed as well. It'd prefer not to touch it further in this PR and work on it at a stage when a larger scale transition from contiunation to langauge async is being done.
Can I ask you to produce some information to illustrate how do these changes affect the performance? |
Using Northwind and the benchmark code: [Benchmark]
public async Task<int> AsyncReadOrders()
{
int max = 0;
using (SqlDataReader reader = await command.ExecuteReaderAsync())
{
while (await reader.ReadAsync())
{
int value = reader.GetInt32(0);
if (value > max)
{
max = value;
}
}
}
return max;
}
It's a really tiny benchmark as you can see from the times. It simply gets some ints using a data reader and reads them. The main push is the allocated memory. Very little of those allocations are user data they're task machinery and library internals. The reduction you see is delegates which are removed by caching static versions of them and some async closures which are removed by using direct state objects to avoid capture. |
6ce2966
to
6150b3b
Compare
I was able to verify the difference in allocations but am I correct in thinking that the benefits would mostly be lost when returning larger result sets for each call to ExecuteReaderAsync? |
src/Microsoft.Data.SqlClient/netcore/src/Microsoft/Data/SqlClient/SqlCommand.cs
Show resolved
Hide resolved
The benefits wouldn't be lost they'd just be less of the total memory allocated if you do a lot of reading. If people are making an effort to do sensible things, chunky not chatty, then this is just a nice to have, if they're being bad at sql and doing chatty calls then this is nicer to have. It's never fundamental. In an ideal situation I'd like the library to only allocate memory for the results that a user requested, that isn't practical but if there is benefit to users for relatively small increase in complexity of the library it's possibly worth taking. The reason I started working on this library is that I had written some code using the new pipelines and span apis in netcore and had a really lean fast project and then when I started trying to get that data in and out of a database using this library obliterated all the careful performance work I'd done. So yes these are smaller gains but they're still of value if you want to support high performance scenarios. |
merged and fixed up. |
Benchmarking against the usual DataAccessPerformance project I've got all the way down to the once per call allocations now. ExecuteDataReader uses callbacks delegates and closures which can be cached eliminated or minimized in many cases.
The AAsyncCallContext class is very similar to the one already used internally in SqlDataReader because the pattern of allocation and caching is similar. Once this PR is reviewed and merged I intend to go back and rebase the SqlDataReader internal classes on this new base class.
The changes here come in a number of patterns:
this
type and bounces the call through a translation to an instance call again.I also changed a void returning function which was altering a ref parameter to one which returns the new value because it's easier to understand what it's doing using the more familiar pattern.
some of the allocations eliminated are:
there are some more off the bottom of the screenshot.
The same work can be done for ExecuteScalar eventually.