Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query/Perf: Compile identifier lambdas passed to PopulateIncludeCollection rather than inline #35212

Closed
maumar opened this issue Nov 26, 2024 · 2 comments · Fixed by #35213
Closed
Assignees
Labels
area-perf area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported preview-1 regression
Milestone

Comments

@maumar
Copy link
Contributor

maumar commented Nov 26, 2024

Part of the bigger perf issue: #35053

PopulateIncludeCollection (as well as couple other methods) take a number of delegate arguments. For scenarios with significant number of entities we see significant perf improvement when these delegates are compiled (like we used to do in EF8), rather than inlined.

Relevant shaper fragment in EF8

ShaperProcessingExpressionVisitor.PopulateIncludeCollection<SalesOrderHeader, SalesOrderDetail>(
	collectionId: 0, 
	queryContext: queryContext, 
	dbDataReader: dataReader, 
	resultCoordinator: resultCoordinator, 
	parentIdentifier: Func<QueryContext, DbDataReader, object[]>, 
	outerIdentifier: Func<QueryContext, DbDataReader, object[]>, 
	selfIdentifier: Func<QueryContext, DbDataReader, object[]>, 
       (...)

and the same fragment in EF9:

ShaperProcessingExpressionVisitor.PopulateIncludeCollection<SalesOrderHeader, SalesOrderDetail>(
	collectionId: 0, 
	queryContext: queryContext, 
	dbDataReader: dataReader, 
	resultCoordinator: resultCoordinator, 
	parentIdentifier: (queryContext, dataReader) => new object[]
	{ 
		(object)(int?)dataReader.GetInt32(0), 
		(object)(int?)dataReader.GetInt32(26) 
	}, 
	outerIdentifier: (queryContext, dataReader) => new object[]
	{ 
		(object)(int?)dataReader.GetInt32(0), 
		(object)(int?)dataReader.GetInt32(26) 
	}, 
	selfIdentifier: (queryContext, dataReader) => new object[]
	{ 
		(object)dataReader.IsDBNull(33) ? default(int?) : (int?)dataReader.GetInt32(33), 
		(object)dataReader.IsDBNull(34) ? default(int?) : (int?)dataReader.GetInt32(34) 
	}, 
       (...)

Benchmark results:

before the fix (but already with invoke fix and no interpretation)

Method Async Mean Error StdDev Op/s Gen0 Gen1 Allocated
MultiInclue False 455.1 ms 8.94 ms 10.29 ms 2.197 11000.0000 6000.0000 67.92 MB
MultiInclue True 435.4 ms 1.77 ms 1.66 ms 2.297 11000.0000 6000.0000 67.92 MB

after the fix:

Method Async Mean Error StdDev Op/s Gen0 Gen1 Allocated
MultiInclue False 363.3 ms 6.72 ms 6.29 ms 2.752 9000.0000 3000.0000 58.51 MB
MultiInclue True 351.9 ms 2.08 ms 1.74 ms 2.842 9000.0000 3000.0000 58.51 MB

This gets us pretty close to the EF8 numbers which were:

Method Async Mean Error StdDev Op/s Gen0 Gen1 Allocated
MultiInclue False 297.1 ms 1.47 ms 1.30 ms 3.365 8000.0000 6000.0000 52.4 MB
MultiInclue True 290.2 ms 3.76 ms 3.52 ms 3.446 8500.0000 6000.0000 52.4 MB
@maumar maumar changed the title Query/Perf: In compile identifier lambdas passed to PopulateIncludeCollection rather than inline Query/Perf: Compile identifier lambdas passed to PopulateIncludeCollection rather than inline Nov 26, 2024
@maumar maumar self-assigned this Nov 26, 2024
maumar added a commit that referenced this issue Nov 26, 2024
…lateIncludeCollection rather than inline

PopulateIncludeCollection (as well as couple other methods) take a number of delegate arguments. For scenarios with significant number of entities we see significant perf improvement when these delegates are compiled (like we used to do in EF8), rather than inlined.

Fixes #35212
@maumar maumar closed this as completed in 8d18b1c Nov 26, 2024
@maumar maumar added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Nov 26, 2024
@maumar
Copy link
Contributor Author

maumar commented Nov 26, 2024

re-openinig for potential patch

@maumar maumar reopened this Nov 26, 2024
maumar added a commit that referenced this issue Nov 26, 2024
…lateIncludeCollection rather than inline

Fixes #35212
Port of #35213

**Description**
In EF9 we changed the way we generate shapers in preparation for AOT scenarios. As part of these changes we started inlining some delegates passed to PopulateIncludeCollection (as well as couple other methods), rather than compiling them. For scenarios with large number of entities we see significant perf degradation when these delegates are inlined, as opposed to compiled (like we used to do in EF8). This change reverts to EF8 behavior.

**Customer impact**
Queries using collection navigation with significant amount of data suffer large performance degradation when compared with EF8. No good workaround.

**How found**
Multiple customer reports on 9.0.0.

**Regression**
Yes, from 8.0. Perf regression only, no functional regression here.

**Testing**
Ad-hoc perf testing with BenchmarkDotNet. Functional change already covered by numerous tests.

**Risk**
Low - essentially reverting back to EF8 behavior, quirk added.

**Benchmarks**

before the fix (but already with invoke fix and no interpretation)

|      Method | Async |     Mean |   Error |   StdDev |  Op/s |       Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|---------:|------:|-----------:|----------:|----------:|
| MultiInclue | False | 455.1 ms | 8.94 ms | 10.29 ms | 2.197 | 11000.0000 | 6000.0000 |  67.92 MB |
| MultiInclue |  True | 435.4 ms | 1.77 ms |  1.66 ms | 2.297 | 11000.0000 | 6000.0000 |  67.92 MB |

after the fix:

|      Method | Async |     Mean |   Error |  StdDev |  Op/s |      Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|--------:|------:|----------:|----------:|----------:|
| MultiInclue | False | 363.3 ms | 6.72 ms | 6.29 ms | 2.752 | 9000.0000 | 3000.0000 |  58.51 MB |
| MultiInclue |  True | 351.9 ms | 2.08 ms | 1.74 ms | 2.842 | 9000.0000 | 3000.0000 |  58.51 MB |

This gets us pretty close to the EF8 numbers which were:

|      Method | Async |     Mean |   Error |  StdDev |  Op/s |      Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|--------:|------:|----------:|----------:|----------:|
| MultiInclue | False | 297.1 ms | 1.47 ms | 1.30 ms | 3.365 | 8000.0000 | 6000.0000 |   52.4 MB |
| MultiInclue |  True | 290.2 ms | 3.76 ms | 3.52 ms | 3.446 | 8500.0000 | 6000.0000 |   52.4 MB |
@maumar maumar added this to the 9.0.x milestone Nov 27, 2024
maumar added a commit that referenced this issue Nov 27, 2024
…lateIncludeCollection rather than inline (#35217)

Fixes #35212
Port of #35213

**Description**
In EF9 we changed the way we generate shapers in preparation for AOT scenarios. As part of these changes we started inlining some delegates passed to PopulateIncludeCollection (as well as couple other methods), rather than compiling them. For scenarios with large number of entities we see significant perf degradation when these delegates are inlined, as opposed to compiled (like we used to do in EF8). This change reverts to EF8 behavior.

**Customer impact**
Queries using collection navigation with significant amount of data suffer large performance degradation when compared with EF8. No good workaround.

**How found**
Multiple customer reports on 9.0.0.

**Regression**
Yes, from 8.0. Perf regression only, no functional regression here.

**Testing**
Ad-hoc perf testing with BenchmarkDotNet. Functional change already covered by numerous tests.

**Risk**
Low - essentially reverting back to EF8 behavior, quirk added.

**Benchmarks**

before the fix (but already with invoke fix and no interpretation)

|      Method | Async |     Mean |   Error |   StdDev |  Op/s |       Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|---------:|------:|-----------:|----------:|----------:|
| MultiInclue | False | 455.1 ms | 8.94 ms | 10.29 ms | 2.197 | 11000.0000 | 6000.0000 |  67.92 MB |
| MultiInclue |  True | 435.4 ms | 1.77 ms |  1.66 ms | 2.297 | 11000.0000 | 6000.0000 |  67.92 MB |

after the fix:

|      Method | Async |     Mean |   Error |  StdDev |  Op/s |      Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|--------:|------:|----------:|----------:|----------:|
| MultiInclue | False | 363.3 ms | 6.72 ms | 6.29 ms | 2.752 | 9000.0000 | 3000.0000 |  58.51 MB |
| MultiInclue |  True | 351.9 ms | 2.08 ms | 1.74 ms | 2.842 | 9000.0000 | 3000.0000 |  58.51 MB |

This gets us pretty close to the EF8 numbers which were:

|      Method | Async |     Mean |   Error |  StdDev |  Op/s |      Gen0 |      Gen1 | Allocated |
|------------ |------ |---------:|--------:|--------:|------:|----------:|----------:|----------:|
| MultiInclue | False | 297.1 ms | 1.47 ms | 1.30 ms | 3.365 | 8000.0000 | 6000.0000 |   52.4 MB |
| MultiInclue |  True | 290.2 ms | 3.76 ms | 3.52 ms | 3.446 | 8500.0000 | 6000.0000 |   52.4 MB |
@maumar
Copy link
Contributor Author

maumar commented Nov 30, 2024

fixed for 9.0.1 - b7a436f

@maumar maumar closed this as completed Nov 30, 2024
@maumar maumar modified the milestones: 9.0.x, 9.0.1 Nov 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-perf area-query closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported preview-1 regression
Projects
None yet
1 participant