Improve Execution Performance #1392

Merged
merged 57 commits into Azure:main on Mar 1, 2024

Conversation

michaelstaib
Collaborator

@michaelstaib commented Mar 30, 2023

(note: descriptions that are quoted were initially authored by GitHub Copilot and modified for brevity, correctness, and conciseness. @seantleonard)

Why this change?

Take advantage of Hot Chocolate's execution pipeline to improve execution of pre-acquired data. Without this change, there are inefficiencies in JsonDocument object allocation and disposal, and in GraphQL result processing in the ResolverMiddleware class.

What is this change?

  • Introduction of ResolverTypeInterceptor.cs class

  • Introduction of ExecutionHelper.cs class

  • Introduction of ArrayPoolWriter.cs class

The ArrayPoolWriter.cs file provides a way to manage and interact with pooled arrays, mainly byte arrays, in a more efficient and controlled manner. Here are some key benefits:

Memory Efficiency: It uses ArrayPool<byte>, a shared pool of byte arrays. This is an efficient way to handle memory because it reuses arrays, thereby reducing the frequency of garbage collection and the memory footprint of the application.

Buffer Management: The class ArrayPoolWriter implements the IBufferWriter<byte> interface, providing a standard way to write to the buffer. It also implements the IDisposable interface, ensuring proper cleanup of resources.

Exception Handling: The class includes robust error handling, throwing exceptions when methods are used incorrectly, such as attempting to advance the buffer past its capacity or trying to use the writer after it has been disposed.

Dynamic Buffer Expansion: If the required capacity exceeds the current buffer's capacity, it automatically rents a larger buffer from the pool, copies existing data, and returns the original buffer, ensuring that the buffer size can grow as needed.

Access Flexibility: It provides methods to access written data as both ReadOnlyMemory<byte> and ReadOnlySpan<byte>, offering flexible ways to interact with the data.

This class can be beneficial in scenarios where you anticipate writing to a byte buffer repeatedly or where the amount of data to write may grow over time.
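
As a rough sketch of the pattern (not the merged ArrayPoolWriter.cs itself; the class and member names below are hypothetical), an IBufferWriter<byte> backed by ArrayPool<byte> looks roughly like this:

```csharp
using System;
using System.Buffers;

// Minimal sketch of a pooled IBufferWriter<byte>; sizes and names are illustrative.
public sealed class PooledByteWriter : IBufferWriter<byte>, IDisposable
{
    private byte[] _buffer = ArrayPool<byte>.Shared.Rent(512);
    private int _written;
    private bool _disposed;

    public ReadOnlyMemory<byte> GetWrittenMemory() => _buffer.AsMemory(0, _written);
    public ReadOnlySpan<byte> GetWrittenSpan() => _buffer.AsSpan(0, _written);

    public void Advance(int count)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(PooledByteWriter));
        if (count < 0 || _written + count > _buffer.Length)
            throw new ArgumentOutOfRangeException(nameof(count));
        _written += count;
    }

    public Memory<byte> GetMemory(int sizeHint = 0)
    {
        EnsureCapacity(sizeHint);
        return _buffer.AsMemory(_written);
    }

    public Span<byte> GetSpan(int sizeHint = 0)
    {
        EnsureCapacity(sizeHint);
        return _buffer.AsSpan(_written);
    }

    private void EnsureCapacity(int sizeHint)
    {
        if (_disposed) throw new ObjectDisposedException(nameof(PooledByteWriter));
        if (sizeHint < 1) sizeHint = 1;
        if (_buffer.Length - _written >= sizeHint) return;

        // Rent a larger array, copy the written bytes, and return the old array to the pool.
        byte[] newBuffer = ArrayPool<byte>.Shared.Rent(Math.Max(_buffer.Length * 2, _written + sizeHint));
        _buffer.AsSpan(0, _written).CopyTo(newBuffer);
        ArrayPool<byte>.Shared.Return(_buffer);
        _buffer = newBuffer;
    }

    public void Dispose()
    {
        if (_disposed) return;
        ArrayPool<byte>.Shared.Return(_buffer);
        _buffer = Array.Empty<byte>();
        _disposed = true;
    }
}
```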

  • JsonObjectExtensions.cs

Easy Conversion: Provides a straightforward way to convert JsonObject instances to JsonElement or JsonDocument.

Memory Optimization: The conversion process leverages the ArrayPoolWriter for writing the JsonObject to a pooled buffer, which avoids the need for serializing to a full JSON string. This can potentially save memory and improve performance.
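
A minimal sketch of that conversion path, assuming a pooled writer like the one sketched above (the real JsonObjectExtensions.cs may differ in shape):

```csharp
using System.Text.Json;
using System.Text.Json.Nodes;

public static class JsonObjectConversionSketch
{
    // Serialize the JsonObject into a pooled UTF-8 buffer and parse it back,
    // avoiding an intermediate JSON string allocation.
    public static JsonElement ToElementSketch(this JsonObject obj)
    {
        using var buffer = new PooledByteWriter();              // pooled IBufferWriter<byte> from the sketch above
        using (var jsonWriter = new Utf8JsonWriter(buffer))
        {
            obj.WriteTo(jsonWriter);                            // write UTF-8 bytes straight into the pooled buffer
        }

        using JsonDocument doc = JsonDocument.Parse(buffer.GetWrittenMemory());
        return doc.RootElement.Clone();                         // Clone detaches the element from the pooled buffer
    }
}
```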

  • SqlQueryEngine updates

The changes in the SqlQueryEngine.cs file can be summarized as follows:

The ResolveInnerObject method has been renamed to ResolveObject. The new method handles the case where the JsonElement is of JsonValueKind.String type and requires parsing into a JSON object. It also handles the case where the JsonElement is of JsonValueKind.Object type, directly returning the element.

The ResolveListType method has been renamed to ResolveList. It has been refactored to handle two types of JsonElement: JsonValueKind.Array and JsonValueKind.String. It deserializes both types into a list of JsonElements. The method now also checks and handles the case where metadata is not null.

The public JsonDocument? ResolveInnerObject(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) method signature has been changed to public JsonElement ResolveObject(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) to account for no longer passing back and forth JsonDocument objects.

The public object? ResolveListType(JsonElement element, IObjectField fieldSchema, ref IMetadata metadata) method signature has been changed to public object ResolveList(JsonElement array, IObjectField fieldSchema, ref IMetadata? metadata) to account for no longer passing back and forth JsonDocument objects.

The ResolveObject and ResolveList methods now have detailed summary comments explaining their functionality.
An extra check for parentMetadata.Subqueries.TryGetValue(QueryBuilder.PAGINATION_FIELD_NAME, out PaginationMetadata? paginationObjectMetadata) has been added in the ResolveObject method. If the lookup succeeds, parentMetadata is updated with paginationObjectMetadata.

Overall, these changes focus on handling the cases where JsonElement is in the form of a string and needs to be parsed into a JSON object or array, as well as improving the metadata handling.
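
To make the string-versus-structured handling concrete, here is a simplified sketch of the two code paths; the method shapes are illustrative and omit the IMetadata bookkeeping done in the real SqlQueryEngine:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class JsonResolutionSketch
{
    // Object case: nested JSON may arrive as serialized text and needs a parse;
    // an element that is already an object is returned as-is.
    public static JsonElement ResolveObjectSketch(JsonElement element)
    {
        if (element.ValueKind == JsonValueKind.String)
        {
            using JsonDocument parsed = JsonDocument.Parse(element.GetString()!);
            return parsed.RootElement.Clone();                  // Clone outlives the temporary document
        }

        return element;                                         // JsonValueKind.Object: returned directly
    }

    // List case: deserialize either the string payload or the array element
    // into a list of JsonElements.
    public static List<JsonElement> ResolveListSketch(JsonElement element) =>
        element.ValueKind switch
        {
            JsonValueKind.String => JsonSerializer.Deserialize<List<JsonElement>>(element.GetString()!)!,
            JsonValueKind.Array => JsonSerializer.Deserialize<List<JsonElement>>(element)!,
            _ => throw new InvalidOperationException($"Unexpected JsonValueKind: {element.ValueKind}")
        };
}
```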

ExecutionHelper.cs

The ExecutionHelper class contains methods that are primarily used for interacting with the query engine and mutation engine to execute queries and mutations, respectively.

ExecuteQueryAsync(...): Represents the root query resolver and fetches initial data from the query engine. It accepts a context parameter and uses the query engine to execute the query. If the selection type is a list, it will execute a list query, register a cleanup action to dispose the documents after the query, and set the result. If not, it will execute a single item query and set the result.

ExecuteMutateAsync(...): Represents the root mutation resolver and invokes the mutation on the query engine. Similar to the ExecuteQueryAsync method, it handles both list and single item mutations.

The ExecuteLeafField, ExecuteObjectField, and ExecuteListField methods are what HotChocolate calls "pure resolvers" and are used to resolve the results of specific types of fields in the GraphQL request.

The GetMetadata, GetMetadataObjectField, SetNewMetadata, and SetNewMetadataChildren methods handle storing and retrieving SqlPaginationMetadata objects used at each level of depth in the path of a GraphQL result.
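
Conceptually, the per-depth metadata handling amounts to a lookup keyed by the field path of the result, with list indices skipped so every list item shares its parent field's metadata. The following is a standalone sketch of that idea only; the FieldPath record and store below are stand-ins, not HotChocolate's Path type or the actual ExecutionHelper code:

```csharp
using System.Collections.Generic;

// Stand-in for a result path segment: either a named field (Name set) or a list index (Index set).
public sealed record FieldPath(FieldPath? Parent, string? Name = null, int? Index = null);

public sealed class PaginationMetadataStoreSketch<TMetadata> where TMetadata : class
{
    private readonly Dictionary<string, TMetadata> _store = new();

    // Build the key recursively from the root, skipping index segments so that
    // "books[0].publisher" and "books[3].publisher" resolve to the same metadata entry.
    public static string BuildKey(FieldPath? path) =>
        path switch
        {
            null => string.Empty,
            { Name: null } => BuildKey(path.Parent),            // index segment contributes nothing
            _ => $"{BuildKey(path.Parent)}::{path.Name}"
        };

    public void Set(FieldPath path, TMetadata metadata) => _store[BuildKey(path)] = metadata;

    public bool TryGet(FieldPath path, out TMetadata? metadata) =>
        _store.TryGetValue(BuildKey(path), out metadata);
}
```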

  • Added a fix for the DW JSON result differing from the MSSQL JSON result. DW includes extraneous escape characters in the JSON, which results in JsonDocument/JsonElement not resolving to the expected JsonValueKind and requires an additional JsonDocument.Parse operation to remediate (see the illustration below).
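
Illustration of the shape difference with hypothetical payloads, shown only to make the remediation concrete:

```csharp
using System;
using System.Text.Json;

// MSSQL-style result: the nested list is already structured JSON.
//   { "books": [ { "id": 1 } ] }      -> "books" has JsonValueKind.Array
// DW-style result: the same payload arrives as an escaped JSON string.
//   { "books": "[{\"id\":1}]" }       -> "books" has JsonValueKind.String
using JsonDocument dwRow = JsonDocument.Parse("{\"books\":\"[{\\\"id\\\":1}]\"}");
JsonElement books = dwRow.RootElement.GetProperty("books");

// The extra JsonDocument.Parse turns the string payload back into a real array.
JsonElement list = books.ValueKind == JsonValueKind.String
    ? JsonDocument.Parse(books.GetString()!).RootElement       // document disposal elided for brevity
    : books;

Console.WriteLine(list.ValueKind);                             // Array
```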

Testing

  • Unit testing added for the ArrayPoolWriter functionality. This was adopted from HotChocolate code.
  • GraphQL integration tests that already existed were used to ensure the refactor maintained the ability to handle the request scenarios we test for. Tests that initially failed helped validate SqlPaginationMetadata usage, storage, and creation within the PureResolver context. The following test classes are directly applicable for testing query execution through the PureResolver code paths established in ExecutionHelper:
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLFilterTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLMutationTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLPaginationTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLQueryTests
    • Azure.DataApiBuilder.Service.Tests.SqlTests.GraphQLSupportedTypesTests

@michaelstaib
Collaborator Author

@Aniruddh25 it's not finished but it shows a bit of the direction. Before I go any further, let's discuss these changes.

It already removes a lot of allocations from your execution. JsonObjects are no longer copied around during execution. Instead, in the root resolver we register the JsonDocuments for cleanup with the execution engine. Once execution is done, the execution pipeline runs the registered cleanup tasks.

Further, I have split the middleware into two middleware components and three resolvers. The work of figuring out when to execute what is done once at schema build time. With version 14 we can do that at C# build time when compiling AOT.

For child fields I am now using pure resolvers as opposed to an async middleware. This will allow the execution engine to fold these into the parent resolver and remove pressure from the task execution and memory pools.

There is more work needed in the query engine, as JSON is copied around in a lot of places where it should not be.

What do you think?

@seantleonard
Contributor

@michaelstaib Thank you so much for your time and the work you've put into this so far! I'll take a look with @Aniruddh25 and then we can discuss with you.
This might not be until Monday morning (GMT-7), but I wanted to acknowledge that I've seen this and will be addressing it.

@Aniruddh25
Contributor

Aniruddh25 commented Apr 4, 2023

@michaelstaib So far, the changes look good to me; I left a few comments. Thanks for catching the unnecessary JSON copying issue.

Yes, we can discuss the changes over a Teams call. What time works best for you? (I'm in UTC-7)



@michaelstaib
Collaborator Author

@Aniruddh25 I have one last thing to rework:
SqlPaginationUtil.CreatePaginationConnectionFromJsonElement(element, currentMetadata);

Once this works with the new structure, everything should work again. I have several other things that can be simplified and optimized ... but let's get this over the line first.

Can you ping me on Teams and we can plan a meeting time there?

@michaelstaib marked this pull request as ready for review April 11, 2023 18:50
@Aniruddh25
Contributor

/azp run


Azure Pipelines successfully started running 2 pipeline(s).

…for two types of jsonelements received in processing list types. Elaborate on CosmosDB non-usage of IMetadata. Update nullability of IMetadata return and ref types to prevent usage of null forgiving operator. make metadatakey generation recursive for improved readability.
@seantleonard
Contributor

/azp run


Azure Pipelines successfully started running 2 pipeline(s).

…has extraneous escape characters in the json which results in JsonDocument/Element not resolving the expected JsonValueKind, and requires additional JsonDocument.Parse operation to remediate.
@seantleonard
Contributor

/azp run


Azure Pipelines successfully started running 2 pipeline(s).

@michaelstaib
Collaborator Author

Uh... it looks like this is finally coming together! Great job everyone.

@seantleonard
Contributor

Uh... it looks like this is finally coming together! Great job everyone.

Yep, last little change and can merge in :)

Contributor

@Aniruddh25 left a comment


Looks good to merge, thanks for these impactful contributions!

…y is not found and doesn't return null. As a result, we don't need to make StringValues nullable.
@seantleonard
Contributor

/azp run


Azure Pipelines successfully started running 2 pipeline(s).

Contributor

@ayush3797 left a comment


LGTM!

@seantleonard dismissed their stale review March 1, 2024 18:39

Dismissed as my comments were resolved and addressed in my follow-up commits.

@seantleonard
Contributor

/azp run


Azure Pipelines successfully started running 2 pipeline(s).

@seantleonard merged commit 1dd4b9d into Azure:main Mar 1, 2024
9 checks passed
seantleonard added a commit that referenced this pull request Mar 1, 2024
seantleonard added a commit that referenced this pull request Mar 4, 2024