
[Query Insights] Capture query-level resource usage metrics #12399

Closed
ansjcy opened this issue Feb 20, 2024 · 12 comments · Fixed by #13172
Assignees
Labels
enhancement Enhancement or improvement to existing feature or request Search:Query Insights v2.15.0 Issues and PRs related to version 2.15.0

Comments

@ansjcy
Member

ansjcy commented Feb 20, 2024

Is your feature request related to a problem? Please describe

The resource tracking framework (#1179) tracks task-level resource usage, such as CPU and memory utilization. However, there is a gap in inferring query-level resource usage from the resource tracking framework. We need to come up with a solution for it, since query-level resource usage would be one of the most important metrics for query insights (#11429) features like top N queries (#11186) and cost estimations (#12390).

Describe the solution you'd like

The most challenging task here is how to propagate the task-level resource usage information to the coordinator node for calculating query-level resource usage. The most straightforward solution is to piggyback the resource usage data as part of the SearchPhaseResult node response and use SearchRequestOperationsListener::onPhaseEnd to extract this information from the phase results and forward it to the query insights framework. However, this approach has limitations as the obtained resource usage data may not be entirely accurate. The reason is explained below.
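A minimal sketch of what the coordinator-side extraction could look like (the listener, result, and collector types below are illustrative placeholders, not the actual OpenSearch classes):

```java
// Hypothetical sketch: a SearchRequestOperationsListener-style hook on the coordinator
// that pulls piggybacked per-shard resource usage out of the phase results and forwards
// it to the query insights collector. PhaseResultWithUsage and QueryInsightsCollector
// are placeholders, not real OpenSearch classes.
import java.util.List;

final class ResourceUsagePhaseListener {

    // Placeholder for whatever per-shard result object carries the piggybacked usage.
    record PhaseResultWithUsage(String shardId, long cpuNanos, long memoryBytes) {}

    private final QueryInsightsCollector collector;

    ResourceUsagePhaseListener(QueryInsightsCollector collector) {
        this.collector = collector;
    }

    // Called when a search phase (query/fetch/...) finishes on the coordinator.
    void onPhaseEnd(String queryId, List<PhaseResultWithUsage> shardResults) {
        long totalCpuNanos = 0;
        long totalMemoryBytes = 0;
        for (PhaseResultWithUsage result : shardResults) {
            totalCpuNanos += result.cpuNanos();
            totalMemoryBytes += result.memoryBytes();
        }
        // Accumulate per-phase totals into the query-level record for this query.
        collector.addPhaseUsage(queryId, totalCpuNanos, totalMemoryBytes);
    }

    // Placeholder sink; the real query insights plugin would expose its own API.
    interface QueryInsightsCollector {
        void addPhaseUsage(String queryId, long cpuNanos, long memoryBytes);
    }
}
```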

Here's the workflow of a search request and resource tracking: the coordinator node sends requests to the data nodes, and the data nodes create tasks to run the search on their shards. On a data node,

  1. The data node creates tasks and starts request tracking using the resource tracking framework;
  2. The data node sends back the SearchPhaseResult to the coordinator node;
  3. The data node stops the task and stops request tracking, and records the final resource utilization in the resource tracking framework.

If we want to piggyback the resource utilization data in SearchPhaseResult, we must retrieve this data before the task is considered "finished." Through some experiments and analysis, I found that reading the resource utilization data before the second step can result in up to ~10% lower CPU and memory utilization compared to the final actual usage. Since the results are not fully accurate, the data would be of little use beyond roughly analyzing the overall usage trend and making "relative" resource usage comparisons between two queries - we won't be able to use this data to make reliable query cost estimations.

Related component

Search:Query Insights

Describe alternatives you've considered

Another approach is to implement an asynchronous post-processor as part of the query insights data consumption pipeline. This post-processor would periodically gather data from data nodes and correlate it with queries to calculate the final resource usage accurately. While this method ensures the most accurate resource usage data, it comes with the overhead of introducing a periodic job running in the background to collect and share the data between nodes. We need to consider the trade-offs when deciding on the best approach for capturing query-level resource usage.
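A minimal sketch of what such a background post-processor could look like (the client and record types are illustrative placeholders, not actual OpenSearch APIs):

```java
// Hypothetical sketch of the asynchronous post-processor alternative: a periodic job
// that pulls finished-task resource usage from each data node and correlates it with
// query ids. TaskUsageClient and the record types are illustrative placeholders.
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class ResourceUsagePostProcessor implements AutoCloseable {

    record TaskUsage(String queryId, long cpuNanos, long memoryBytes) {}

    /** Placeholder client that fetches completed-task usage from one data node. */
    interface TaskUsageClient {
        List<TaskUsage> fetchCompletedTaskUsage(String nodeId);
    }

    private final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    private final Map<String, long[]> usageByQuery = new ConcurrentHashMap<>();
    private final TaskUsageClient client;
    private final List<String> dataNodeIds;

    ResourceUsagePostProcessor(TaskUsageClient client, List<String> dataNodeIds, long intervalSeconds) {
        this.client = client;
        this.dataNodeIds = dataNodeIds;
        scheduler.scheduleAtFixedRate(this::collectOnce, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
    }

    // One collection round: ask every data node for finished-task usage and fold it into
    // the per-query totals. Accuracy is high because the tasks have already finished.
    private void collectOnce() {
        for (String nodeId : dataNodeIds) {
            for (TaskUsage usage : client.fetchCompletedTaskUsage(nodeId)) {
                usageByQuery.compute(usage.queryId(), (id, totals) -> {
                    long[] t = (totals == null) ? new long[2] : totals;
                    t[0] += usage.cpuNanos();
                    t[1] += usage.memoryBytes();
                    return t;
                });
            }
        }
    }

    /** Returns the aggregated [cpuNanos, memoryBytes] totals recorded so far for a query. */
    long[] usageFor(String queryId) {
        return usageByQuery.getOrDefault(queryId, new long[2]);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```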

Additional context

@ansjcy ansjcy added enhancement Enhancement or improvement to existing feature or request untriaged labels Feb 20, 2024
@ansjcy ansjcy changed the title [Query Insights] Capture query level resource usage metrics [Query Insights] Capture query-level resource usage metrics Feb 20, 2024
@ansjcy ansjcy self-assigned this Feb 20, 2024
@peternied
Member

[Triage]
@ansjcy Thanks for filing

@ansjcy
Member Author

ansjcy commented Feb 24, 2024

Draft for the proposed solution: #12449

  • Instrument resource usage before a task finishes on a data node; more specifically, read the resource usage at the last step (serialization) before a phase response is sent back from the query/fetch/... phase (a rough sketch follows this list).
  • Piggyback the resource usage data with the phase results.
  • Gather the resource usage for all shard search tasks on the coordinator node (in the query insights plugin) to get the query-level resource usage.
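A rough data-node-side sketch of the first bullet, assuming a placeholder reader for the per-task resource counters (not the actual resource tracking framework API):

```java
// Hypothetical sketch: read the task's accumulated resource usage at the last possible
// moment (just before the phase response is serialized) and attach it to the outgoing
// result. ShardPhaseResult and TaskResourceReader stand in for the real SearchPhaseResult
// plumbing and the resource tracking framework.
final class PhaseResultUsageAttacher {

    /** Placeholder for the per-task resource counters maintained by the tracking framework. */
    interface TaskResourceReader {
        long currentCpuNanos(long taskId);
        long currentMemoryBytes(long taskId);
    }

    /** Placeholder for the shard-level phase result sent back to the coordinator. */
    static final class ShardPhaseResult {
        long taskId;
        long piggybackedCpuNanos;
        long piggybackedMemoryBytes;
    }

    private final TaskResourceReader reader;

    PhaseResultUsageAttacher(TaskResourceReader reader) {
        this.reader = reader;
    }

    // Called right before serialization; the readings are a lower bound because the task
    // keeps running (and consuming resources) until after the response has been sent.
    void attachUsage(ShardPhaseResult result) {
        result.piggybackedCpuNanos = reader.currentCpuNanos(result.taskId);
        result.piggybackedMemoryBytes = reader.currentMemoryBytes(result.taskId);
    }
}
```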

@deshsidd
Contributor

@ansjcy Thanks for the above solution and the proposed alternative.

Did we get some numbers on how much lower the CPU and memory measurements would be with the proposed approach?

In the proposed approach, it seems like if we stop the task and resource tracking before sending the result back to the coordinator node, we will end up increasing the overall search latency, which is undesirable.

For the alternative solution, can we piggyback on any other background jobs that periodically share data with other nodes?
I am interested in the alternative approach if the overhead is not too drastic.

@sgup432
Contributor

sgup432 commented Feb 26, 2024

@ansjcy
I had a few questions:

  1. How are we going to determine/calculate the top N queries at a coordinator/request level? Considering a request fans out to multiple nodes, where a query can consume (for example) 70% on one data node, 60% on another data node, and 20% on the coordinator node, how are we planning to make sense of these numbers?
  2. As you mentioned, task resource tracking does run after a shard-level response has been sent. That seems to happen here. To consider the sync approach, i.e. piggybacking along with the response, have we checked whether it is feasible to attach the resourceTask listener in a way that it runs before the response is returned? Otherwise it may be inaccurate, as you mentioned, and it might not make sense.
  3. I think the background approach might be better. Considering we have a coordinator-node-level parent taskId and multiple associated child taskIds (shard level), can we store all this info on each node and then aggregate it later by using these relationships? (A rough sketch of such an aggregation is below.)
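A rough sketch of the aggregation suggested in point 3 (illustrative types only, not OpenSearch code):

```java
// Each node keeps a record of finished child-task usage keyed by the coordinator-level
// parent task id; a later pass sums the children to get the per-query (parent) usage.
import java.util.HashMap;
import java.util.Map;

final class ParentTaskUsageAggregator {

    record ChildTaskUsage(long parentTaskId, long childTaskId, long cpuNanos, long memoryBytes) {}
    record QueryUsage(long cpuNanos, long memoryBytes) {}

    // Sums per-child usage into a per-parent (per-query) total.
    static Map<Long, QueryUsage> aggregate(Iterable<ChildTaskUsage> childUsages) {
        Map<Long, QueryUsage> byParent = new HashMap<>();
        for (ChildTaskUsage child : childUsages) {
            byParent.merge(
                child.parentTaskId(),
                new QueryUsage(child.cpuNanos(), child.memoryBytes()),
                (a, b) -> new QueryUsage(a.cpuNanos() + b.cpuNanos(), a.memoryBytes() + b.memoryBytes())
            );
        }
        return byParent;
    }
}
```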

@ansjcy
Member Author

ansjcy commented Feb 27, 2024

Another draft for the background job approach: #12473
With this approach we can get accurate resource usage measurements for each task from the cluster manager node.

@ansjcy
Member Author

ansjcy commented Feb 27, 2024

Thanks for the comments!

Did we get some numbers on how much lower the CPU and memory measurements would be with the proposed approach?

Based on my measurements of that approach (#12449), the measured result can be ~10% lower than the actual result.

In the proposed approach, it seems like if we stop the task and resource tracking before sending the result back to the coordinator node, we will end up increasing the overall search latency, which is undesirable.

Yes, this is another concern I have. But the latency impact should be negligible since we are just reading one value and appending a new field, both of which are O(1) operations. That said, we need to do more benchmarking to understand the performance impact.

For the alternative solution, can we piggyback on any other background jobs that periodically share data with other nodes?
I am interested in the alternative approach if the overhead is not too drastic.

Yes, this would be a benefit of this approach - we can add customized data in the future without worrying about impacting the search requests.

How are we going to determine/calculate the top N queries at a coordinator/request level? Considering a request fans out to multiple nodes, where a query can consume (for example) 70% on one data node, 60% on another data node, and 20% on the coordinator node, how are we planning to make sense of these numbers?

As I mentioned in the description, there are 2 ways to do this: we either piggyback the resource usage from each node in SearchPhaseResult before the task finishes, or have a background job periodically sync the task-level resource usage. Either way, we need to sum up the resource usage from each node to get the query-level resource usage.

As you mentioned, task resource tracking does run after a shard-level response has been sent. That seems to happen here. To consider the sync approach, i.e. piggybacking along with the response, have we checked whether it is feasible to attach the resourceTask listener in a way that it runs before the response is returned? Otherwise it may be inaccurate, as you mentioned, and it might not make sense.

A task is considered finished only after the response is sent to the coordinator node. The first approach mentioned above actually implements "attach the resourceTask listener in a way that it runs before the response is returned": it reads the thread-level resource usage at the end of the shard search request (during serialization) and returns it to the coordinator node. Whether we attach another resourceTask listener or not, it won't change the fact that this is not an accurate measurement of the search task's resource usage.

I think the background approach might be better. Considering we have a coordinator-node-level parent taskId and multiple associated child taskIds (shard level), can we store all this info on each node and then aggregate it later by using these relationships?

Agreed! I have a draft for this approach, exactly as you described :) #12473
The resource usage might be higher for the second approach, and it might add a burden to the cluster manager node, but it would be much more straightforward and wouldn't have any potential impact on the search workflow.

@sgup432
Contributor

sgup432 commented Feb 27, 2024

@ansjcy

Either way, we need to sum up the resource usage from each node to get the query-level resource usage.

I had a doubt about this, i.e. whether a simple sum is the right way to portray the top N expensive queries.

Consider two queries (on a 2-node cluster):

1st query:
The query consumes 30% on the coordinator node, and when it is fanned out to 2 nodes, it consumes 30% on each of them. Let's say this query requires another hop to both nodes (say max_concurrent_shard_requests is set to 1) and again consumes 30% on each. So it consumes (30 + 60*2) = 150% overall.

2nd query:
The query consumes 20% on the coordinator node, but consumes 100% on one of the data nodes. Overall: 120%.

The 2nd query consumed less CPU overall but had more impact, as it took down one of the data nodes. So if a user asks for the top expensive query, should we still return the 1st query? That doesn't seem right.

@reta
Collaborator

reta commented Feb 28, 2024

@ansjcy what do you think about an approach that separates tasks and queries? We do track task resource utilization, and there is a pretty clear link between a query (or better to say, a search) and the tasks its execution spawns across the nodes. What if we capture the task information separately (as it is now) but make it available after the query finishes and the response is returned to the user (so the completion times will be captured accurately)? Yes, that could very likely be a separate call (or, as with explain, we could ask the coordinator to consolidate task tracking for the query).

@ansjcy
Member Author

ansjcy commented Mar 4, 2024

@sgup432

The 2nd query consumed less CPU overall but had more impact, as it took down one of the data nodes. So if a user asks for the top expensive query, should we still return the 1st query? That doesn't seem right.

Deciding "which query is more expensive" is out of scope for this issue. This issue focuses only on "how to get the query-level resource usage (CPU, memory) metrics". It could be a future improvement of the top N queries feature - we can come up with a "scoring" mechanism to evaluate which query is more expensive based on multiple metrics like latency, resource usage, the indices/shards involved, etc.

@ansjcy
Member Author

ansjcy commented Mar 4, 2024

@reta Thanks for the comment! Yes, in fact the implementation in this draft PR (#12473) is similar to what you described. We capture and store the task-level resource usage in the query insights plugin after the tasks finish, and consolidate it on the cluster manager node. It would be nice if you could also take a look at the draft when you have time :)

@sgup432
Contributor

sgup432 commented Mar 5, 2024

@ansjcy
Sure. But I think calculating a score (or resource estimation) will be important for making sense of the data for the user and taking decisions if needed.
Maybe you can expose an interface with the default implementation being a simple sum. Eventually we can plug in any other kind of implementation later if needed.
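A minimal sketch of such an interface (names are illustrative only), with the default simple-sum strategy and a hypothetical max-per-node alternative side by side:

```java
// Hypothetical pluggable scoring interface: the default implementation is a plain sum of
// per-node usage, but other strategies (e.g. hottest single node, latency-weighted) could
// be plugged in later without changing the collection pipeline.
import java.util.List;

interface QueryCostScorer {

    /** Per-node usage observed for one query. */
    record NodeUsage(String nodeId, double cpuPercent, long memoryBytes) {}

    double score(List<NodeUsage> perNodeUsage);

    /** Default strategy: simple sum of CPU across all involved nodes. */
    QueryCostScorer SIMPLE_SUM = usages -> usages.stream()
        .mapToDouble(NodeUsage::cpuPercent)
        .sum();

    /** Alternative strategy: the hottest single node, which would rank the 2nd query above higher. */
    QueryCostScorer MAX_PER_NODE = usages -> usages.stream()
        .mapToDouble(NodeUsage::cpuPercent)
        .max()
        .orElse(0.0);
}
```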

@reta
Collaborator

reta commented Mar 8, 2024

@sgup432 I think having per-query resource estimation would be a tremendously useful feature, but it is not easy to implement. If you have a viable proposal, please share it; otherwise we sadly piggyback on the resource tracking approaches all the time ...
