
Query priority #5605

Merged: 49 commits merged into cortexproject:master on Nov 30, 2023

Conversation

@justinjung04 (Contributor) commented Oct 17, 2023

What this PR does:

Currently, each tenant has a single FIFO request queue in the query path, and the queriers assigned to the tenant pick up the next request from that queue.

Having a single FIFO queue results in situations where:

  1. Slow queries exhaust all queriers and their concurrency
  2. Fast (or more important) queries get queued up, waiting for the next querier to pick them up
  3. If the slow queries take 30 seconds, the fast (or more important) queries now have 30 seconds added to their latency
  4. If the slow queries time out, the fast (or more important) queries are also likely to time out in the queue (without even reaching a querier)


This PR proposes introducing a new "query priority" feature for each tenant that does the following:

  1. Priorities can be assigned to queries that match configured query attributes (regex, start time, end time)
  2. When query priority is enabled, schedulers use a priority queue that orders the requests by priority
  3. You can also have "reserved" queriers that only pick up requests with a certain priority. This is useful when you want to guarantee that a certain number of queriers is available to handle high-priority requests at any given time (a minimal sketch of this dequeue behaviour is included below).
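
To illustrate the reserved-querier behaviour, here is a minimal Go sketch; the names (queuedRequest, shouldTake, reservedPriority) are hypothetical and not the actual Cortex code:

    // Hypothetical sketch (not the actual Cortex implementation) of the
    // dequeue decision for a querier.
    package queue

    type queuedRequest struct {
        priority int64
    }

    // shouldTake reports whether a querier should dequeue the head request.
    // A regular querier takes whatever is at the head of the priority-ordered
    // queue; a reserved querier only takes requests with the priority it
    // reserves and otherwise keeps waiting.
    func shouldTake(head *queuedRequest, reserved bool, reservedPriority int64) bool {
        if head == nil {
            return false // nothing queued
        }
        if reserved && head.priority != reservedPriority {
            return false // reserved querier keeps waiting for its priority
        }
        return true
    }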


Here's an example configuration:

  <tenant-id>:
    query_priority:
      enabled: true
      default_priority: 0
      priorities:
        - priority: 3
          reserved_queriers: 1
          query_attributes:
            - regex: "special_query"
        - priority: 2
          query_attributes:
            - regex: "max"
              time_window:
                start: 2h
                end: 0h
        - priority: 1
          query_attributes:
            - time_window:
                start: 24h
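
To make the matching concrete, here is a minimal Go sketch of how a query could be mapped to a priority from a config like the one above. The types and function names (queryAttribute, priorityDef, matchPriority) are hypothetical, not the actual Cortex implementation:

    package main

    import (
        "regexp"
        "time"
    )

    // queryAttribute is a hypothetical stand-in for one entry under query_attributes.
    type queryAttribute struct {
        regex *regexp.Regexp // matched against the query string; nil means "any query"
        start time.Duration  // how far back the time window starts (e.g. 24h); 0 means unbounded
        end   time.Duration  // how far back the time window ends (e.g. 0h means "now")
    }

    type priorityDef struct {
        priority   int64
        attributes []queryAttribute
    }

    // matchPriority returns the priority of the first definition whose attributes
    // match the query string and its evaluation time, or the default priority.
    func matchPriority(query string, ts, now time.Time, defs []priorityDef, defaultPriority int64) int64 {
        for _, def := range defs {
            for _, attr := range def.attributes {
                if attr.regex != nil && !attr.regex.MatchString(query) {
                    continue
                }
                if attr.start > 0 && ts.Before(now.Add(-attr.start)) {
                    continue // older than the start of the window
                }
                if ts.After(now.Add(-attr.end)) {
                    continue // newer than the end of the window
                }
                return def.priority
            }
        }
        return defaultPriority
    }

    func main() {
        defs := []priorityDef{
            {priority: 2, attributes: []queryAttribute{{regex: regexp.MustCompile("max"), start: 2 * time.Hour}}},
            {priority: 1, attributes: []queryAttribute{{start: 24 * time.Hour}}},
        }
        now := time.Now()
        p := matchPriority("max({__name__=~'metric_[0-4]'})", now, now, defs, 0)
        _ = p // p == 2: matches the "max" regex and falls inside the 2h window
    }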

Which issue(s) this PR fixes:
n/a

End-to-end test result:

There were four things to verify in the end-to-end test:

  • Confirm high priority requests get dequeued more than lower priority requests
  • Confirm reserved queriers do not pick up low priority requests
  • Confirm the tenant limit config change gets picked up by query frontend and scheduler without the pods restarting
  • Check memory and cpu usage

In short, all of them were verified successfully, with some notable observations:

  • P50 and average latency dropped significantly, since my test assigned higher priority to ingester queries (data within 24h)
  • The number of timeouts decreased, at the cost of more 429s and a higher queue length (queue length grew because low-priority queries were not handled in time, which can be addressed by scaling up queriers in a real-world scenario)
  • Memory usage of the QFE is better when query priority is enabled
  • Memory usage of the scheduler is slightly worse when query priority is enabled, but the difference is negligible

Image 1: Metrics with new labels showing a smooth transition between the FIFO and priority queues.

Image 2: Change in QPS per status code and latency.

Image 3: Change in CPU and memory usage.

Image 4: One reserved querier waiting for a priority 3 query that was never enqueued.

Test setup:

The following queries ran continuously:

    query?query=max({__name__=~'metric_[0-4]'})&time=0h
    query?query=avg({__name__=~'metric_[1-5]'})&time=-12h
    query?query=min({__name__=~'metric_[2-6]'})&time=-32h

Query priority config used:

    query_priority:
      enabled: true
      default_priority: 0
      priorities:
        - priority: 3
          reserved_queriers: 1
          query_attributes:
            - regex: "special_query"
        - priority: 2
          query_attributes:
            - regex: "max"
              time_window:
                start: 2h
                end: 0h
        - priority: 1
          query_attributes:
            - time_window:
                start: 24h
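
With this config, the three test queries above would presumably map to priorities 2 (the max query at time=0h, matching the "max" regex within the 2h window), 1 (the avg query at -12h, within the 24h window), and 0 (the min query at -32h, outside all configured windows).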

Timeline:

14:00 | Applied new image that contains this PR
14:50 | Enabled query priority config
15:45 | Disabled query priority config without removing the priority definitions

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@justinjung04 justinjung04 force-pushed the scheduler-improvement branch 2 times, most recently from 9a9275c to 5a57717 on October 17, 2023 21:09
@justinjung04 justinjung04 marked this pull request as ready for review October 19, 2023 04:26
@justinjung04 (Contributor, Author)

Since this is a big change, I'm running a test in my dev environment as well. Once I have the results, I'll share them here.

@yeya24 (Contributor) commented Oct 22, 2023

Can you please add an E2E test case for this feature? It would be good to show that the reserved querier only processes high-priority queries.

Review comments (outdated, resolved):

  • pkg/scheduler/schedulerpb/scheduler.proto
  • pkg/util/query/priority_test.go
  • pkg/scheduler/queue/queue.go (×2)
  • pkg/scheduler/queue/user_queues.go
  • pkg/frontend/v2/frontend.go (×2)
  • pkg/util/validation/limits.go
@alanprot (Member)

I know we are using a chan as a queue now, but why not use a real priority queue instead of 2 queues, so we can have any number of priorities (instead of just HIGH and LOW)?

I would also try to have some kind of heuristic to set this priority; setting it manually does not seem scalable.
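
For reference, a minimal sketch (assuming a hypothetical request type, not the actual Cortex code) of a heap-based priority queue in Go using container/heap, which supports any number of priority levels rather than just HIGH and LOW:

    package main

    import "container/heap"

    // request is a hypothetical queued item; higher Priority values dequeue first.
    type request struct {
        Priority int64
    }

    // priorityQueue implements heap.Interface over *request.
    type priorityQueue []*request

    func (pq priorityQueue) Len() int            { return len(pq) }
    func (pq priorityQueue) Less(i, j int) bool  { return pq[i].Priority > pq[j].Priority }
    func (pq priorityQueue) Swap(i, j int)       { pq[i], pq[j] = pq[j], pq[i] }
    func (pq *priorityQueue) Push(x interface{}) { *pq = append(*pq, x.(*request)) }
    func (pq *priorityQueue) Pop() interface{} {
        old := *pq
        n := len(old)
        item := old[n-1]
        *pq = old[:n-1]
        return item
    }

    func main() {
        pq := &priorityQueue{}
        heap.Push(pq, &request{Priority: 1})
        heap.Push(pq, &request{Priority: 3})
        heap.Push(pq, &request{Priority: 2})
        highest := heap.Pop(pq).(*request) // highest.Priority == 3
        _ = highest
    }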

@justinjung04 justinjung04 marked this pull request as draft November 1, 2023 18:08
@justinjung04 justinjung04 changed the title High priority query queue Priority query Nov 7, 2023
@justinjung04 justinjung04 changed the title Priority query Query priority Nov 8, 2023
@justinjung04 justinjung04 marked this pull request as ready for review November 10, 2023 20:50
@justinjung04 (Contributor, Author)

Major upgrade completed. @yeya24 and @alanprot, could you take a look?

Meanwhile, I'll run a beta test to check the following:

  • Confirm high priority requests get dequeued more than lower priority requests
  • Confirm reserved queriers do not pick up low priority requests
  • Confirm the tenant limit config change gets picked up by query frontend and scheduler without the pods restarting
  • Check if memory usage changes significantly when using the priority queue

@justinjung04 justinjung04 force-pushed the scheduler-improvement branch 2 times, most recently from 5d0746f to ededdc2 on November 15, 2023 12:59
@justinjung04 (Contributor, Author) commented Nov 15, 2023

E2E testing completed and the results look awesome. I've added them to the PR description.

        queue       queue
        lengthGauge prometheus.Gauge
    }

    // Op is an operation on the priority queue.
    type Op interface {
        Key() string
@justinjung04 (Contributor, Author) commented Nov 15, 2023:

Generalizing PriorityQueue by removing the Key and duplicate key check. Confirmed that this code is not used elsewhere in the codebase.

@yeya24 (Contributor) left a comment:

Thanks. Overall I think the PR is in the right direction and in good shape. I have added some comments.

Review comments (outdated, resolved):

  • pkg/util/query/priority.go (×3)
  • pkg/scheduler/queue/user_request_queue.go (×2)
  • pkg/frontend/v2/frontend.go
  • pkg/frontend/v2/frontend_test.go
  • pkg/util/validation/limits.go (×2)
  • pkg/frontend/transport/roundtripper.go
Review comments:

  • pkg/util/validation/limits.go (outdated, resolved)
  • pkg/frontend/transport/handler.go (outdated, resolved)
  • pkg/querier/tripperware/instantquery/instant_query.go (outdated, resolved)
  • pkg/frontend/transport/handler.go (resolved)
@damnever (Contributor)

We should also consider this scenario: a large number of high-priority requests could potentially cause lower-priority ones to time out.

@justinjung04 (Contributor, Author) commented Nov 29, 2023

> We should also consider this scenario: a large number of high-priority requests could potentially cause lower-priority ones to time out.

Yes, I intended this PR to be a preliminary feature that assumes a small list of priorities and scaling up queriers if too many queries are timing out.

I agree that this feature should be improved by introducing fairness logic between different priorities, but I wanted to keep the initial PR simple, since the code change involved in introducing priority itself is already big enough.

@yeya24 (Contributor) left a comment:

LGTM! Awesome work.

@justinjung04 (Contributor, Author)
Performed another end-to-end test, and things are still working as expected.

@yeya24 yeya24 merged commit e85a331 into cortexproject:master Nov 30, 2023
14 checks passed