
--active-only expensive when many queues are active, add caching option #41

Closed
ryanwitt opened this issue Jun 25, 2019 · 0 comments
After switching to --active-only on jobs that have a large number of dynamic queues, we noticed that we started spending a lot of money on GetQueueAttributes calls:

(Screenshot, 2019-06-25: cost spike from GetQueueAttributes calls)

This makes sense when comparing the --active-only API call complexity with the base case: when a is high, so is the number of calls:

| Context | Calls | Details |
| --- | --- | --- |
| qdone worker (while listening, per listen round) | n + (1 per n×w) | w: `--wait-time` in seconds; n: number of queues |
| qdone worker (while listening with `--active-only`, per round) | 2n + (1 per a×w) | w: `--wait-time` in seconds; a: number of active queues |

However the state of the active queues is very cacheable, especially if queues tend to have large backlogs, as ours do.

I propose we add three options:

- `--cache-url` that takes a `redis://...` cluster URL [no default]
- `--cache-ttl-seconds` that takes a number of seconds [default: 10]
- `--cache-prefix` that defines a cache key prefix [default: `qdone`]

The presence of the --cache-url option will cause the worker to cache the GetQueueAttributes result for each queue for the specified TTL. We can probably use MGET to batch the lookups, if we're careful about key slots (in Redis Cluster, all keys in a single MGET must hash to the same slot, e.g. by sharing a hash tag).
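A minimal sketch of the proposed cache, under stated assumptions: the helper names (`getQueueAttributesCached`, `fetchAttributes`) are hypothetical, and an in-memory Map stands in for the Redis backend here; a real worker would issue MGET against the `--cache-url` cluster instead.

```javascript
// Hypothetical sketch of the proposed GetQueueAttributes cache.
// A Map stands in for Redis; a real implementation would use a
// Redis client and MGET (with hash-tagged keys in cluster mode).

const CACHE_TTL_SECONDS = 10 // --cache-ttl-seconds default
const CACHE_PREFIX = 'qdone' // --cache-prefix default

const store = new Map() // key -> { value, expiresAt }

function cacheGet (key) {
  const entry = store.get(key)
  if (!entry || entry.expiresAt < Date.now()) return undefined
  return entry.value
}

function cacheSet (key, value, ttlSeconds) {
  store.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 })
}

// fetchAttributes is the uncached call (the real SQS GetQueueAttributes);
// it only runs on a cache miss, so repeated polls within one TTL window
// cost a single API call per queue.
async function getQueueAttributesCached (queueUrl, fetchAttributes) {
  const key = `${CACHE_PREFIX}:qa:${queueUrl}`
  const cached = cacheGet(key)
  if (cached !== undefined) return cached
  const attributes = await fetchAttributes(queueUrl)
  cacheSet(key, attributes, CACHE_TTL_SECONDS)
  return attributes
}
```

With a shared Redis backend instead of the per-process Map, every worker polling the same queues within one TTL window would collapse to one GetQueueAttributes call per queue, which is the cost reduction this issue is after.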
