uniqueness does not work at scale #446
Comments
I'm going to look into this, but although we can speed it up, I'm a bit worried that it'll be hard to get to something that works well for you: it sounds like your app is fundamentally churning through so much work that a very busy DB will be somewhat inevitable.
Opened #451. Should make unique insertions something like 20-45x faster as long as you stay within the default set of unique states.
Super excited; we are on this happy path, so I expect it to speed up our scheduling by a lot. @bgentry, this is maybe a little off topic, but how do you recommend people do long-term job metrics? Do you think we should write something that reads the river jobs table and exports Prometheus metrics (like a river-prometheus-exporter), or instrument our workers the same way we instrument tracing (wrapping work functions with tracing instrumentation)? I'm not too sure which of these was the vision you had for River, so we haven't made a move here yet.
My 100% recommendation is to instrument the workers or use the client subscriptions to do this kind of metrics work. As your job table grows, scanning it in any way other than via the exact queries used by River is going to have severe performance impacts.
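A minimal sketch of the subscription approach, assuming River's `Client.Subscribe` API and the standard Prometheus Go client (the metric name and labels here are illustrative, not anything River ships):

```go
package main

import (
	"github.com/jackc/pgx/v5"
	"github.com/prometheus/client_golang/prometheus"
	"github.com/riverqueue/river"
)

// startJobMetrics feeds a River client's event stream into a Prometheus
// counter. The returned cancel func stops the subscription.
func startJobMetrics(riverClient *river.Client[pgx.Tx]) func() {
	jobsFinished := prometheus.NewCounterVec(prometheus.CounterOpts{
		Name: "river_jobs_finished_total", // illustrative name
		Help: "River jobs that reached a finished state, by kind and state.",
	}, []string{"kind", "state"})
	prometheus.MustRegister(jobsFinished)

	// Subscribe to terminal job events only; this never scans the
	// river_job table, so it stays cheap as the table grows.
	events, cancel := riverClient.Subscribe(
		river.EventKindJobCompleted,
		river.EventKindJobFailed,
	)

	go func() {
		for event := range events {
			jobsFinished.WithLabelValues(
				event.Job.Kind,
				string(event.Job.State),
			).Inc()
		}
	}()

	return cancel
}
```

Because the counters come from in-process events rather than table scans, this keeps working no matter how large the job table gets.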
This issue is a follow-up to #346, where @brandur gave me this recommendation.
However, this solution currently results in hundreds of requests per second across our clusters, which causes a lot of extra load across all the job logic plus the notifier.
More importantly, we have roughly 200-400k unique units of work every hour or so, but we would really like these things to be done every 15 minutes. Without a uniqueness filter, this schedules millions of units of work every hour. They do end up getting deduplicated at work time, but at the expense of large amounts of DB work that slows down other calculations and other routines, which causes a vicious cycle: more jobs fail to complete, and more jobs pile up.
A side effect is that the few places where we do schedule unique jobs become very slow, so we basically can't use the unique feature in any jobs without fear of those scheduling operations taking multiple seconds because of all the activity going on in the jobs table.
We could move River to a separate Postgres cluster, but at that point we would migrate away from River, because the advantage of it running in the same database as our data would be gone.
For now we are likely going to implement our own hooks on top of the existing River client, using InsertTx to avoid scheduling tasks when we don't need to (see the sketch below), but this really feels like a weakness of River's unique insert feature. I'm still not really sure who it's for, since it can't scale to any reasonable throughput, and it is also missing a good number of features that come standard in other work queues (the most obvious that comes to mind is uniqueness on a subset of args).
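For concreteness, a rough sketch of that kind of hook; the `job_dedupe` table, `insertUniqueTx`, and `dedupeKey` are hypothetical application-side helpers, and only `InsertTx` itself is River's API:

```go
package main

import (
	"context"
	"fmt"
	"time"

	"github.com/jackc/pgx/v5"
	"github.com/riverqueue/river"
)

// insertUniqueTx claims a caller-chosen key in an app-owned table
// (hypothetical schema: job_dedupe(key text PRIMARY KEY)) and inserts the
// job only if the claim succeeds. Both statements run on the same tx, so
// the claim and the insert commit or roll back together. Old rows in
// job_dedupe would need periodic pruning.
func insertUniqueTx(
	ctx context.Context,
	client *river.Client[pgx.Tx],
	tx pgx.Tx,
	key string,
	args river.JobArgs,
) error {
	tag, err := tx.Exec(ctx,
		`INSERT INTO job_dedupe (key) VALUES ($1) ON CONFLICT (key) DO NOTHING`,
		key,
	)
	if err != nil {
		return err
	}
	if tag.RowsAffected() == 0 {
		return nil // this unit of work is already scheduled for this window
	}
	_, err = client.InsertTx(ctx, tx, args, nil)
	return err
}

// dedupeKey buckets a unit of work into 15-minute windows so the same unit
// can be scheduled again in the next window (900s = 15m).
func dedupeKey(unitID string, now time.Time) string {
	return fmt.Sprintf("%s:%d", unitID, now.Unix()/900)
}
```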
It would be really nice if there were some sort of uniqueness mechanism that didn't use advisory locks. For instance, a nullable unique column in the jobs table with a user-definable ID on input immediately comes to mind (sketched below). This would allow me to de-duplicate tasks by a subset of arguments plus a time interval/sequence ID, which is more than enough for me.
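At the schema level, that proposal could look something like this (hypothetical column and index names, not River's actual schema):

```sql
-- Jobs inserted with unique_key = NULL opt out of uniqueness entirely.
ALTER TABLE river_job ADD COLUMN unique_key text;

CREATE UNIQUE INDEX river_job_unique_key_idx
    ON river_job (unique_key)
    WHERE unique_key IS NOT NULL;

-- Insertion then dedupes on a caller-computed key, e.g. a subset of args
-- plus a time bucket, with no advisory locks involved:
--   INSERT INTO river_job (..., unique_key)
--   VALUES (..., 'user-123:window-42')
--   ON CONFLICT (unique_key) DO NOTHING;
```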