Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ratelimit: try_wait_n to acquire multiple tokens #129

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dgrr
Copy link

@dgrr dgrr commented Dec 13, 2024

Trying to consume multiple limits in a single call, and I need the call to either fail or succeed. I can't call try_wait multiple times because it could fail only on the last one, and still I want to be able to recover from the error.

@brayniac
Copy link
Contributor

I'm not sure that this is the right way to achieve this. An issue to discuss the needs may have been better.

My concern with this approach is that it seems that a mix of requests for small and large amounts of tokens can result in starvation of the requests for larger numbers of tokens - as the small requests will succeed and consume all the budget. I don't think we can merge and release this PR without giving some consideration to fairness concerns. There are several approaches that could be taken there and I think we'd need to weigh the options carefully, provide some mechanism for selecting a matching behavior, or simply deciding this is out-of-scope for this crate.

It would be helpful to know more about your use-case. There might be an easy way to use this crate without any changes. But eitherway, knowing more would help inform an appropriate design.

Thanks!

@dgrr
Copy link
Author

dgrr commented Dec 14, 2024

Hey, right, it could be out of the scope for the crate, that is fine, I already made my own impl of a bucket rate limiting, just wanted to share that if it could be useful to someone else.
The goal of this PR is to be used in somewhat constrained environments for clients. I work with APIs with scarce limits. Sometimes those limits are weighted, for example: requests to /expensive consume 10 tokens, whilst calls to /cheap consume 1.
Indeed multiple calls to /expensive could drain the tokens quickly, but the user should be able to balance that by themselves using next_refill and available parameters, etc...

@brayniac
Copy link
Contributor

brayniac commented Jan 6, 2025

@dgrr - for now, I'm inclined to decline this patch. I'm still concerned about calls to /cheap as being able to starve calls for /expensive when the token bucket contains few available tokens and refill amount is lower than the expensive call's token requirement. (Calls to acquire a single token will succeed and keep the available tokens below the expensive call's amount). We'd likely need to do something like fair queueing, or at least tracking partial acquisition until fulfilling the request for N tokens, or some other mechanism to prevent starvation here.

I think that this change could cause surprise to some users with mixed cost token acquisition and for now we are avoiding that by keeping the crate fairly simple. I think for now one could call try_wait multiple times and track how many more tokens they need to acquire and use some backoff strategy if try_wait fails. Yes, I can see the argument that this starvation scenario is a configuration issue, but it's a complication and potential for surprise that we are avoiding today.

I'm willing to reconsider this if there's enough demand or if we can do something more comprehensive to address fairness / starvation concerns.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants