-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub compatibility mode #998
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com> Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
mauroservienti
changed the title
Use a rate limiter to fix rate exceed exception when using pub/sub hybrid mode
Use a rate limiter to fix rate exceeded exception when using pub/sub hybrid mode
Sep 15, 2021
mauroservienti
changed the title
Use a rate limiter to fix rate exceeded exception when using pub/sub hybrid mode
Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub hybrid mode
Sep 15, 2021
tmasternak
approved these changes
Sep 16, 2021
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed typo fixes. Apart from that LGTM
mauroservienti
changed the title
Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub hybrid mode
Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub compatibility mode
Sep 16, 2021
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
When using message-driven pub/sub compatibility mode, the transport might throw a
rate exceeded
due to AWS rate-limiting the usage of some APIs. SpecificallyListTopics
andListSubscriptionsByTopic
. This PR introduces a rate limiter to prevent AWS exceptions and an in-memory caching layer to minimize the rate limiter usage, which might impact performance.Problem description
When message-driven pub/sub compatibility mode is enabled, the transport needs to check if the message it's trying to dispatch is a unicast message with a
Publish
intent, and the subscriber is also subscribed via SNS. If this is the case, the transport doesn't dispatch the message twice; the subscriber will receive it via SNS and not via a unicast send. To check if, in this scenario, subscribers are subscribed via SNS, the transport uses theFindSubscriptionsByTopic
API to list all the subscriptions to a topic and retrieve the destination queues.ListSubscriptionsByTopic
is rate limited to 30 requests per second, under high throughput, the limit is very quickly exceeded as an endpoint in the hybrid mode performs the check for every message published.The problem is particularly evident when the environment is composed of mixed subscribers (both native and message-driven). A similar problem happens if the environment is composed mostly or only of message-driven subscribers, but message-driven pub/sub compatibility mode is enabled on publishers. This is the case when the migration to native pub/sub is about to start. Publishers, when publishing, still needs to check the SNS topology; however, in this second scenario, there'll be no topics (or very few topics), meaning that the transport will continuously try to look for non-existent topics. The API that gets rate limited is
ListTopics
(rate limited to 30 requests per second).Implemented solution
This PR introduces an in-memory rate limiter that limits calls, respectively, to the
ListTopics
andListSubscriptionsByTopic
APIs to prevent rate exceeded exceptions. The rate limiter behaves as a bottleneck and causes some performance degradation. To avoid performance issues, the transport cachesListTopics
andListSubscriptionsByTopic
results, by default, for 5 seconds (configuration options are available to tweak the values). It's essential to notice that the transport cachesnull
results too; otherwise, there are scenarios in which the transport will continuously query the AWS API anyway. For example, if an API call toListTopics
returns no results, the not found result (null
) is cached. The transport assumes that the topic doesn't exist for the next 5 seconds (by default). The same behavior applies to subscriptions lookup using theListSubscriptionsByTopic
API. If a subscription doesn't exist at the query time, the result is cached asnull
for the configured time to live.When using one of the rate-limited APIs, the transport first checks the cache for any cached value. If nothing is available in the cache, it goes through the rate limiter to prevent any rate exceeded exception. However, before hitting the AWS API, it rechecks the cache. The second (inside the rate limiter) check is required to guarantee that when publishing events in a tight loop, every publish attempt hits the AWS API even if the previous one cached values.
Replaces #933, fix #866