Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub compatibility mode #998

Merged
merged 21 commits into from
Sep 16, 2021

Conversation

mauroservienti
Copy link
Member

@mauroservienti mauroservienti commented Sep 15, 2021

When using message-driven pub/sub compatibility mode, the transport might throw a rate exceeded due to AWS rate-limiting the usage of some APIs. Specifically ListTopics and ListSubscriptionsByTopic. This PR introduces a rate limiter to prevent AWS exceptions and an in-memory caching layer to minimize the rate limiter usage, which might impact performance.


Problem description

When message-driven pub/sub compatibility mode is enabled, the transport needs to check if the message it's trying to dispatch is a unicast message with a Publish intent, and the subscriber is also subscribed via SNS. If this is the case, the transport doesn't dispatch the message twice; the subscriber will receive it via SNS and not via a unicast send. To check if, in this scenario, subscribers are subscribed via SNS, the transport uses the FindSubscriptionsByTopic API to list all the subscriptions to a topic and retrieve the destination queues.

ListSubscriptionsByTopic is rate limited to 30 requests per second, under high throughput, the limit is very quickly exceeded as an endpoint in the hybrid mode performs the check for every message published.

The problem is particularly evident when the environment is composed of mixed subscribers (both native and message-driven). A similar problem happens if the environment is composed mostly or only of message-driven subscribers, but message-driven pub/sub compatibility mode is enabled on publishers. This is the case when the migration to native pub/sub is about to start. Publishers, when publishing, still needs to check the SNS topology; however, in this second scenario, there'll be no topics (or very few topics), meaning that the transport will continuously try to look for non-existent topics. The API that gets rate limited is ListTopics (rate limited to 30 requests per second).

Implemented solution

This PR introduces an in-memory rate limiter that limits calls, respectively, to the ListTopics and ListSubscriptionsByTopic APIs to prevent rate exceeded exceptions. The rate limiter behaves as a bottleneck and causes some performance degradation. To avoid performance issues, the transport caches ListTopics and ListSubscriptionsByTopic results, by default, for 5 seconds (configuration options are available to tweak the values). It's essential to notice that the transport caches null results too; otherwise, there are scenarios in which the transport will continuously query the AWS API anyway. For example, if an API call to ListTopics returns no results, the not found result (null) is cached. The transport assumes that the topic doesn't exist for the next 5 seconds (by default). The same behavior applies to subscriptions lookup using the ListSubscriptionsByTopic API. If a subscription doesn't exist at the query time, the result is cached as null for the configured time to live.

When using one of the rate-limited APIs, the transport first checks the cache for any cached value. If nothing is available in the cache, it goes through the rate limiter to prevent any rate exceeded exception. However, before hitting the AWS API, it rechecks the cache. The second (inside the rate limiter) check is required to guarantee that when publishing events in a tight loop, every publish attempt hits the AWS API even if the previous one cached values.


Replaces #933, fix #866

mauroservienti and others added 20 commits September 15, 2021 10:53
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
Co-Authored-By: Tomasz Masternak <1092707+tmasternak@users.noreply.github.com>
Co-Authored-By: Dennis van der Stelt <dvdstelt@gmail.com>
@mauroservienti mauroservienti added this to the 5.4.0 milestone Sep 15, 2021
@mauroservienti mauroservienti changed the title Use a rate limiter to fix rate exceed exception when using pub/sub hybrid mode Use a rate limiter to fix rate exceeded exception when using pub/sub hybrid mode Sep 15, 2021
@mauroservienti mauroservienti linked an issue Sep 15, 2021 that may be closed by this pull request
@mauroservienti mauroservienti changed the title Use a rate limiter to fix rate exceeded exception when using pub/sub hybrid mode Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub hybrid mode Sep 15, 2021
Copy link
Member

@tmasternak tmasternak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed typo fixes. Apart from that LGTM

@tmasternak tmasternak merged commit 03f9a61 into release-5.3 Sep 16, 2021
@tmasternak tmasternak deleted the fix-rate-exceeded-exception branch September 16, 2021 10:57
@tmasternak tmasternak added the Bug label Sep 16, 2021
@mauroservienti mauroservienti changed the title Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub hybrid mode Use a rate limiter and in-memory cache to fix rate exceeded exception when using pub/sub compatibility mode Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Exception From Send (Rate Exceeded)
3 participants