Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rate limiting can be incorrectly applied to callables with no limit set #2627

Closed
dgw opened this issue Oct 9, 2024 · 3 comments · Fixed by #2629
Closed

Rate limiting can be incorrectly applied to callables with no limit set #2627

dgw opened this issue Oct 9, 2024 · 3 comments · Fixed by #2629
Assignees
Labels
Bug Things to squish; generally used for issues
Milestone

Comments

@dgw
Copy link
Member

dgw commented Oct 9, 2024

I was trying to figure out why I could get multiple URLs to trigger the same plugin callable, but other people in the channel couldn't.

Turns out, I'm a bot admin, and I'm exempt from rate limiting. The others are not.

Digging deeper, I went to bot.Sopel.rate_limit_info() and added some debug logging to the if/elif/elif/else block:

        if user_metrics.is_limited(at_time - rule.user_rate_limit):
            template = rule.user_rate_template
            rate_limit_type = "user"
            rate_limit = rule.user_rate_limit
            metrics = user_metrics
            LOGGER.debug(
                'user_metrics limited %s from running %s due to rate limit of %r',
                trigger.nick, rule.get_rule_label(), rate_limit)
        elif is_channel and channel_metrics.is_limited(at_time - rule.channel_rate_limit):
            template = rule.channel_rate_template
            rate_limit_type = "channel"
            rate_limit = rule.channel_rate_limit
            metrics = channel_metrics
            LOGGER.debug(
                'channel_metrics limited %s from running %s due to rate limit of %r',
                trigger.nick, rule.get_rule_label(), rate_limit)
        elif global_metrics.is_limited(at_time - rule.global_rate_limit):
            template = rule.global_rate_template
            rate_limit_type = "global"
            rate_limit = rule.global_rate_limit
            metrics = global_metrics
            LOGGER.debug(
                'global_metrics limited %s from running %s due to rate limit of %r',
                trigger.nick, rule.get_rule_label(), rate_limit)
        else:
            return False, None

Sure enough, if I joined a test channel as my plebeian alter ego and sent three links that would trigger this callable, only one of them triggered the plugin, and two debug messages got logged:

[2024-10-09 10:52:52,833] sopel.bot            DEBUG    - user_metrics limited totallynotdgw from running danbooru_url
                                                          due to rate limit of datetime.timedelta(0)
[2024-10-09 10:52:52,833] sopel.bot            DEBUG    - user_metrics limited totallynotdgw from running danbooru_url
                                                          due to rate limit of datetime.timedelta(0)

That seems pretty obviously wrong. For reference, the plugin callable's decorators are:

@plugin.url(r'<regex omitted, not relevant>')
@plugin.output_prefix(OUTPUT_PREFIX)
def danbooru_url(bot, trigger):
    ...

No rate limits to be seen anywhere, so why is this getting blocked? That's the mystery to solve.

Version notes

Tested with 17d46b7 (something else I'm working on fixing), but should still reproduce identically on current master or v8.0.0 tag/PyPI release 8.0.0.

@dgw dgw added Bug Things to squish; generally used for issues Needs Triage Issues that need to be reviewed and categorized labels Oct 9, 2024
@dgw
Copy link
Member Author

dgw commented Oct 10, 2024

Did some bisecting, with git bisect good d84ffb277a13f459f9a600f8a625612b6f809ac2 (where Sopel does still respond to multiple links in the same message from a non-admin user) and git bisect bad master. Bisecting pointed to* c280fc4 as the cause of this specific behavior. That's from #2434, which passes the "could this be related?" vibe test.

It turns out, the RuleMetrics object is Kind Of Dumb, Actually and has no idea whether its associated Rule even has a rate limit. I still haven't figured out why it works before, though. RuleMetrics itself could be a red herring; previous code used Rule.is_user_rate_limited() and friends instead of calling RuleMetrics.is_limited() directly, but those methods on the Rule were and are just wrappers around the relevant RuleMetrics methods.

Another thing that's changed, and might be relevant, is that Rule.is_user_rate_limited() used to take just one argument, the nick. Now it takes two, the nick and the at_time, which is expected to be populated by trigger.time at the time of use. Sopel.rate_limit_info() uses the trigger.time when calling each RuleMetrics.is_limited() flavor, in fact.

I've done some more playing around with debug prints in various places, inspecting the control flow both at master and d84ffb2. The bot is clearly considering both URLs separately, so that's not a problem; and the at_time in rate_limit_info() is identical for both, so that isn't directly a problem. I'm well past the amount of time I actually had to spend on this today (whoops), but I'm very suspicious of the RuleMetrics.last_time parameter:

  • If the time_limit is before the last_time, the rule is considered as rate-limited, regardless of whether it's a separate dispatch or the same dispatch, and regardless of whether the rule even has a rate limit.
  • RuleMetrics.started_at can be set when RuleMetrics.ended_at isn't, and RuleMetrics.started_at will be after the time_limit because that's simply how the logic plays out. The time_limit is either the server-time of when the Trigger came in, or the timestamp when Sopel started processing it. RuleMetrics.started_at, if it's set already from a different trigger within the same message, will necessarily be after the time_limit because Sopel starts handlers after receiving the message. (No duh?)

The main issue, I think, is that Sopel tries to rate-limit callables that aren't supposed to be rate-limited. I've gotten lost in the little flowers at the base of one single tree in a forest of thousands, metaphorically speaking, but the Sopelunk through some of this logic was interesting…

* — I had to git bisect skip 9c90af5 because it threw an error about comparing aware & naive datetimes, so perhaps the intermediate state should have been squashed out of the PR history. Oops.

@dgw
Copy link
Member Author

dgw commented Oct 10, 2024

Had a good chat about this with @Exirel just now, and I think we're on the way to solving this. Things did indeed get a little muddled during development of #2434 but we can definitely un-muddle them! Starting with using the Rule's checks for whether it's rate-limited, instead of asking the RuleMetrics directly.

@Exirel Exirel removed the Needs Triage Issues that need to be reviewed and categorized label Oct 12, 2024
@Exirel Exirel added this to the 8.0.1 milestone Oct 12, 2024
@Exirel
Copy link
Contributor

Exirel commented Oct 12, 2024

Found the issue and the fix. See the related PR!

@dgw dgw closed this as completed in #2629 Oct 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Things to squish; generally used for issues
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants