Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

dmitshur · 2016-04-22T10:25:11Z

This (minor) optimization is possible because the client keeps track of the rate limits from previous API calls, and GitHub's rate limiting model is known in advance, and is predictable.

In situations where it's possible to predict that the network API call will result in a *RateLimitError, simply return it immediately instead of making the network API call to GitHub's servers. This is both faster for the client (by around 9700 times in my benchmarks) because it avoids the network round-trip latency, and better for GitHub's servers that won't need to needlessly reject network requests.

Note that this optimization only kicks in when the client is already exceeding the rate limit and it hasn't been reset yet. It has no effect during most normal usage when rate limit is not exceeded.

Benchmarks

I wrote a small benchmark (it assumes you've already exhausted the GitHub API rate limit before you begin):

func BenchmarkClientDo_rateLimitError(b *testing.B) {
    gh := github.NewClient(nil)
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        _, _, err := gh.Repositories.Get("shurcooL", "vfsgen")
        if _, ok := err.(*github.RateLimitError); !ok {
            log.Println("no *github.RateLimitError error, but expected it")
        }
    }
}

Results before this PR (with a latency to api.github.com of around 85 ms):

$ go test -bench=. -benchmem
testing: warning: no tests to run
PASS
BenchmarkClientDo_rateLimitError-8        20      97057828 ns/op       55922 B/op        126 allocs/op
ok          2.609s

Results after this PR:

go test -bench=. -benchmem
testing: warning: no tests to run
PASS
BenchmarkClientDo_rateLimitError-8    200000         10005 ns/op        2763 B/op         34 allocs/op
ok          3.576s

In this limited scenario, it's a speedup by 9700 times.

Helps #152 and #153.

gmlewis · 2016-04-22T16:28:12Z

github/github.go

+// category returns the rate limit category of the endpoint, determined by Request.URL.Path.
+func category(path string) rateLimitCategory {
+	switch {
+	default:


Curious... why do you list default: first? I don't remember seeing it that way before.

It's a style I've iterated towards after using switch statements more for multiplexing across multiple (usually non-overlapping) branching decisions in Go. I prioritize the order of cases, returning coreCategory and searchCategory, to be more consistent with how they're defined, over the conditions that lead to those return values.

I read it as:

For all paths, return coreCategory, except when it has /search/ prefix, then return searchCategory.

I personally like it because I think the cases themselves are more important than the conditions to trigger them, but I know this is a less common style. If people feel this is worse, I can revert to a more traditional style of sorting by conditions (putting default last).

For other people reading, I want to note that the order of case expressions in Go is important... they are evaluated left-to-right and top-to-bottom and the first one that matches wins (except for default which wins when no others match).
https://golang.org/ref/spec#Switch_statements

Keeping it this way is fine with me.

dmitshur · 2016-04-24T07:59:09Z

@gmlewis, I've addressed your current feedback, PTAL.

gmlewis · 2016-04-24T16:08:44Z

Thanks, @shurcooL! LGTM 👍
Awaiting review by @willnorris

See google/go-github#347.

dmitshur · 2016-04-30T04:51:37Z

@willnorris Should I take the longer time to review to mean you're not as confident about this change and not sure how to respond to it, or have you simply not gotten around to it yet?

This change, at a high level, is very much in line with what you discussed/proposed in these two comments:

#152 (comment)
#153 (comment)

I think the high level design is solid.

About the implementation, I actually took a lot of time to come up with how to achieve the desired behavior, because I knew there were more than 1 sets of rate limits, depending on the route. But I think the final solution I came up with is pretty decent and works. But of course I welcome any potential suggestions for how to make the implementation even cleaner.

Let me know if I can do anything to help here.

willnorris · 2016-05-02T17:25:43Z

just a case of me being out of town when the PR came in, and then busy with other things. Looking at it now.

willnorris · 2016-05-02T17:41:34Z

github/github.go

@@ -323,7 +324,7 @@ func parseRate(r *http.Response) Rate {
 // current rate.
 func (c *Client) Rate() Rate {
 	c.rateMu.Lock()
-	rate := c.rate
+	rate := c.rateLimits[c.mostRecent]


this seems potentially problematic, since if a single client is being used for both search and non-search calls, the "most recent category" could get switched out from under you by another go routine before you have a chance to call Rate. What if Rate always returns the core rate (like it did before), and we provide a different mechanism to access to other rates? Or alternatively, we just instruct callers to pull that rate off the response? checkRateLimitBeforeDo would still handle both categories as it does currently.

actually yeah, we don't really need to provide another mechanism... we already instruct users to call client.RateLimits for the most up to date values. That should be sufficient, no?

if a single client is being used for both search and non-search calls, the "most recent category" could get switched out from under you by another go routine before you have a chance to call Rate.

If I'm not mistaken, that cannot happen. Both rateLimits and mostRecent are in a critical section protected by rateMu mutex. That mutex is very important to prevent this problem. Is that not the case?

Oh, I think I see what you meant. You weren't referring to a race condition in the code, but a higher-level "logical" race condition. E.g., a client makes a core category API call and then calls Rate expecting to see rate limit for that category, but may instead get rate limit for the search category.

I agree that's potentially bad and should be considered. I'll think about this and get back to you later.

One observation is that each API call already returns a Response which contains the rate limit from that very API call. So there's really no need for client to call Rate. You mentioned that in "we just instruct callers to pull that rate off the response".

What if Rate always returns the core rate (like it did before)

Now that I think about it more, are you sure that's what Rate currently does? Doesn't it already always return the latest rate limit from either of two categories, whichever was called last? In other words, this PR might be currently preserving previous behavior.

If you're sure that the current Rate only contains core category rate limits, can you show me where that happens and what prevents a call to search from updating rate?

It looks to me that search API calls would update it also. See here:

go-github/github/github.go

Lines 350 to 352 in 81ea1e5

c.rateMu.Lock()

c.rate = response.Rate

c.rateMu.Unlock()

go-github/github/search.go

Line 157 in 81ea1e5

return s.client.Do(req, result)

yep, looks like you're right... search is already updating the client rate, which leads to this same problem. Maybe we should just deprecate this method entirely since it's likely to cause problems, and just instruct people to use the rate limit on the response, or to call client.RateLimits? What do you think?

Maybe we should just deprecate this method entirely since it's likely to cause problems, and just instruct people to use the rate limit on the response, or to call client.RateLimits? What do you think?

I am absolutely in favor of doing that, given those 2 better alternatives exist, and the current behavior is pretty unreliable (it can give you core or search rate limit, effectively randomly).

Sounds good. Could you go ahead and update the doc for this method then as part of this PR?

Done in 13c2701. (I've also squashed the previous commits that were implementing code review suggestions to have a clean, logical history.)

In GitHub API, there are currently two categories of rate limits: core and search. Previously, the client only tracked the rate limit from the most recent API call. This change makes it track both rate limit categories. This will be useful in the following commit.

When possible to do so reliably, i.e., the client knows that the rate limit is exceeded and reset time is in the future, immediately return *RateLimitError without making network calls to GitHub API. Add a unit test covering RateLimits method populating client.rateLimits. Remove commented out code. Helps #152 and #153.

Client.Rate() method is unreliable in situations when API calls are made by others that may hit different rate limit categories (core and search). Each API call already returns a Response struct that contains an accurate Rate, so that's a better mechanism to access this information. See relevant discussion at #347 (comment).

willnorris · 2016-05-04T15:27:28Z

looks great, thanks!

dmitshur · 2016-05-04T15:48:57Z

Awesome, thanks! It's not every day one gets a chance to submit a PR with a 9700x perf improvement. :P

This is a followup to #555. mostRecent was created specifically to support Rate method in #347. That method is now gone (removed in #555), so mostRecent is unused and can be safely removed.

This is a followup to google#555. mostRecent was created specifically to support Rate method in google#347. That method is now gone (removed in google#555), so mostRecent is unused and can be safely removed.

gmlewis reviewed Apr 22, 2016
View reviewed changes

dmitshur added a commit to shurcooL/play that referenced this pull request Apr 25, 2016

Benchmark for predictive *RateLimitError for go-github.

fa28e46

See google/go-github#347.

willnorris reviewed May 2, 2016
View reviewed changes

dmitshur added 3 commits May 3, 2016 14:59

willnorris merged commit 13c2701 into google:master May 4, 2016

dmitshur deleted the predictive-RateLimitError branch May 4, 2016 15:45

dmitshur mentioned this pull request Jun 28, 2016

Optimization opportunity to reuse service struct allocations in github.Client #389

Closed

dmitshur mentioned this pull request Sep 21, 2016

Where is the best layer to add throttling to avoid being marked for abuse? #431

Closed

dmitshur mentioned this pull request Feb 18, 2017

Remove unused Client.mostRecent. #559

Merged

dmitshur mentioned this pull request Mar 4, 2017

Return rate limit information immediately after checkRateLimitBeforeDo #572

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

dmitshur commented Apr 22, 2016 •

edited

Loading

gmlewis Apr 22, 2016

dmitshur Apr 22, 2016 •

edited

Loading

gmlewis Apr 22, 2016

dmitshur commented Apr 24, 2016

gmlewis commented Apr 24, 2016

dmitshur commented Apr 30, 2016

willnorris commented May 2, 2016

willnorris May 2, 2016

willnorris May 2, 2016

dmitshur May 2, 2016

dmitshur May 2, 2016 •

edited

Loading

dmitshur May 2, 2016

willnorris May 3, 2016

dmitshur May 3, 2016

willnorris May 3, 2016

dmitshur May 3, 2016 •

edited

Loading

willnorris commented May 4, 2016

dmitshur commented May 4, 2016 •

edited

Loading

Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

Predict *RateLimitError, return immediately without network call (9700x speedup when rate limit exceeded). #347

Conversation

dmitshur commented Apr 22, 2016 • edited Loading

Benchmarks

Choose a reason for hiding this comment

dmitshur Apr 22, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmitshur commented Apr 24, 2016

gmlewis commented Apr 24, 2016

dmitshur commented Apr 30, 2016

willnorris commented May 2, 2016

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmitshur May 2, 2016 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dmitshur May 3, 2016 • edited Loading

Choose a reason for hiding this comment

willnorris commented May 4, 2016

dmitshur commented May 4, 2016 • edited Loading

dmitshur commented Apr 22, 2016 •

edited

Loading

dmitshur Apr 22, 2016 •

edited

Loading

dmitshur May 2, 2016 •

edited

Loading

dmitshur May 3, 2016 •

edited

Loading

dmitshur commented May 4, 2016 •

edited

Loading