Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix retry-after value in distributed case #34

Merged
merged 7 commits into from
Dec 19, 2023

Conversation

tgeoghegan
Copy link
Contributor

@tgeoghegan tgeoghegan commented Dec 11, 2023

The basic approach in #30 -- to sync around the timestamp of the oldest
event -- is sound, but the implementation was flawed, because the spot
before the cursor is not always the oldest event in the ring buffer. We
now correctly compute that value while counting events in the window
(which we had to do in order to sync event counts anyway). Additionally,
this commit adds tests for the distributed rate limiter which simulate a
peer by constructing a ringBufferRateLimiter, writing it out to
storage, and then starting up a caddytest.Tester on that storage.

@tgeoghegan
Copy link
Contributor Author

tgeoghegan commented Dec 11, 2023

Test fails on Ubuntu because it can't find the instance UUID file:

 caddytest.go:98: failed to load config: {"error":"loading config: loading new config: loading http app module: provision http: server one: setting up route handlers: route 0: loading handler modules: position 0: loading module 'rate_limit': provision http.handlers.rate_limit: open /home/runner/.local/share/caddy/instance.uuid: no such file or directory"}

The instance ID is needed to write/read distributed rate limit state, so I will need to find a way to provide an instance ID to a caddytest.Tester. It seems the real problem is that while the file will get created on demand by Caddy, it expects that /home/runner/.local/share/caddy/ (or rather, $XDG_DATA_HOME/caddy) will already exist. I'm guessing there's some path through a real Caddy executable that will create this, but caddytest.Tester won't? So a simple fix here would be to have the test just do os.MkdirAll(caddy.AppDataDir()), but that feels kind of gross.

The failure on Windows is a little stranger:

     caddytest.go:98: failed to load config: {"error":"loading config: decoding request body: invalid character 'U' in string escape code"}
    caddytest.go:102: failed ensuring config is running: invalid character 'U' in string escape code

I am thinking this is due to the storage configuration stanza I use, where we fmt.Sprintf a path into the config file. I'm guessing that t.TempDir returns something like C:\Users\runner\blah\blah\tmp\blah, and something in Caddy thinks that first \U is an escape? But I don't really know what the "right" way to handle Windows paths in Go and/or Caddy configurations is.

@tgeoghegan tgeoghegan marked this pull request as draft December 12, 2023 01:24
@tgeoghegan
Copy link
Contributor Author

Making this a draft while I work through the problems with this change.

The basic approach in mholt#30 -- to sync around the timestamp of the oldest
event -- is sound, but the implementation was flawed, because the spot
before the cursor is not always the oldest event in the ring buffer. We
now correctly compute that value while counting events in the window
(which we had to do in order to sync event counts anyway). Additionally,
this commit adds tests for the distributed rate limiter which simulate a
peer by constructing a `ringBufferRateLimiter`, writing it out to
storage, and then starting up a `caddytest.Tester` on that storage.
@tgeoghegan tgeoghegan force-pushed the distributed-limiter-tests branch from 92dbab3 to 42ad060 Compare December 12, 2023 20:03
tgeoghegan added a commit to divviup/janus that referenced this pull request Dec 12, 2023
- use a semaphore
- cache loaded test config

additionally, since the Caddy rate limiter will soon provide a useful
retry-after (mholt/caddy-ratelimit#34), we now
check that in this test.
@tgeoghegan
Copy link
Contributor Author

I fixed the test failures by escaping the path on Windows and creating caddy.AppDataDir() on Ubuntu. I'm not sure that the latter is a great solution, though, and am open to suggestions on how to do better. I also fixed another bug around determining the oldest event in a rate limiter and accordingly added ringbuffer_test.go. I think this is ready for review.

Copy link
Owner

@mholt mholt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking quite good. Thank you! Is this ready to be merged? How's it working for you?

@tgeoghegan
Copy link
Contributor Author

We've had this change in our integration tests for a week or so now, and deployed to a staging environment since late last week, and it appears to be working as intended. I think it should be OK to merge, especially since the most serious consequence of this being broken is just that clients will get an inaccurate retry-after. But they were unconditionally getting 0 before, so even that isn't so awful.

@mholt mholt merged commit 8aeaea3 into mholt:master Dec 19, 2023
3 checks passed
@mholt
Copy link
Owner

mholt commented Dec 19, 2023

Awesome, that's great to hear! Thanks for working on this. It was something that I wasn't sure the best approach but I think this is a good one. For how simple and elegant it is, it gets us a long way.

@mholt mholt mentioned this pull request Dec 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants