Allow ScoreUploader
to process replays concurrently
#235
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
As things stand,
ScoreUploader
is inherently single threaded. Today, @peppy noticed that replay uploads have started piling up:and @ThePooN pointed out that this wasn't exactly new, and started happening a few days before already as bursts of pending uploads that then go back down eventually, which also has the effect of increasing the memory footprint of the spectator server instance:
With no apparent direct cause (as in, no apparent deploy or change that broke it), the working theory is that something may be throttling the upload process (apparently it was doing ~1 score per second which seems way too slow) and maybe throwing more threads at it may help. So this diff attempts to do that, using channels (think
BlockingCollection
but async).I admittedly don't have much experience working with
Channel<T>
so I'm not super sure how well this is gonna behave. One notable thing to read in the docs of that class is this:I'm not sure I understand the full implications of whatever that means; I know what backpressure means, but depending on how it's done, it might be decent, or it might be not. This probably needs more testing in general but I'm not super sure how to test this sort of thing because this is the sort of thing you only can observe working properly under load, but I wrote it and it might be useful so I'm PRing it to see if there's at least interest to pursue this further and maybe do some stress testing on.
The degree of concurrency is controlled via an envvar (
REPLAY_UPLOAD_THREADS
). Yes, this means that changing the degree means restarting the instance. That was considered acceptable in discussion.(As an aside I considered merging
REPLAY_UPLOAD_THREADS
withSAVE_REPLAYS
to haveREPLAY_UPLOAD_THREADS=0
mean "replays off" but it's annoying because tests depend onSAVE_REPLAYS
being toggleable in runtime, which just works if it's done as it was inEnqueue()
but is quite annoying to do if you need to kill/spin up threads on demand on the reader end just for tests, and I wasn't super sure that was a reasonable thing to do to begin with, so I didn't in the end. Can reconsider if that would be considered nicer.)