Add lock around save in pusher to avoid race condition #1665
Conversation
I think I'm okay with this. The race condition is decreased from "the entire time it takes to upload the .gem file to s3" down to the time between …

It's possible for … Putting the lock in …

@dwradcliffe are you okay with this? @jules2689 can you …

Final question: how should we handle checksum mismatches if this (much smaller) race condition gets triggered, and we end up with an s3/db mismatch?
@indirect I'm not sure it's worth spending a ton of time preventing mismatches. I think it would be worth more to us to simply fix them automatically. If possible, we could detect when a checksum mismatch happens and fix it in a background job. We could also run an hourly task that takes all gems uploaded in the last hour and checks their checksums?
True. At that point, we could maybe even just have every save enqueue a background job for execution in 5 minutes that calls …
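The automatic-fix idea discussed above could be sketched roughly as follows. This is a minimal, self-contained illustration in plain Ruby, not the actual rubygems.org code; the helper name and inputs are hypothetical, and a real background job would fetch the blob from S3 and the recorded checksum from the database:

```ruby
require "digest"

# Hypothetical helper: returns true when the stored gem blob no longer
# matches the SHA256 checksum recorded at push time.
def checksum_mismatch?(stored_blob, recorded_checksum)
  Digest::SHA256.hexdigest(stored_blob) != recorded_checksum
end

# A verification job could run this check a few minutes after each save
# (or hourly over recent uploads) and repair any mismatch it finds.
recorded = Digest::SHA256.hexdigest("gem contents")
checksum_mismatch?("gem contents", recorded) # => false
checksum_mismatch?("tampered", recorded)     # => true
```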
LGTM
I am still thinking about why we can't use DB locking for this.

What do you mean by that?
@arthurnn the problem we are solving here is that jobs (particularly from parallelized CI systems like Travis matrix builds) attempt to upload the same version twice, simultaneously. This means that we will not find and use a previous row, but will create 2 new rows, as per https://github.com/jules2689/rubygems.org/blob/ddc02e8951acb084e15fbadfa36fcaa379efbdb6/app/models/pusher.rb#L86-L109. So a row lock would not be effective: we would just make 2 rows that are totally separate from each other. I think, anyway; I'm not the best versed in DB locks, so I could easily be mistaken :)
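The race described above can be made concrete: if both requests run their existence check before either inserts, each sees no row, so both create one. A deterministic plain-Ruby illustration (the `versions` array is a stand-in for the versions table; the lambdas are hypothetical, not the actual Pusher code):

```ruby
# Simulated table: each element stands in for a (rubygem, number) row.
versions = []

# Find-or-create with the check and the insert separated, mimicking how
# two concurrent requests interleave.
find   = ->(number) { versions.find { |v| v == number } }
insert = ->(number) { versions << number }

# Both requests check first -- neither sees an existing row yet.
a_found = find.call("1.0.0") # => nil
b_found = find.call("1.0.0") # => nil

# ...then both insert, leaving two independent rows. A row lock cannot
# help here, because no row existed when the checks ran.
insert.call("1.0.0") if a_found.nil?
insert.call("1.0.0") if b_found.nil?
versions.size # => 2
```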
Indeed, we don't want to upload two independent versions for the same spec. I think the code will have to change a bit, to wrap most of the …

Sorry for the back and forth; the memcache solution is a good one and it works. I just think we can solve this with DB locking, which is the level we want to synchronize data at.
I originally tried that path, so if we can get this going then I'm 👍
Technically it does already, through background jobs, I think. Happy to refactor this though.
☔ The latest upstream changes (presumably 101ce1b) made this pull request unmergeable. Please resolve the merge conflicts.
I've been away for the past few weeks at RubyKaigi. I intend to look at this again this week or next.
Superseded by #1681
The problem
When gems are pushed at approximately the same time, two simultaneous requests will happen. These requests can end up both pushing an identical version to RubyGems. When Bundler then tries to validate the version, things go 💥 and the SHA256 checksum check fails.
This often happens in Travis matrix builds.
How this fixes it
A lock around the functionality was discussed in the linked issues. I chose to take this path.
I did not use an Active Record lock, as we do not want to prevent other changes to the gem (like download stats, edits, etc.) while something is uploading. Instead, we lock the `save` functionality directly. This means that a call to `save` must first be able to acquire the lock, and will return 409 CONFLICT otherwise. Should the system crash during a save, the lock will expire automatically after 10 minutes as a backup.
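The lock-around-save pattern above can be sketched with any cache that supports an atomic set-if-absent write with a TTL. Below is a minimal, single-threaded in-memory stand-in, not the Rails cache itself (`Rails.cache.write` with `unless_exist: true` and `expires_in:` behaves analogously, though the exact atomicity guarantees depend on the configured cache store); the class and key names are illustrative only:

```ruby
# Minimal in-memory stand-in for a cache offering set-if-absent + TTL.
class FakeCache
  Entry = Struct.new(:value, :expires_at)

  def initialize
    @store = {}
  end

  # With unless_exist: true, only the first caller (or a caller after
  # expiry) succeeds -- that caller "wins" the lock.
  def write(key, value, unless_exist: false, expires_in: nil)
    entry = @store[key]
    return false if unless_exist && entry && Time.now < entry.expires_at
    @store[key] = Entry.new(value, Time.now + (expires_in || 3600))
    true
  end

  def delete(key)
    @store.delete(key)
  end
end

cache = FakeCache.new
lock_key = "pusher:lock:somegem-1.0.0" # illustrative key format

# First push acquires the lock; a concurrent push gets false and
# would respond with 409 CONFLICT.
cache.write(lock_key, true, unless_exist: true, expires_in: 600) # => true
cache.write(lock_key, true, unless_exist: true, expires_in: 600) # => false

# After a successful save the lock is released explicitly; the
# 10-minute TTL is only a backstop for a crash mid-save.
cache.delete(lock_key)
cache.write(lock_key, true, unless_exist: true, expires_in: 600) # => true
```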
Questions
`Rails.cache` locking is supposedly not the best; I have little experience with it. Maybe it gives us enough benefit without additional overhead?

Issues this solves
Fixes #1564
Fixes #1551
cc @indirect @dwradcliffe