-
-
Notifications
You must be signed in to change notification settings - Fork 917
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Persist records before uploading to s3 #1723
Conversation
test/unit/parallel_pusher_test.rb
Outdated
require 'test_helper' | ||
|
||
class ParallelPusherTest < ActiveSupport::TestCase | ||
self.use_transactional_tests = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to be here because general test flow doesn't clear records created in Thread.new
which use different db connection.
test/unit/parallel_pusher_test.rb
Outdated
@cutter1 = Pusher.new(@user, @gem1) | ||
@cutter2 = Pusher.new(@user, @gem2) | ||
|
||
t1 = Thread.new do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this threading introducing random order of execution of gem processing making this test "unstable"? Isn't there any chance to use ActiveSupport Latch
to control the order?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it would definitely make it better. Thanks @simi
ActiveSupport::Concurrency::Latch
was deprecated in 5.1 and was replaced by Concurrent::CountDownLatch
rails/rails#20866. I have updated test to use the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see any difference with current code and the old one (using thread join). I think you should be able to use Latch
to stop thread 1 somewhere, start thread 2 and continue in thread 1 to simulate the exact order of critical operations resulting in race condition at some points (which is the reason of this PR). Maybe I'm wrong. I try to submit my code to compare if welcomed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I try to submit my code to compare if welcomed.
Yes, please 🙏 Even an example code of what you are trying to say would be fine as well.
I didn't even know about ActiveSupport::Concurrency::Latch
until you mentioned it.
c3ebae9
to
e5993fb
Compare
An alternate and perhaps more elegant approach to solving this would have been distributed lock. We would have to use redis or zookeeper and I don't think we are adding those dependencies. People seem to have use memcache (which we already use) for it as well but memache isn't really meant for it. |
I think it would be fine to pair this with a cron job that checks recent gem versions to see if the file is actually on s3. then if the file is not on s3, delete the version row, so the user can try again? |
7c0bbcc
to
0af0f29
Compare
I don't think we need it. |
☔ The latest upstream changes (presumably ffffdeb) made this pull request unmergeable. Please resolve the merge conflicts. |
2ead2fa
to
72a6ce0
Compare
I'm very happy that this includes a test that exercises the "two simultaneous pushes" code path. Thanks for working on this! @bundlerbot r+ |
📌 Commit 72a6ce0 has been approved by |
Persist records before uploading to s3 When same gem version is pushed in parallel, upload to s3 is run for both the process. Process which commits transaction in Rubygems#update_attributes_from_gem_specification! first gets to set version.sha256, while process which uploads to s3 last gets to write gem file. These two processes need not be same and hence lead of sha mismatch. If we persist version record before uploading to s3, other processes will get record not unique violation and not overwrite gem file in s3. [Failing test on master](https://travis-ci.org/sonalkr132/rubygems.org/jobs/376936366#L1309). Fixes #1564 Fixes #1551
💔 Test failed - status-travis |
When same gem version is pushed in parallel, upload to s3 is run for both the process. Process which commits trancation in Rubygems#update_attributes_from_gem_specification! first gets to set version.sha256, while process which uploads to s3 last gets to write gem file. These two processes need not be same and hence lead of sha mismatch. If we persist version record before uploading to s3, other processes will get record not unique violation and not overwrite gem file in s3.
e4a958c
to
651b6f4
Compare
transactional_test is set to false because general test flow doesn't clear records created in Thread.new which use different db connection.
fixed it. It was failing because I was setting |
When same gem version is pushed in parallel, upload to s3 is run for both the process. Process which commits transaction in Rubygems#update_attributes_from_gem_specification! first gets to set version.sha256, while process which uploads to s3 last gets to write gem file. These two processes need not be same and hence lead of sha mismatch.
If we persist version record before uploading to s3, other processes will get record not unique violation and not overwrite gem file in s3.
Failing test on master.
Fixes #1564
Fixes #1551