Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Persist records before uploading to s3 #1723

Merged
merged 2 commits into from
Jun 25, 2018

Conversation

sonalkr132
Copy link
Member

When same gem version is pushed in parallel, upload to s3 is run for both the process. Process which commits transaction in Rubygems#update_attributes_from_gem_specification! first gets to set version.sha256, while process which uploads to s3 last gets to write gem file. These two processes need not be same and hence lead of sha mismatch.
If we persist version record before uploading to s3, other processes will get record not unique violation and not overwrite gem file in s3.

Failing test on master.
Fixes #1564
Fixes #1551

require 'test_helper'

class ParallelPusherTest < ActiveSupport::TestCase
self.use_transactional_tests = false
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be here because general test flow doesn't clear records created in Thread.new which use different db connection.

@cutter1 = Pusher.new(@user, @gem1)
@cutter2 = Pusher.new(@user, @gem2)

t1 = Thread.new do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this threading introducing random order of execution of gem processing making this test "unstable"? Isn't there any chance to use ActiveSupport Latch to control the order?

Copy link
Member Author

@sonalkr132 sonalkr132 May 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it would definitely make it better. Thanks @simi
ActiveSupport::Concurrency::Latch was deprecated in 5.1 and was replaced by Concurrent::CountDownLatch rails/rails#20866. I have updated test to use the same.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any difference with current code and the old one (using thread join). I think you should be able to use Latch to stop thread 1 somewhere, start thread 2 and continue in thread 1 to simulate the exact order of critical operations resulting in race condition at some points (which is the reason of this PR). Maybe I'm wrong. I try to submit my code to compare if welcomed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I try to submit my code to compare if welcomed.

Yes, please 🙏 Even an example code of what you are trying to say would be fine as well.
I didn't even know about ActiveSupport::Concurrency::Latch until you mentioned it.

@sonalkr132 sonalkr132 force-pushed the race-create branch 3 times, most recently from c3ebae9 to e5993fb Compare May 10, 2018 18:48
@sonalkr132
Copy link
Member Author

An alternate and perhaps more elegant approach to solving this would have been distributed lock. We would have to use redis or zookeeper and I don't think we are adding those dependencies. People seem to have use memcache (which we already use) for it as well but memache isn't really meant for it.

@indirect
Copy link
Member

I think it would be fine to pair this with a cron job that checks recent gem versions to see if the file is actually on s3. then if the file is not on s3, delete the version row, so the user can try again?

@sonalkr132 sonalkr132 force-pushed the race-create branch 3 times, most recently from 7c0bbcc to 0af0f29 Compare May 12, 2018 15:47
@sonalkr132
Copy link
Member Author

sonalkr132 commented May 12, 2018

a cron job that checks recent gem versions to see if the file is actually on s3

I don't think we need it. Pusher#save destroys the version if push to s3 failed. version.destroy in rescue was added (87312bc) in good faith but before this PR it was a dud. I have added a test to ensure verison is deleted if upload to s3 fails.
If there is some other way, we can end with empty s3 object then let me know.

@bundlerbot
Copy link
Collaborator

☔ The latest upstream changes (presumably ffffdeb) made this pull request unmergeable. Please resolve the merge conflicts.

@sonalkr132 sonalkr132 force-pushed the race-create branch 3 times, most recently from 2ead2fa to 72a6ce0 Compare May 26, 2018 18:35
@indirect
Copy link
Member

I'm very happy that this includes a test that exercises the "two simultaneous pushes" code path. Thanks for working on this!

@bundlerbot r+

@bundlerbot
Copy link
Collaborator

📌 Commit 72a6ce0 has been approved by indirect

@bundlerbot
Copy link
Collaborator

⌛ Testing commit 72a6ce0 with merge 72e6bd7...

bundlerbot added a commit that referenced this pull request Jun 25, 2018
Persist records before uploading to s3

When same gem version is pushed in parallel, upload to s3 is run for both the process. Process which commits transaction in Rubygems#update_attributes_from_gem_specification! first gets to set version.sha256, while process which uploads to s3 last gets to write gem file. These two processes need not be same and hence lead of sha mismatch.
If we persist version record before uploading to s3, other processes will get record not unique violation and not overwrite gem file in s3.

[Failing test on master](https://travis-ci.org/sonalkr132/rubygems.org/jobs/376936366#L1309).
Fixes #1564
Fixes #1551
@bundlerbot
Copy link
Collaborator

💔 Test failed - status-travis

When same gem version is pushed in parallel, upload to s3 is run for
both the process. Process which commits trancation in
Rubygems#update_attributes_from_gem_specification! first gets to set
version.sha256, while process which uploads to s3 last gets to write
gem file. These two processes need not be same and hence lead of sha
mismatch.
If we persist version record before uploading to s3, other processes
will get record not unique violation and not overwrite gem file in s3.
@sonalkr132 sonalkr132 force-pushed the race-create branch 2 times, most recently from e4a958c to 651b6f4 Compare June 25, 2018 06:06
transactional_test is set to false because general test flow doesn't
clear records created in Thread.new which use different db connection.
@sonalkr132
Copy link
Member Author

Test failed - status-travis

fixed it. It was failing because I was setting @fs = nil in teardown, when it should have been RubygemFs.mock! as used in test_helper.rb

@sonalkr132 sonalkr132 merged commit e22d721 into rubygems:master Jun 25, 2018
@sonalkr132 sonalkr132 deleted the race-create branch June 25, 2018 06:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants