versions checksums do not match /info for existing gems #1566

Closed
wjordan opened this issue Feb 11, 2017 · 12 comments

@wjordan

wjordan commented Feb 11, 2017

According to my understanding (forgive me if this is incorrect), rubygems.org contains an API controller that serves gem info in the compact-index format, which is used by Bundler's compact index client. One of the format's requirements is that a gem entry in the versions endpoint contains an md5 hash that matches the digest generated by the contents of the /info/[gem_name] endpoint.

It appears that the set of 'info' checksums contained in the existing versions list hosted by rubygems.org is incorrect. I don't know if this is because the 'info' format changed slightly, or due to some other environment/platform issue when originally generating the checksums.

For example, the compact_index gem (most recently published Aug 27 2016) checksum in /versions does not match the contents of /info/compact_index:

$ curl -sSL https://rubygems.org/versions | grep '^compact_index ' | cut -d' ' -f3
b783ed44aa4dd8448f921c3468bedef8
$ curl -sSL https://rubygems.org/info/compact_index | md5sum | cut -d' ' -f1
ee28c243860cfe0a878432fb6bb01905

However, the rails gem (most recently published Feb 10 2017) checksum in /versions (the most recent entry) does match the contents of /info/rails:

$ curl -sSL https://rubygems.org/versions | grep '^rails ' | cut -d' ' -f3
ffd97f57dceec4b7aae69e3533045ea4
0608af4a864044a81bf1429d5b79da54
22d0b7e7079d50e5de30b3bd429f449f
964abbafe8cd512aeb08494f5ff82fcb
4fee8f57698e156c31bafe8f947df205
$ curl -sSL https://rubygems.org/info/rails | md5sum | cut -d' ' -f1
4fee8f57698e156c31bafe8f947df205
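The same comparison can be sketched in Ruby. This is only an illustration, assuming each entry line in the versions file has the shape "NAME VERSIONS MD5" (which is what the `cut -d' ' -f3` above relies on) and that the last entry for a gem is the current one. The method names are made up for this example and are not part of rubygems.org or Bundler.

```ruby
require "digest"

# Returns the last (most recent) info checksum listed for gem_name in the
# text of a versions file, or nil if the gem has no entry.
# Assumption: entry lines look like "NAME VERSIONS MD5".
def latest_info_checksum(versions_body, gem_name)
  checksum = nil
  versions_body.each_line do |line|
    name, _versions, md5 = line.chomp.split(" ", 3)
    # Later entries for the same gem override earlier ones.
    checksum = md5 if name == gem_name && md5
  end
  checksum
end

# True when the MD5 of the /info body equals the latest versions checksum.
def info_checksum_matches?(versions_body, gem_name, info_body)
  latest_info_checksum(versions_body, gem_name) == Digest::MD5.hexdigest(info_body)
end
```

The two bodies would come from the /versions and /info/[gem_name] endpoints; fetching them is left out so the sketch stays self-contained.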

This is an issue because the checksum mismatches are causing bundler to send redundant HTTP GET requests to /info/[gem_name] endpoints for the affected files, resulting in longer bundle install times and extra load on the rubygems.org server:

$ DEBUG_COMPACT_INDEX=1 bundle --verbose
Running `bundle install --verbose` with bundler 1.14.3
[...]
Looking up gems ["rails"]
[Bundler::CompactIndexClient] dependencies(["rails"])
[Bundler::CompactIndexClient] update_info(rails)
[Bundler::CompactIndexClient] skipping updating info for rails since the versions checksum matches the local checksum

Looking up gems ["compact_index"]
[Bundler::CompactIndexClient] dependencies(["compact_index"])
[Bundler::CompactIndexClient] update_info(compact_index)
[Bundler::CompactIndexClient] updating info for compact_index since the versions checksum b783ed44aa4dd8448f921c3468bedef8 != the local checksum ee28c243860cfe0a878432fb6bb01905
[Bundler::CompactIndexClient] update(/home/will/.bundle/cache/compact_index/rubygems.org.443.29b0360b937aa4d161703e6160654e47/info/compact_index, info/compact_index)
HTTP GET https://index.rubygems.org/info/compact_index
HTTP 304 Not Modified https://index.rubygems.org/info/compact_index
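My reading of the log above is roughly the following decision. This is a hypothetical stand-in and not Bundler's actual code: `update_info` here is a made-up function, and `fetch` is an assumed callable that performs the conditional GET and returns nil on a 304.

```ruby
require "digest"

# Hypothetical sketch of the client-side decision, not Bundler's real code.
# local_body:        cached /info contents, or nil if nothing is cached
# versions_checksum: MD5 listed for the gem in /versions
# fetch:             callable doing the conditional GET; returns the new
#                    body, or nil when the server answers 304 Not Modified
def update_info(local_body, versions_checksum, fetch)
  local_checksum = local_body && Digest::MD5.hexdigest(local_body)
  # Checksums agree: no request is needed at all.
  return [:skipped, local_body] if local_checksum == versions_checksum

  fresh = fetch.call
  fresh.nil? ? [:not_modified, local_body] : [:updated, fresh]
end
```

With a wrong checksum in /versions, the comparison never succeeds, so every run ends up on the `:not_modified` path: one request per affected gem that changes nothing.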

Since the direct issue is invalid info-checksum content on rubygems.org, I've raised it in this repo. However, if the original source/format of these invalid info-checksums is known, a suitable workaround could be added to the Bundler source code to allow backwards compatibility with these 'legacy' checksums without having to modify the existing versions file.

@segiddins
Member

This is a known bug with the rails app we've yet to figure out the root cause of, sadly 😔

@wjordan
Author

wjordan commented Feb 15, 2017

Is it possible that the same root cause is responsible for #1551?

If the root cause for the mismatched hashes stored in the current versions file can't be determined (which would allow the existing checksums to be supported as-is through a backwards-compatibility workaround), the only reasonable solution would be to declare 'checksum bankruptcy' by expiring the versions list as @indirect suggested in a comment.

I think it's worth doing, considering the vast majority of gems are currently affected by this issue.

To give a rough estimate of how many gems are affected, I examined a small sample of all gems on rubygems.org (all gems beginning with ab), and compared the hash stored in the /versions against the MD5 sum of the content returned by /info/[gem_name]. Only 28 gems had matching hashes, while 151 had invalid hashes.
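A tally like this could be produced along these lines. It is a sketch under assumptions: `tally_checksums` is a name invented here, `fetch_info` is a hypothetical callable that returns the /info body for a gem name, and entry lines are assumed to look like "NAME VERSIONS MD5".

```ruby
require "digest"

# Counts matching and mismatched info checksums for gems whose names
# start with prefix.
# versions_body: text of the versions file
# fetch_info:    hypothetical callable, gem name -> /info body
def tally_checksums(versions_body, prefix, fetch_info)
  latest = {}
  versions_body.each_line do |line|
    name, _versions, md5 = line.chomp.split(" ", 3)
    # Keep only the most recent checksum per gem.
    latest[name] = md5 if md5 && name.start_with?(prefix)
  end

  latest.each_with_object(Hash.new(0)) do |(name, md5), counts|
    actual = Digest::MD5.hexdigest(fetch_info.call(name))
    counts[actual == md5 ? :matching : :mismatched] += 1
  end
end
```

For the sample above, 151 mismatched out of 179 checked is about 84%.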

@segiddins
Member

@wjordan we've expired & re-generated the versions file before in the hope that it would fix the issue and it hasn't

@wjordan
Author

wjordan commented Feb 15, 2017

@segiddins Have you ever regenerated the info_checksum columns on all the gems stored in the database? Running rake update_versions_file won't update them, and rake backfill_info_checksum won't update existing values since #1348 (Jun 27 2016).

Note that expiring and regenerating the versions file alone won't automatically repair these bad info checksums; it will only create a freshly-materialized view containing the same invalid values.
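To illustrate the point with a toy model: everything below is hypothetical (the `GemRecord` struct, the method names, and the `force:` flag are inventions for this sketch and do not reflect the actual rake tasks). It only assumes that the versions file is built from a stored checksum column and that the backfill skips rows that already have a value.

```ruby
require "digest"

# Hypothetical record: gem name, current /info body, stored info_checksum.
GemRecord = Struct.new(:name, :info_body, :info_checksum)

# Rebuilds versions entries purely from the stored column, so any stale
# value is copied through unchanged.
def regenerate_versions(records)
  records.map { |r| "#{r.name} #{r.info_checksum}" }
end

# Backfill that only fills blank columns unless force is set.
def backfill_info_checksum(records, force: false)
  records.each do |r|
    next if r.info_checksum && !force
    r.info_checksum = Digest::MD5.hexdigest(r.info_body)
  end
end
```

In this model, regenerating after a non-forced backfill still emits the stale value; only recomputing existing columns changes what ends up in the versions file.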

@sonalkr132
Member

This happened after we updated compact_index. compact_index 0.11.0 does not add ruby:>= 0,rubygems:>= 0 if the ruby or rubygems version requirement is >= 0, and thus changed the info response from:

"---\n ...
0.9.0 |checksum:fb240b8eccd251c4aa1d7e247c086a2bea7110fc72dd88f28ae2674820e4b56d,ruby:>= 0,rubygems:>= 0\n0.9.1 |checksum:b3994bda999498b57b3ecf55e80405e165b6ae486d58cd77d4ecddf6fec10104,ruby:>= 0,rubygems:>= 0\n0.9.3 |checksum:e8fabb0f75ca92e1212d74ac95956e4829957622794ff15790fbb35e4108aa15,ruby:>= 0,rubygems:>= 0\n0.9.4 |checksum:78ccd6f22c33b51a25b55be79533a911272b466b935844171355e9755c41944d,ruby:>= 0,rubygems:>= 0\n0.10.0 |checksum:c0d6b9b7c9aaae3b9cfcc24cdc3a4e38a792634ad96f9b3048ee1d243a342778,ruby:>= 0,rubygems:>= 0\n0.11.0 |checksum:5a34f018a3c71e0cc5f8437af8d3c289f5b4855dcc5d6f5f0a7b535097e2af90,ruby:>= 0,rubygems:>= 0\n"

to:

"---\n ...
0.9.0 |checksum:fb240b8eccd251c4aa1d7e247c086a2bea7110fc72dd88f28ae2674820e4b56d\n0.9.1 |checksum:b3994bda999498b57b3ecf55e80405e165b6ae486d58cd77d4ecddf6fec10104\n0.9.3 |checksum:e8fabb0f75ca92e1212d74ac95956e4829957622794ff15790fbb35e4108aa15\n0.9.4 |checksum:78ccd6f22c33b51a25b55be79533a911272b466b935844171355e9755c41944d\n0.10.0 |checksum:c0d6b9b7c9aaae3b9cfcc24cdc3a4e38a792634ad96f9b3048ee1d243a342778\n0.11.0 |checksum:5a34f018a3c71e0cc5f8437af8d3c289f5b4855dcc5d6f5f0a7b535097e2af90\n"

We should have recalculated info_checksum when we updated.
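A quick way to see the effect: hash one info line in both renderings. This is a reduced, one-line body built from the 0.9.0 entry above, not the full response (the elided part is left out), so the digests it prints are not the real ones from the versions file.

```ruby
require "digest"

# Reduced sample: a single version line in the old and the new rendering.
checksum = "fb240b8eccd251c4aa1d7e247c086a2bea7110fc72dd88f28ae2674820e4b56d"
old_body = "---\n0.9.0 |checksum:#{checksum},ruby:>= 0,rubygems:>= 0\n"
new_body = "---\n0.9.0 |checksum:#{checksum}\n"

old_md5 = Digest::MD5.hexdigest(old_body)
new_md5 = Digest::MD5.hexdigest(new_body)

puts "stored (old rendering): #{old_md5}"
puts "served (new rendering): #{new_md5}"
puts "match: #{old_md5 == new_md5}"
```

Any info_checksum computed before the upgrade corresponds to the first body, while the endpoint now serves the second.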

@sonalkr132 sonalkr132 added the bug label Feb 22, 2017
@sonalkr132
Member

@wjordan we fixed the incorrect info_checksum values and generated a new versions file. Can you please confirm that the issue is fixed?

@indirect
Member

To be fair, I'm not 100% sure I deployed the new versions file correctly. Hopefully 😬

@segiddins
Member

segiddins commented Mar 28, 2017

I'm still getting a 304 on capistrano (versions file has a86b34d1424874f8e46fbdd1e442cf72, actual is e617af9a61d834329b6790842b157e43)

@indirect
Member

blahhh, seems like running the rake task didn't correct the checksums?

@sonalkr132
Member

As of now, these 258 gems have an incorrect info_checksum. That's better than the 125,350 we had before fixing info_checksum.
Capistrano has released versions after 2016-08-30 05:16:23, so it wasn't covered by the versions we fixed.
I will update when I find out more. Maybe #1551 is related.

@indirect
Member

@sonalkr132 awesome, thank you for checking on this 👍

@sonalkr132
Member

Sorry about the delay in getting this resolved. We found two issues, namely:

  • The requirement order was changing between version pushes for gems that list the same gem as a requirement multiple times (e.g. net-ssh:>= 0 net-ssh:>= 2.0.14 in capistrano 2.5.10). (Fixed in "Order gem dependencies in info response by id", #2374.)
  • Some versions had an unresolvable dependency when they were pushed, but the previously unresolvable dependency was later pushed to rubygems.org as well (discussed here).
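The first issue can be illustrated with a simplified sketch. It assumes a `Dep` struct whose `id` stands in for the database primary key, and the line format is a simplification of the real info response; none of this is the code from #2374.

```ruby
require "digest"

# Hypothetical dependency row; id stands in for the database primary key.
Dep = Struct.new(:id, :name, :requirement)

# Renders one simplified info line with dependencies in a fixed order
# (sorted by id), so the same set of rows always yields the same text.
def info_line(version, deps, checksum)
  rendered = deps.sort_by(&:id).map { |d| "#{d.name}:#{d.requirement}" }.join(",")
  "#{version} #{rendered}|checksum:#{checksum}\n"
end
```

Because the output no longer depends on the order in which the rows are loaded, the MD5 of the response stays the same across pushes, even when the same gem is listed twice as with net-ssh above.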

We fixed the checksums of all public versions in an ad hoc rake task. As of now, none of the gems listed in the versions.list endpoint has a mismatch. There are 8 mismatches in the DB, for gems that have no public versions.

Thank you for reporting this.
