Fallback to archive.org URLs for failed downloads of FOSS packages #2284

Merged (1 commit) on Feb 16, 2018

Conversation

@HebaruSan (Member) commented Feb 14, 2018

Background

SpaceDock broke today, which is fun. VITAS reports that it looks like a denial of service attack.

The NetKAN bot has been uploading all permissively-licensed mods to https://archive.org/details/kspckanmods for quite a while now. Many or most of the SpaceDock downloads that are now failing could in principle fall back to archive.org URLs. Such a feature could also help us deal with GitHub's download throttling or SpaceDock's certificate expirations.

Changes

Now if the primary download fails for a permissively licensed mod, we try to find it on archive.org as a fallback.

Internally this involves defining a fallbackUrl property in DownloadTarget and NetAsyncDownloaderDownloadPart, both populated from a new CkanModule.InternetArchiveDownload property, which itself checks a new License.Redistributable property based on a new copy of the list from NetKAN-bot. Then if a download fails, we check whether it has a fallback URL, and if so and if we haven't already tried it, we try it. If the fallback fails as well, then we continue with the normal steps for a failed download.
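The retry flow described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the C# code from this pull request: `DownloadTarget` here is a stand-in dataclass, and `fetch`, `fallback_url`, `tried_fallback`, and `download_with_fallback` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DownloadTarget:
    """Stand-in for a download entry (hypothetical, not CKAN's class)."""
    url: str
    fallback_url: Optional[str] = None  # e.g. an archive.org URL, if redistributable
    tried_fallback: bool = False


def download_with_fallback(target: DownloadTarget,
                           fetch: Callable[[str], bytes]) -> bytes:
    """Try the primary URL, then the fallback URL at most once."""
    try:
        return fetch(target.url)
    except Exception as primary_error:
        # No fallback available, or it was already tried: fail as usual.
        if target.fallback_url is None or target.tried_fallback:
            raise
        target.tried_fallback = True
        try:
            return fetch(target.fallback_url)
        except Exception:
            # The fallback failed too: report the original failure so the
            # normal failed-download handling takes over.
            raise primary_error
```

The caller supplies `fetch`, so the same logic works with any HTTP client; a target without a fallback URL behaves exactly as it did before.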

Fixes #1682.

@HebaruSan added the Pull request, Infrastructure (issues affecting everything around CKAN: the GitHub repos, build process, CI, ...), and Network (issues affecting internet connections of CKAN) labels on Feb 14, 2018
@techman83 (Member) left a comment

This is awesome! I've wanted this feature for ages, but never quite had the time to get back into C#!

/// Return an archive.org URL for this download, or null if it's not there.
/// The filenames look a lot like the filenames in Net.Cache, but don't be fooled!
/// Here it's the first 8 characters of the SHA1 of the DOWNLOADED FILE, not the URL!
/// </summary>
Member

Yeah, the hash of that file will remain consistent. The url may not.

Produces a filename based on the first 8 digits of the sha1 hash,
the 'identifier' and the 'version' in the metadata if the
download_hash exists. Returns '0' if there is no download hash
or if the content type is other than zip/gz/tar/tar.gz.

There are some tests that ensure the correct filenames are generated.
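As a rough illustration of that description, a filename could be assembled like this. The hyphen separators, the `.zip` extension, and the `archive_filename` name are assumptions made for the example; the description above only says that the name is built from the first 8 hash digits, the identifier, and the version. This sketch also returns `None` rather than `'0'` when there is no hash, and it ignores the content-type check.

```python
from typing import Optional


def archive_filename(identifier: str, version: str,
                     download_sha1: Optional[str]) -> Optional[str]:
    """Build a filename from the hash prefix, identifier, and version.

    The exact layout (separators, extension) is assumed for illustration.
    """
    if not download_sha1:
        return None  # no download hash, so no archive filename
    prefix = download_sha1[:8]  # first 8 hex digits of the file's SHA1
    return f"{prefix}-{identifier}-{version}.zip"
```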

Member Author

Yup, either way makes enough sense to me. I just wanted to note it explicitly since we have two different 8-digit hexadecimal filename prefixes floating around, and it's not easy to tell that they're (supposed to be) different at a glance.
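To make the difference between the two prefixes concrete, here is a small sketch using Python's `hashlib`. The function names are hypothetical, and the URL encoding (UTF-8) is an assumption; the point is only that one prefix is derived from the URL string and the other from the downloaded bytes.

```python
import hashlib


def prefix_from_url(url: str) -> str:
    """Cache-style prefix: first 8 hex digits of the SHA1 of the URL text."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]


def prefix_from_file(contents: bytes) -> str:
    """Archive-style prefix: first 8 hex digits of the SHA1 of the file bytes."""
    return hashlib.sha1(contents).hexdigest()[:8]
```

If a mod moves to a new host, `prefix_from_url` changes while `prefix_from_file` stays the same for an identical file, which is why the two prefixes are not interchangeable.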

Member

Yeah, actually makes sense for URL in the cache. If the url changes, you probably do want to re-download.

: null;
}
}

/// <summary>
Member

Some tests of the filename generation could be useful, but I wouldn't treat this as a blocker.

3 participants