-
Notifications
You must be signed in to change notification settings - Fork 487
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DCOS-46615 - Retry downloading pkgpanda pkg if file size incorrect #4549
DCOS-46615 - Retry downloading pkgpanda pkg if file size incorrect #4549
Conversation
This repo has @mesosphere-mergebot integration. You can perform the following commands by submitting a comment. Submit a comment with content "@mesosphere-mergebot help" to view more detailed help text and examples. Be sure the have a look at the mergebot documentation, too.
|
@mesosphere-mergebot bump-ee |
Enterprise Bump PR: mesosphere/dcos-enterprise/pull/4694 |
89fd184
to
5f70d07
Compare
@mesosphere-mergebot bump-ee |
Enterprise Bump mesosphere/dcos-enterprise/pull/4694 updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start
pkgpanda/test_util.py
Outdated
mock_server_port = _get_free_port() | ||
mock_server = MockHTTPDownloadServer(('localhost', mock_server_port), MockDownloadServerRequestHandler) | ||
|
||
mock_server_thread = Thread(target=mock_server.serve_forever) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Threads in python are where the wheels come off for me. After some googling around this seems to be idiomatic code, but things like spawning threads that never terminate make me uneasy.
5f70d07
to
2a20875
Compare
@mesosphere-mergebot bump-ee |
Enterprise Bump mesosphere/dcos-enterprise/pull/4694 updated. |
2a20875
to
b695d36
Compare
@mesosphere-mergebot bump-ee |
Enterprise Bump mesosphere/dcos-enterprise/pull/4694 updated. |
b695d36
to
4c38f2d
Compare
@mesosphere-mergebot bump-ee |
Enterprise Bump mesosphere/dcos-enterprise/pull/4694 updated. |
@mesosphere-mergebot label Ready for Review |
@gpaul @adamtheturtle Can you re-review? I removed the useless legacy code, worked in all of your suggestions and switched to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💪 Looks great!
@mesosphere-mergebot bump-ee |
Your pull request's branch is not based on the most recent version of |
AWS S3, from which we download pre-built pkgpanda packages, might sometimes end the connection before the package has been fully downloaded. (See aws/aws-sdk-js#312 and boto/boto3#540 for more context on this) In our case that leads to invalid `tar.xz` archives which cause `unexpected end of input` errors and result in faulty builds of DC/OS (`dcos_generate_config.sh`) which contain these invalid archives and then result in install-time errors for users. See this comment for an investigation into how this can affect installation errors: https://jira.mesosphere.com/browse/DCOS_OSS-4097?focusedCommentId=219259&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-219259 This change here adds a sanity check to the `download` function used by pkgpanda. It checks whether the final file size (after flushing the file to disk) matches the `Content-Length` we get from S3. If it doesn't match, it sleep for 2, 4, 8, ... seconds and then retries until it either downloads the file completely or the retries are exhausted. In that case, it raises an exception.
b3287aa
4c38f2d
to
b3287aa
Compare
@mesosphere-mergebot bump-ee |
Enterprise Bump mesosphere/dcos-enterprise/pull/4694 updated. |
@mesosphere-mergebot sync |
Mergebot Sync SummarySync actions for https://github.com/mesosphere/dcos-enterprise/pull/4694:mergebot/review/approved/min_2: Approval Required from: @gpaul . Changes were requested in this PR. mergebot/enterprise/review/approved/min_2: Received 2 approvals. Aggregate Status Check: {'context': {'EE-Integration-Test/E2E/group-1': ('success', '2019-02-26T15:41:46Z'), 'EE-Integration-Test/E2E/group-2': ('success', '2019-02-26T16:12:47Z'), 'EE-Integration-Test/E2E/group-3': ('success', '2019-02-26T15:41:27Z'), 'EE-Integration-Test/E2E/group-4': ('success', '2019-02-26T15:48:27Z'), 'mergebot/has_ship-it': ('success', '2019-02-26T17:15:40Z'), 'mergebot/review/approved/min_2': ('success', '2019-02-26T17:10:57Z'), 'teamcity/dcos/build/aws/cloudformation/custom/permissive': ('success', '2019-02-26T14:50:25Z'), 'teamcity/dcos/build/aws/cloudformation/custom/strict': ('success', '2019-02-26T14:50:19Z'), 'teamcity/dcos/build/dcos': ('success', '2019-02-26T14:46:30Z'), 'teamcity/dcos/build/tox': ('success', '2019-02-26T14:35:16Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/1': ('success', '2019-02-26T15:28:22Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/2': ('success', '2019-02-26T15:55:55Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/3': ('success', '2019-02-26T15:44:54Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/4': ('success', '2019-02-26T15:49:55Z')}, 'state': 'success'} |
@mesosphere-mergebot sync |
Mergebot Sync SummarySync actions for https://github.com/mesosphere/dcos-enterprise/pull/4694:mergebot/review/approved/min_2: Received 3 approvals. mergebot/enterprise/review/approved/min_2: Received 2 approvals. Aggregate Status Check: {'context': {'EE-Integration-Test/E2E/group-1': ('success', '2019-02-26T15:41:46Z'), 'EE-Integration-Test/E2E/group-2': ('success', '2019-02-26T16:12:47Z'), 'EE-Integration-Test/E2E/group-3': ('success', '2019-02-26T15:41:27Z'), 'EE-Integration-Test/E2E/group-4': ('success', '2019-02-26T15:48:27Z'), 'mergebot/has_ship-it': ('success', '2019-02-26T17:15:40Z'), 'mergebot/review/approved/min_2': ('success', '2019-02-26T17:10:57Z'), 'teamcity/dcos/build/aws/cloudformation/custom/permissive': ('success', '2019-02-26T14:50:25Z'), 'teamcity/dcos/build/aws/cloudformation/custom/strict': ('success', '2019-02-26T14:50:19Z'), 'teamcity/dcos/build/dcos': ('success', '2019-02-26T14:46:30Z'), 'teamcity/dcos/build/tox': ('success', '2019-02-26T14:35:16Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/1': ('success', '2019-02-26T15:28:22Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/2': ('success', '2019-02-26T15:55:55Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/3': ('success', '2019-02-26T15:44:54Z'), 'teamcity/dcos/test/aws/onprem/static/strict/group/4': ('success', '2019-02-26T15:49:55Z')}, 'state': 'success'} |
@mesosphere-mergebot merge-it |
Enterprise Bump PR mesosphere/dcos-enterprise/pull/4694, and the current PR #4549 are now merged. |
High-level description
AWS S3, from which we download pre-built pkgpanda packages, might sometimes end the connection before the package has been fully downloaded. (See aws/aws-sdk-js#312 and boto/boto3#540 for more context on this)
In our case that leads to invalid
tar.xz
archives which causeunexpected end of input
errors and result in faulty builds of DC/OS (dcos_generate_config.sh
) which contain these invalid archives and then result in install-time errors for users. See this comment for an investigation into how this can affect installation errors: https://jira.mesosphere.com/browse/DCOS_OSS-4097?focusedCommentId=219259&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-219259This change here adds a sanity check to the
download
function used by pkgpanda. It checks whether the final file size (after flushing the file to disk) matches theContent-Length
we get from S3. If it doesn't match, it sleep for 2, 4, 8, ... seconds and then retries until it either downloads the file completely or the retries are exhausted. In that case, it raises an exception.Corresponding DC/OS tickets (obligatory)
These DC/OS JIRA ticket(s) must be updated (ideally closed) in the moment this PR lands:
Related tickets (optional)
Other tickets related to this change:
Checklist for all PRs
CHANGES.md
or explain why this is not a user-facing change: it's internal behaviour that's not exposed to the userChecklist for component/package updates:
If you are changing components or packages in DC/OS (e.g. you are bumping the sha or ref of anything underneath
packages
), then in addition to the above please also include:PLEASE FILL IN THE TEMPLATE ABOVE / DO NOT REMOVE ANY SECTIONS ABOVE THIS LINE
Instructions and review process
What is the review process and when will my changes land?
All PRs require 2 approvals using GitHub's pull request reviews.
Reviewers should be:
It is best to proactively ask for 2 reviews by @mentioning the candidate reviewers in the PR comments area. The responsibility is on the developer submitting the PR to follow-up with reviewers and make sure a PR is reviewed in a timely manner. Once a PR has 2 ship-it's, no red reviews, and all tests are green it will be included in the next train.