-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mikedld #9662
Mikedld #9662
Conversation
If timeout (or some network error?) occurs while waiting for a pod initialization or termination event, e.g. when build takes a long time, skaffold becomes stuck and never finishes the operation. Use retry watcher to handle the errors gracefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @alphanota, reviewers, and future viewers!
I'm Gemini, currently reviewing this pull request. In the meantime, I'm providing a summary of its content to help everyone quickly get up to speed.
From what I can gather, this PR aims to improve the WaitForPod
functions in Skaffold's Kubernetes integration by adding retry logic to the pod watcher. This should make Skaffold more resilient to temporary network issues or API server hiccups when waiting for pods to reach a specific state.
Here's a breakdown of the changes:
- Added
newPodsWatcher
function: This function creates a new pod watcher that useswatchtools.NewRetryWatcher
to retry watch requests on failure. This is the core change that introduces the retry logic. - Modified
WaitForPodSucceeded
,WaitForPodInitialized
, andWaitForDeploymentToStabilize
: These functions now use thenewPodsWatcher
function to create their pod watchers, enabling retry functionality. - Added import statements: Necessary imports for
cache
andwatchtools
were added topkg/skaffold/kubernetes/wait.go
. - Updated test cases: Test cases in
pkg/skaffold/kubernetes/wait_test.go
were updated to account for the new retry logic and ensure correctness. This includes adding reactors for list requests to simulate initial list responses from the Kubernetes API server. - Removed
vendor/github.com/letsencrypt/boulder/errors/errors.go
: This file was completely removed. It's unclear from the context alone why this was done, so further investigation may be needed to understand the implications. - Removed
vendor/github.com/letsencrypt/boulder/sa/proto/sa.proto
: This file was completely removed. It's unclear from the context alone why this was done, so further investigation may be needed to understand the implications. - Removed
vendor/gopkg.in/go-jose/go-jose.v2/.gitcookies.sh.enc
,vendor/gopkg.in/go-jose/go-jose.v2/.gitignore
,vendor/gopkg.in/go-jose/go-jose.v2/.travis.yml
,vendor/gopkg.in/go-jose/go-jose.v2/CHANGELOG.md
,vendor/gopkg.in/go-jose/go-jose.v2/CONTRIBUTING.md
,vendor/gopkg.in/go-jose/go-jose.v2/LICENSE
,vendor/gopkg.in/go-jose/go-jose.v2/README.md
,vendor/gopkg.in/go-jose/go-jose.v2/asymmetric.go
,vendor/gopkg.in/go-jose/go-jose.v2/cipher/cbc_hmac.go
,vendor/gopkg.in/go-jose/go-jose.v2/cipher/concat_kdf.go
,vendor/gopkg.in/go-jose/go-jose.v2/cipher/ecdh_es.go
,vendor/gopkg.in/go-jose/go-jose.v2/cipher/key_wrap.go
,vendor/gopkg.in/go-jose/go-jose.v2/crypter.go
,vendor/gopkg.in/go-jose/go-jose.v2/doc.go
,vendor/gopkg.in/go-jose/go-jose.v2/encoding.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/LICENSE
,vendor/gopkg.in/go-jose/go-jose.v2/json/README.md
,vendor/gopkg.in/go-jose/go-jose.v2/json/decode.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/encode.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/indent.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/scanner.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/stream.go
,vendor/gopkg.in/go-jose/go-jose.v2/json/tags.go
,vendor/gopkg.in/go-jose/go-jose.v2/jwe.go
,vendor/gopkg.in/go-jose/go-jose.v2/jwk.go
,vendor/gopkg.in/go-jose/go-jose.v2/jws.go
,vendor/gopkg.in/go-jose/go-jose.v2/opaque.go
,vendor/gopkg.in/go-jose/go-jose.v2/shared.go
,vendor/gopkg.in/go-jose/go-jose.v2/signing.go
,vendor/gopkg.in/go-jose/go-jose.v2/symmetric.go
: These files were removed. It's unclear from the context alone why this was done, so further investigation may be needed to understand the implications.
Please note: The removal of the vendor files is significant and requires further investigation to determine its impact on the project's dependencies and functionality. I recommend carefully reviewing this aspect of the PR.
I'll be providing a more detailed review shortly.
Meanwhile, enjoy this haiku:
Pods wait, slow to start,
Retry logic, swift and sure,
Skaffold builds anew.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR refactors the pod watching mechanism in Skaffold to use a retrying watcher, which is a positive change for robustness. However, the removal of several files related to Boulder and go-jose v2 requires more explanation and justification. The lack of detail in the PR title and description makes it difficult to understand the full scope and impact of these changes. Please provide more context and address the specific points raised in the review comments.
Fixes: #nnn
Related: Relevant tracking issues, for context
Merge before/after: Dependent or prerequisite PRs
Description
User facing changes (remove if N/A)
Follow-up Work (remove if N/A)