Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure port forwards do not select terminating pods and are ready before use #9628

Closed
wants to merge 78 commits into from

Conversation

jbohanon
Copy link
Contributor

@jbohanon jbohanon commented Jun 17, 2024

Description

This PR aims to fix flakiness in testing code which relies on port-forwards.

Code changes

  • Check that pods are not in terminating status before attempting to port-forward.
    • The cli portforwarder does this by using a util to make sure no pods show Terminating from a kubectl get
    • The api portforwarder does this by making sure the pod resource doesn't have a deletionTimestamp

Context

Port forwards in Gloo have been historically flaky. There are two main root causes for this that were uncovered in this investigation:

  • The function that starts the port-forward could return nil (no error) before the port-forward was fully initialized. This would lead to errors when the calling code tried to dial on the forwarded address immediately after receiving the nil return.
  • The pod selection code allowed for pods in Terminating status to be selected as the target for the port-forward. This would lead to errors when the selected Terminating pod was finally killed and the port-forward failed.

Testing steps

TODO

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works

@solo-changelog-bot
Copy link

Issues linked to changelog:
#9353

@github-actions github-actions bot added the keep pr updated signals bulldozer to keep pr up to date with base branch label Jun 17, 2024
Copy link

github-actions bot commented Jun 17, 2024

Visit the preview URL for this PR (updated for commit 86c477b):

https://gloo-edge--pr9628-jbohanon-port-forwar-nw9o18rd.web.app

(expires Wed, 07 Aug 2024 15:13:11 GMT)

🔥 via Firebase Hosting GitHub Action 🌎

Sign: 77c2b86e287749579b7ff9cadb81e099042ef677

@sam-heilbron sam-heilbron added work in progress signals bulldozer to keep pr open (don't auto-merge) and removed keep pr updated signals bulldozer to keep pr up to date with base branch labels Jul 25, 2024
@jbohanon jbohanon added keep pr updated signals bulldozer to keep pr up to date with base branch and removed work in progress signals bulldozer to keep pr open (don't auto-merge) labels Jul 30, 2024
@jbohanon jbohanon removed the keep pr updated signals bulldozer to keep pr up to date with base branch label Jul 31, 2024
@jbohanon
Copy link
Contributor Author

jbohanon commented Aug 1, 2024

Closing as the proposed changes have caused timeouts in kubernetes e2e tests, and this has been deprioritized due to the relative pain of the flake being low.

The work to get this done is still being tracked by #9353

@jbohanon jbohanon closed this Aug 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants