[k8s_cloud_beta1] Adding support for ssh using kubectl port-forward to access k8s instance #2412
For this method, I would err on the side of over-documenting. It would be good to add details here on:
@romilbhardwaj I wrote documentation that addresses both bullet points. Please take a look!
Can we use the enum `KubernetesNetworkingMode` everywhere to avoid hardcoding the strings 'nodeport' and 'portforward'? We can also ask the user to use the same string in the config file and then read it directly.
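A minimal sketch of the enum-based approach suggested above, assuming the config file stores the mode as a plain string. The member names, the string values, and the `from_str` helper are illustrative assumptions, not necessarily what the PR implements:

```python
import enum


class KubernetesNetworkingMode(enum.Enum):
    """Ways of reaching a pod over SSH (values are the assumed config strings)."""
    NODEPORT = 'nodeport'
    PORTFORWARD = 'portforward'

    @classmethod
    def from_str(cls, mode: str) -> 'KubernetesNetworkingMode':
        """Map a config-file string to an enum member, case-insensitively."""
        try:
            return cls(mode.lower())
        except ValueError:
            valid = ', '.join(m.value for m in cls)
            raise ValueError(
                f'Unsupported networking mode {mode!r}; '
                f'expected one of: {valid}') from None
```

Callers can then compare against `KubernetesNetworkingMode.PORTFORWARD` instead of the literal 'portforward', and a typo in the config file fails early with a readable message.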
@romilbhardwaj I'm wondering if we should create an `enum` for service types as well, since some places (`setup_kubernetes_authentication`, `setup_sshjump_svc`) currently use `NodePort` and `ClusterIP` as hardcoded strings. What do you think?
Yes, I think it'll be good to have a `KubernetesServiceType` enum.
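One possible shape for such an enum, shown with a hypothetical helper that builds a minimal Service manifest. `build_ssh_service_manifest`, its parameters, and the `selector` labels are assumptions made for illustration; only the `NodePort` and `ClusterIP` type strings come from the discussion:

```python
import enum
from typing import Any, Dict


class KubernetesServiceType(enum.Enum):
    """Kubernetes Service types referred to in this thread."""
    NODEPORT = 'NodePort'
    CLUSTERIP = 'ClusterIP'


def build_ssh_service_manifest(name: str,
                               service_type: KubernetesServiceType,
                               selector: Dict[str, str],
                               port: int = 22) -> Dict[str, Any]:
    """Return a minimal v1 Service manifest exposing an SSH port.

    Hypothetical helper: it only illustrates passing the enum around and
    converting to the raw string at the API boundary.
    """
    return {
        'apiVersion': 'v1',
        'kind': 'Service',
        'metadata': {'name': name},
        'spec': {
            'type': service_type.value,  # the only place the string appears
            'selector': selector,
            'ports': [{'port': port, 'targetPort': port}],
        },
    }
```

Keeping the string conversion at the point where the manifest is built means the rest of the code never compares against 'NodePort' or 'ClusterIP' directly.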
Can you run `pytest tests/test_smoke.py --kubernetes -k "not TestStorageWithCredentials"` to make sure everything, including file_mounts, works correctly? I have manually verified, but going forward we want to run Kubernetes smoke tests for k8s PRs :)
@romilbhardwaj This branch currently passes all the tests except the ones requiring GPUs.
Do we need this? We ideally want to operate with whatever `KUBECONFIG` var is set by the user. I know some parts of our code won't support a custom `KUBECONFIG`, but we should avoid adding new code that puts a strict dependency on a kubeconfig path set by us.
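As a rough illustration of honoring the user's setting rather than a fixed path, the sketch below resolves kubeconfig locations the way `kubectl` does: a path-separator-delimited `KUBECONFIG` list, falling back to `~/.kube/config`. The function name `resolve_kubeconfig_paths` is hypothetical and not part of this PR:

```python
import os
from typing import List, Mapping


def resolve_kubeconfig_paths(
        environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the kubeconfig files to use, preferring the user's KUBECONFIG.

    KUBECONFIG may list several files separated by os.pathsep
    (':' on Linux/macOS, ';' on Windows).
    """
    raw = environ.get('KUBECONFIG', '')
    paths = [p for p in raw.split(os.pathsep) if p]
    if not paths:
        # Default location used by kubectl when KUBECONFIG is unset.
        paths = [os.path.join('~', '.kube', 'config')]
    return [os.path.expanduser(p) for p in paths]
```

Passing the environment in as a parameter keeps the helper easy to test without mutating `os.environ`.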
Gotcha. Got them removed.
I was trying this out and it silently failed to ssh for a long time before I realized I don't have socat installed.
Is it possible to check if `socat` is installed at the start of the script, raise an error if it's not, and propagate this error cleanly up to the user? Otherwise, we may want to add a check for whether `socat` is installed elsewhere in our code...
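On the Python side, the check asked for above could look roughly like the following. This is a sketch only: `ensure_installed` is a hypothetical helper, and the error wording is an assumption. It relies on the standard-library `shutil.which`, which returns `None` when a binary is not on `PATH`:

```python
import shutil
from typing import Callable, Optional


def ensure_installed(
        binary: str,
        which: Callable[[str], Optional[str]] = shutil.which) -> str:
    """Return the resolved path of `binary`, or raise a clear error.

    `which` is injectable so the check can be tested without touching PATH.
    """
    path = which(binary)
    if path is None:
        raise RuntimeError(
            f'`{binary}` is required for port-forward mode but was not '
            f'found on PATH. Please install `{binary}` and try again.')
    return path
```

Calling `ensure_installed('socat')` while port-forward mode is being set up surfaces the problem immediately instead of leaving the ssh connection to hang.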
@romilbhardwaj I added a check for `socat` installation at the beginning of the script. If it's not installed, the script displays an error message and exits, so the message is shown when a user attempts to `ssh <k8s-instance-name>` without `socat` installed.
But there doesn't seem to be a clean way to handle this exit and raise an error message for every possible ssh session that runs within SkyPilot. So I added another check for `socat` installation in `authentication.py/setup_kubernetes_authentication.py` when 'port-forward' mode is being set up.
Output was attached for running `sky launch`, `ssh <k8s-instance-name>`, and `sky exec` (not preserved here).
On GKE clusters, this PR currently uses the external IP of the nodes (`sky@34.16.78.81`). This will not work if the nodes are behind a firewall. It should instead connect to `sky@127.0.0.1`, since the port-forward is running locally.
You may need to update `get_external_ip` to accept an arg with the `KubernetesNetworkingMode`, which would return 127.0.0.1 if the mode is port-forward, else use the existing logic if the mode is nodeport.
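The suggested change might be sketched as below. The real signature of `get_external_ip` is not shown in this thread, so the parameters here are assumptions; `lookup_node_ip` is a hypothetical stand-in for the existing node-IP logic, and the enum is redefined only to keep the example self-contained:

```python
import enum
from typing import Callable


class KubernetesNetworkingMode(enum.Enum):
    """Assumed enum values; redefined here so the example runs on its own."""
    NODEPORT = 'nodeport'
    PORTFORWARD = 'portforward'


def get_external_ip(mode: KubernetesNetworkingMode,
                    lookup_node_ip: Callable[[], str]) -> str:
    """Pick the address ssh should connect to for the given mode."""
    if mode is KubernetesNetworkingMode.PORTFORWARD:
        # kubectl port-forward listens locally, so ssh targets loopback.
        return '127.0.0.1'
    # NodePort mode: fall back to the existing node-IP lookup.
    return lookup_node_ip()
```

With this shape, port-forward mode never queries the node's external IP, so it keeps working when the nodes sit behind a firewall.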