[Bug] Re-enable flaky kubectl plugin e2e test in kubectl_ray_job_submit_test.go #3124

7 changes: 7 additions & 0 deletions .buildkite/setup-env.sh
@@ -27,5 +27,12 @@ mv linux-amd64/helm /usr/local/bin/helm
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update

# Install python 3.11 and pip
apt-get update
apt-get install -y python3.11 python3-pip

# Install requirements
pip install --break-system-packages ray[default]==2.41.0

# Bypass Git's ownership check due to unconventional user IDs in Docker containers
git config --global --add safe.directory /workdir
6 changes: 4 additions & 2 deletions kubectl-plugin/test/e2e/kubectl_ray_job_submit_test.go
@@ -31,7 +31,8 @@ var _ = Describe("Calling ray plugin `job submit` command on Ray Job", func() {
})

It("succeed in submitting RayJob", func() {
Skip("Skip this test as it is failing on CI")
killKubectlCmd := exec.Command("pkill", "-9", "kubectl")
_ = killKubectlCmd.Run()
Comment on lines +34 to +35
Member

Why?

Contributor Author

Why are these changes needed?

There are two reasons the test fails:

  • Some kubectl processes occupy port 8265, which kubectl ray job submit requires. To keep the tests independent, we should not rely on the test that causes this problem to kill its process properly every time, so I added pkill kubectl before running the job submit tests.
  • Ray is not installed in the Buildkite environment, so I modified setup-env.sh to install it.

Member

@MortalHappiness Feb 28, 2025

We should also examine which tests occupy this port. This can be done in follow-up PRs. Once we find the test, we need to kill the kubectl process at the end of that test.
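
Killing the process at the end of the test that started it could look roughly like the sketch below. This is an illustration under assumptions, not code from this PR: startBackground is a hypothetical helper, and sleep stands in for whatever long-running kubectl command the offending test launches. In a Ginkgo suite the returned stop function would typically be registered as cleanup for the test.

```go
package main

import (
	"fmt"
	"os/exec"
)

// startBackground starts a long-running command and returns it together with
// a stop function that kills the process and waits for it to exit.
// Hypothetical helper for illustration only.
func startBackground(name string, args ...string) (*exec.Cmd, func(), error) {
	cmd := exec.Command(name, args...)
	if err := cmd.Start(); err != nil {
		return nil, nil, err
	}
	stop := func() {
		_ = cmd.Process.Kill() // error ignored: the process may already have exited
		_ = cmd.Wait()         // reap the process so that ProcessState is populated
	}
	return cmd, stop, nil
}

func main() {
	// Assumption: "sleep 30" is a stand-in for the real background command.
	cmd, stop, err := startBackground("sleep", "30")
	if err != nil {
		fmt.Println("start failed:", err)
		return
	}
	stop()
	fmt.Println("exited:", cmd.ProcessState != nil)
}
```

Because each test stops only the process it started, later tests no longer need a global pkill to free the port.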

cmd := exec.Command("kubectl", "ray", "job", "submit", "--namespace", namespace, "-f", rayJobFilePath, "--working-dir", kubectlRayJobWorkingDir, "--", "python", entrypointSampleFileName)
output, err := cmd.CombinedOutput()

@@ -67,7 +68,8 @@ var _ = Describe("Calling ray plugin `job submit` command on Ray Job", func() {
})

It("succeed in submitting RayJob with runtime environment set with working dir", func() {
Skip("Skip this test as it is failing on CI")
killKubectlCmd := exec.Command("pkill", "-9", "kubectl")
_ = killKubectlCmd.Run()
runtimeEnvFilePath := path.Join(kubectlRayJobWorkingDir, runtimeEnvSampleFileName)
cmd := exec.Command("kubectl", "ray", "job", "submit", "--namespace", namespace, "-f", rayJobNoEnvFilePath, "--runtime-env", runtimeEnvFilePath, "--", "python", entrypointSampleFileName)
output, err := cmd.CombinedOutput()