Bug: Can not re-apply a same job when the old pods are Terminating #2284
Comments
Yes, this is a bug about dependsOn.
/assign @hwdef
same issue #2130
Oh, thx. Can you tell me how to restart the volcano-controller?
kubectl delete po -n volcano-system -l app=volcano-controller
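For anyone who wants to script that restart, here is a minimal sketch that deletes the controller pods and then waits for their replacements. The namespace and label are taken from the command above. The `KUBECTL` variable, the function name, and the 120s timeout are illustrative assumptions, and the sketch assumes the controller pods are recreated automatically (for example by a Deployment).

```shell
#!/bin/sh
# Sketch only: restart the Volcano controller and wait for it to come back.
# Namespace and label selector come from the command quoted above;
# KUBECTL, the function name, and the timeout are illustrative.
KUBECTL="${KUBECTL:-kubectl}"

restart_volcano_controller() {
    ns="volcano-system"
    selector="app=volcano-controller"

    # Delete the controller pods; their owner is assumed to recreate them.
    "$KUBECTL" delete pod -n "$ns" -l "$selector" || return 1

    # Wait for the replacement pods to report Ready. This can fail if no
    # replacement pod exists yet at this moment; re-run it in that case.
    "$KUBECTL" wait --for=condition=Ready pod -n "$ns" -l "$selector" --timeout=120s
}
```

Calling `restart_volcano_controller` runs both steps. Setting `KUBECTL` to another binary or wrapper before the call changes which client is used.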
/close
@hwdef: Closing this issue.
What happened:
I used
kubectl apply -f hvd-job-torch-fairmot-2gpu.yaml
to apply a job with 1 master pod (CPU task) and 2 worker pods (GPU task), and
kubectl get pod
listed the pods. Once the job was running, the pods were in the Running state. I then used
kubectl delete -f hvd-job-torch-fairmot-2gpu.yaml
to delete the job, and the pods went to Terminating. Before all the pods had been deleted completely, I re-applied the same job. The old pods were deleted after a few moments, but the new pods did not start. Even after I delete the job and re-apply it again, the pods still do not start.
The output of
kubectl describe -f hvd-job-torch-fairmot-2gpu.yaml
is as follows:
What you expected to happen:
The same job can be applied while the old pods are still Terminating, and
kubectl get pods
reports the pods' status normally.
How to reproduce it (as minimally and precisely as possible):
First, run
kubectl apply -f hvd-job-torch-fairmot-2gpu.yaml
The job runs normally and the pods are Running.
Then run
kubectl delete -f hvd-job-torch-fairmot-2gpu.yaml
The pods go to Terminating.
Anything else we need to know?:
Environment:
Kubernetes version (kubectl version): 1.21.3
Kernel (uname -a): 3.10.0-957.5.1.el7.x86_64
Install tools: kubectl apply -f volcano-development.yaml
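The sequence described in this report can be sketched as a small script, which may help when checking whether a given cluster is affected. It only replays the commands from the report: apply, delete, re-apply while the old pods are still Terminating, then inspect. The `KUBECTL`, `JOB_FILE`, and `WAIT_SECONDS` variables and the function name are illustrative assumptions. The fixed pause is a crude stand-in for "wait until the pods are Running".

```shell
#!/bin/sh
# Sketch only: replay the apply / delete / re-apply sequence from the report.
# KUBECTL, JOB_FILE, WAIT_SECONDS and the function name are illustrative.
KUBECTL="${KUBECTL:-kubectl}"
JOB_FILE="${JOB_FILE:-hvd-job-torch-fairmot-2gpu.yaml}"

reproduce_reapply() {
    # 1. Create the job and list its pods.
    "$KUBECTL" apply -f "$JOB_FILE" || return 1
    "$KUBECTL" get pod

    # Give the pods time to reach Running (crude fixed pause).
    sleep "${WAIT_SECONDS:-60}"

    # 2. Delete the job; its pods move to Terminating.
    "$KUBECTL" delete -f "$JOB_FILE" || return 1

    # 3. Re-apply immediately, while the old pods are still Terminating.
    "$KUBECTL" apply -f "$JOB_FILE" || return 1

    # 4. Inspect the result; in the report, the new pods never start.
    "$KUBECTL" get pod
    "$KUBECTL" describe -f "$JOB_FILE"
}
```

Run `reproduce_reapply` against a test cluster only, since it deletes and recreates the job named in `JOB_FILE`.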