Leave Resource Request Empty Instead of Requesting 0 #1045
In the documentation it states that, obviously, if the platform has a LimitRange set that prevents 0 from being the minimum, the task will not run. This should be fixed either by setting the request to the same value as that of the container image with the largest resource request, or by not setting a value at all: if there's a LimitRange defining a minimum, let the LimitRange apply it. In the example above, none of the images have a request defined, but it seems the LimitRange has applied its value to the first container created, and hence the second will be set to 0. cc/ @siamaksade
IIRC, the reason we set the requests to 0 explicitly in the fix for #598 is that if you leave the request unset, the Pod still ends up summing the default values for the containers, and the values used for limits become the values for requests (see #598 (comment) for some discussion of this). I think the best fix here would be to load the LimitRange for Containers somewhere in pod.go, and then change the zeroed-out requests to use that minimum instead of 0.
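This is not how pod.go is actually written, but a minimal sketch of the "load the LimitRange for Containers" part, assuming a client-go clientset is available; `containerMinimums` is a hypothetical helper name, not Tekton's API.

```go
// Hypothetical helper (not Tekton's actual code): look up the per-Container
// minimums defined by any LimitRange in the TaskRun's namespace.
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// containerMinimums returns the Min resource list of the first
// Container-scoped LimitRangeItem found in the namespace, or nil if the
// namespace has no such LimitRange.
func containerMinimums(kc kubernetes.Interface, namespace string) (corev1.ResourceList, error) {
	// Newer client-go versions also take a context.Context as the first List argument.
	lrs, err := kc.CoreV1().LimitRanges(namespace).List(metav1.ListOptions{})
	if err != nil {
		return nil, err
	}
	for _, lr := range lrs.Items {
		for _, item := range lr.Spec.Limits {
			if item.Type == corev1.LimitTypeContainer {
				return item.Min, nil
			}
		}
	}
	return nil, nil
}
```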
That would be good, because if a Task runs all the steps as concurrent containers, all the memory requests will be added up. Some of them are using this image. I'm using the Pipeline defined in this tutorial, so it's really easy to reproduce.
All requests you set on steps in a Task other than the largest one will be replaced with 0. Depending on exactly how the minimum values of a LimitRange for a Container are enforced (per-container or per-pod?), it might mean that we actually need to set the minimum on every container. If we did that, then to still have #598 fixed, we'd need to change the largest request from whatever it is (say X) to X minus the sum of the minimums applied to the other containers. At that point this all starts looking pretty complicated, so I wonder if there is a better way to control how Pod-level resource requests are computed to fix #598 without having to modify what users have specified.
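To make the trade-off concrete, here is a minimal sketch of that adjustment for memory only, assuming the per-Container minimum has already been read from the LimitRange; `adjustMemoryRequests` and its shape are illustrative, not a proposed implementation.

```go
// Illustrative only: give every non-largest step the LimitRange minimum and
// shrink the largest request so the summed Pod-level request stays at X.
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func adjustMemoryRequests(steps []corev1.ResourceList, min resource.Quantity) {
	// Find the step with the largest memory request (X).
	largest, largestIdx := resource.Quantity{}, -1
	for i, req := range steps {
		if mem, ok := req[corev1.ResourceMemory]; ok && mem.Cmp(largest) > 0 {
			largest, largestIdx = mem.DeepCopy(), i
		}
	}
	if largestIdx < 0 {
		return // no explicit requests at all; nothing to preserve
	}
	// Every other step gets the minimum instead of 0, and X is reduced by the
	// same amount so the Pod still requests X in total.
	adjusted := largest.DeepCopy()
	for i := range steps {
		if i == largestIdx {
			continue
		}
		if steps[i] == nil {
			steps[i] = corev1.ResourceList{}
		}
		steps[i][corev1.ResourceMemory] = min
		adjusted.Sub(min)
	}
	if adjusted.Cmp(min) < 0 {
		adjusted = min // never drop below the enforced minimum
	}
	steps[largestIdx][corev1.ResourceMemory] = adjusted
}
```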
I've verified that I can set the requests on every container on every step to prevent them from being calculated and hence zeroed. I've also verified that the containers the framework adds (and for which I therefore cannot explicitly set the requests) are all zeroed, except for the last one I described. Here's the pod definition, and in line 333 you can see the last container without this computed request. I guess this is a bug. LimitRanges are enforced by cluster administrators to make sure that resources are not wasted. Even without discussing whether having a minimum request is a good or bad idea, the reality is that it happens, and there's nothing we can do about it other than accept this fact and compute resources for the TaskRun pod in a mindful way.
I think they would all be zeroed except for the one with the largest request. In your example pod definition, all requests are zeroed except for the one container which requested 6Mi of memory. Is that what you mean?
Yeah, I'm not sure why no requests are set for the nop container, or why the nop container is even present. I think I can see why that would have been broken before ace3ab6#diff-9177f963c9c62cb8e53097a81c571619L293, since the nop container was added after the resource requests were modified, but it looks like the nop container is no longer added as of that commit, released in 0.5.0. What version of Tekton are you running?
I think we are in agreement here; if you have any thoughts on how to fix it without regressing #598, they would be welcome. Right now the only idea I have is what I mentioned in #1045 (comment).
LimitRange is used, and although a best practice for namespaces where Tekton Tasks are running is to definitely not set one, it happens. I agree that we could certainly get fancy w.r.t. calculating an upper limit on requests. For the lower limit, though, we should update our request zeroing to explicitly set the request to the LimitRange minimum, or 0 if not present. Leaving it blank will use either the LimitRange default or an internal implementation-specific default, which are typically higher.
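A sketch of that lower-bound rule, under the assumption that `limitRangeMin` has already been loaded from the namespace's LimitRange (possibly empty); the function name and shape are illustrative only.

```go
// Illustrative sketch: requests are always set explicitly, to the LimitRange
// minimum when one exists and to 0 otherwise, so that neither the LimitRange
// default nor an implementation-specific default is injected for an unset value.
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

func lowerBoundRequests(limitRangeMin corev1.ResourceList) corev1.ResourceList {
	out := corev1.ResourceList{
		corev1.ResourceCPU:              resource.MustParse("0"),
		corev1.ResourceMemory:           resource.MustParse("0"),
		corev1.ResourceEphemeralStorage: resource.MustParse("0"),
	}
	for name, min := range limitRangeMin {
		out[name] = min // use the LimitRange minimum instead of 0 when present
	}
	return out
}
```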
I think it’s a good idea to honor a LimitRange. I think it’s a better approach to have the default be empty and trust the LimitRange.
The problem is that if we set the request to the LimitRange minimum or the default request, #598 becomes a problem again, because now the default/minimum values will be combined to compute the resources required for the Pod even though the containers only execute one at a time, so the Pod requires more resources than it will ever use at any single point in time. That might be fine if you use the minimum values, because the additional resources required will probably not be significantly different from the maximum value. For example, with that change, say we have 6 steps for a Task, the max memory request is 1Gi, and the LimitRange minimum is 10Mi; then the Pod for the Task will request about 1.05Gi even though it only requires 1Gi. If you use the default request values, not the minimums, then the excess could be a lot worse, since the default is probably significantly higher than the minimum. It seems like the fundamental issue is that we are doing tricky things that are opaque to Kubernetes to make #598 work.
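A quick, standalone check of that arithmetic using the Kubernetes quantity type (this is just a demonstration of the overhead claim, not pipeline code):

```go
// 6 steps: the largest keeps its 1Gi request, the other 5 are raised from 0 to
// the 10Mi LimitRange minimum, so the Pod-level sum is 1Gi + 5*10Mi, roughly
// 1.05Gi, even though only 1Gi is ever in use at a time.
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	total := resource.MustParse("1Gi") // the step that keeps its request
	min := resource.MustParse("10Mi")  // LimitRange minimum for each of the other 5 steps
	for i := 0; i < 5; i++ {
		total.Add(min)
	}
	fmt.Println(total.String()) // 1074Mi, i.e. roughly 1.05Gi
}
```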
My apologies, I didn’t fully go through the discussions in #598, and I now better understand the proposal from @skaegi. If the goal here is a short-term fix, I think it sounds reasonable. One remaining question is whether it makes sense to require resources to be defined for a Task; however, that seems like it would be a poor user experience. I'm not sure if there have been any discussions around that.
IMHO there should be the ability for the user to explicitly set these values on every container that Tekton adds to the pod, so they can fix the problem if it arises. A LimitRange is typically something a user/developer doesn't control, and a problem like not being able to schedule a pipeline can be really tricky to fix; the user/developer should be able to fix these things as much as possible.

I think that as long as the user can explicitly set this value instead of relying on the LimitRange in case of problems, then leaving the request as 0 or non-0 should not be a big issue, as it's solvable by the user in case of problems.
/assign
I recently went through the pipelines-tutorial in the OpenShift org and encountered an error message reporting `Missing or invalid Task pipelines-tutorial/s2i-java-8`.

This occurred because my OpenShift project had a `LimitRange` that enforces a minimum container CPU request; a hypothetical example of such a `LimitRange` is sketched below.
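The exact `LimitRange` from that project isn't reproduced here, so this is a hypothetical one expressed with the Kubernetes Go API types; the object name and the 10m value are invented for illustration.

```go
// Hypothetical LimitRange (name and values invented): it enforces a minimum
// container CPU request, so a container whose CPU request is 0 is rejected.
package limitrange

import (
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

var exampleLimitRange = corev1.LimitRange{
	ObjectMeta: metav1.ObjectMeta{Name: "resource-limits"},
	Spec: corev1.LimitRangeSpec{
		Limits: []corev1.LimitRangeItem{{
			Type: corev1.LimitTypeContainer,
			Min:  corev1.ResourceList{corev1.ResourceCPU: resource.MustParse("10m")},
		}},
	},
}
```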
Since no container CPU resource request was specified for the pipeline, the resource request was set to 0. This caused the `pipelinerun` that is part of the tutorial to fail.

If no amount had been requested at all, the `LimitRange`'s specifications would have been used for the `task`s' containers, and I think this should be the approach moving forward. I believe the source in question can be found here.

In general, I think the error message should be improved to better reflect the situation. `Missing or invalid Task pipelines-tutorial/s2i-java-8` is not what was wrong with the `pipelinerun`. The issue was that the CPU resource request was 0, which was less than the `LimitRange`'s min value.

More on how I ran into this issue can be found here.
Expected Behavior
The error message should not have `Missing or invalid Task pipelines-tutorial/s2i-java-8` as the reason the `pipelinerun` failed. It should focus more on the `LimitRange` aspect of the issue.

The default value for a resource request should be the minimum of a `LimitRange` instead of 0.

Actual Behavior
Error message in issue description.
Steps to Reproduce the Problem
Set a `LimitRange` minimum greater than 0 on an OpenShift 4 cluster namespace and trigger a `pipelinerun` with the default resource request.

Additional Info
This was run on a Red Hat Product Demo System (RHPDS) OpenShift 4.1 cluster.