-
Notifications
You must be signed in to change notification settings - Fork 159
Add pod start time #322
Add pod start time #322
Conversation
/assign @imjasonh |
I really want to add this build example as part of this PR as it shows Ideally the following example should fail because pods are never created and build times out after 3 minutes. The problem is that as the build times out, pods are terminated and pod termination is unsuccessful (because pod does not exist) and build status is not updated to failed which causes build to remain in
One way to approach this problem is instead of just bubbling of all pod termination errors, it might be better to filter out |
pkg/builder/nop/builder_test.go
Outdated
if bs.StartTime.IsZero() { | ||
t.Errorf("bs.StartTime; want non-zero, got %v", bs.StartTime) | ||
t.Errorf("status.StartTime; want non-zero, got %v", bs.StartTime) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you update the other error messages for consistency? Or return this to bs.
if err != nil { | ||
t.Errorf("Error parsing duration") | ||
} | ||
|
||
build.Status.StartTime.Time = metav1.Now().Time.Add(-addBuffer) | ||
// Set creation time to Now()-10min | ||
build.Status.CreationTime.Time = metav1.Now().Time.Add(buffer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously this was -addBuffer
, now it's buffer
, can you explain why this is?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It made more sense to call the variable as buffer. Ideally I want to subtract the buffer time to trigger timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also just realized that line is not required for test setup. Let me update.
- Set creation time to now()-buffer to trigger timeout when updating build status
I think we should filter out not found errors from |
@imjasonh : I have addressed your comments. Let me know if I missed anything. |
pkg/builder/cluster/builder.go
Outdated
@@ -21,16 +21,16 @@ import ( | |||
"fmt" | |||
"sync" | |||
|
|||
v1alpha1 "github.com/knative/build/pkg/apis/build/v1alpha1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we separate these first-party imports (anything from knative/build
) from the vendored imports from other packages?
@@ -92,8 +92,7 @@ func IsTimeout(status *v1alpha1.BuildStatus, buildTimeout string) bool { | |||
return false | |||
} | |||
} | |||
|
|||
return time.Since(status.StartTime.Time).Seconds() > timeout.Seconds() | |||
return time.Since(status.CreationTime.Time).Seconds() > timeout.Seconds() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Time spent scheduling the pod shouldn't count toward the build's timeout, should this consider status.StartTime
instead of status.CreationTime
?
Unless we were thinking this would be a useful way to kill a build that was unschedulable, in which case we might need to introduce two separate timeouts.
In GCB we only enforce timeouts once the build starts, since it's not the user's fault if it takes us a long time to start their build. We've considered enforcing a timeout on queued time, but I don't think any user has asked for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't enforce timeout for allocating resources then build will remain in pending
state forever. I was planning to follow up with a PR to introduce another timeout for allocating resources(queue timeout). Probably default this timeout to 10min and user doesn't need to worry about using this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, if that's on your radar for future changes that sounds good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created: #325 to follow up with changes discussed in this PR.
test/columns.txt
Outdated
@@ -1,2 +1,2 @@ | |||
NAME BUILDER TYPE STATUS START END | |||
.metadata.name .status.builder .status.conditions[*].state .status.conditions[*].status .status.startTime .status.completionTime | |||
NAME BUILDER TYPE STATUS CREATED START END |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to add columns to show startup duration and build execution duration? I don't know how powerful this template language is, if it's hard then don't worry about it, we can try to add it later.
Also, we can remove the BUILDER
columns now that there's only one builder implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the build has started but pod hasn't been scheduled then is it okay to have negative duration? Might be better off calculating that in status if we need that info.
Also, we can remove the BUILDER columns now that there's only one builder implementation.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should avoid a negative duration, since that would be pretty hard to interpret. I'm fine putting this off for some future change where we add startup and build duration to the status
and make it easy to display here. Don't worry about it for now.
- Re-arrange imports - Remove builder from custom columns txt
The following is the coverage report on pkg/.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ImJasonH, shashwathi The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/test pull-knative-build-integration-tests |
2 similar comments
/test pull-knative-build-integration-tests |
/test pull-knative-build-integration-tests |
* Rename startTime -> creationTime * add build actual start time to status * Add comments and fix tests * Refactor start time function * Fix error msgs * Update test setup - Set creation time to now()-buffer to trigger timeout when updating build status * Remove unnecessary test setup for timeout * Add example with bad node selector * Address comments - Re-arrange imports - Remove builder from custom columns txt
* Rename startTime -> creationTime * add build actual start time to status * Add comments and fix tests * Refactor start time function * Fix error msgs * Update test setup - Set creation time to now()-buffer to trigger timeout when updating build status * Remove unnecessary test setup for timeout * Add example with bad node selector * Address comments - Re-arrange imports - Remove builder from custom columns txt
Fixes #321
Proposed Changes