Skip to content
This repository has been archived by the owner on Sep 5, 2019. It is now read-only.

Add pod start time #322

Merged
merged 9 commits into from
Sep 4, 2018
Merged

Conversation

shashwathi
Copy link
Contributor

Fixes #321

Proposed Changes

  • Rename current status.StartTime to status.CreationTime
  • Add pod start time as status.StartTime

@shashwathi
Copy link
Contributor Author

/assign @imjasonh

@shashwathi
Copy link
Contributor Author

I really want to add this build example as part of this PR as it shows startTime set to null and creationTime set to build created time.

Ideally the following example should fail because pods are never created and build times out after 3 minutes. The problem is that as the build times out, pods are terminated and pod termination is unsuccessful (because pod does not exist) and build status is not updated to failed which causes build to remain in Unknown state.

apiVersion: build.knative.dev/v1alpha1
kind: Build
metadata:
  name: test-build-with-badselector
  labels:
    expect: failed
spec:
  timeout: 3m
  steps:
  - name: step1
    image: ubuntu
    command: ["/bin/bash"]
    args: ["-c", "sleep 15"]
  nodeSelector:
    disk: "fake-ssd" 

One way to approach this problem is instead of just bubbling of all pod termination errors, it might be better to filter out Not Found errors. Another easy approach would be to update build status before calling Terminate() function(not a fan of this idea).

thoughts ? ideas?
@imjasonh @pivotal-nader-ziada @dprotaso

if bs.StartTime.IsZero() {
t.Errorf("bs.StartTime; want non-zero, got %v", bs.StartTime)
t.Errorf("status.StartTime; want non-zero, got %v", bs.StartTime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you update the other error messages for consistency? Or return this to bs.

if err != nil {
t.Errorf("Error parsing duration")
}

build.Status.StartTime.Time = metav1.Now().Time.Add(-addBuffer)
// Set creation time to Now()-10min
build.Status.CreationTime.Time = metav1.Now().Time.Add(buffer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously this was -addBuffer, now it's buffer, can you explain why this is?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It made more sense to call the variable as buffer. Ideally I want to subtract the buffer time to trigger timeout.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also just realized that line is not required for test setup. Let me update.

Shash Reddy added 2 commits August 29, 2018 17:17
- Set creation time to now()-buffer to trigger timeout when updating build status
@imjasonh
Copy link
Member

I think we should filter out not found errors from Terminate, which I imagine could happen for other reasons too.

@shashwathi
Copy link
Contributor Author

shashwathi commented Sep 4, 2018

@imjasonh : I have addressed your comments. Let me know if I missed anything.

@@ -21,16 +21,16 @@ import (
"fmt"
"sync"

v1alpha1 "github.com/knative/build/pkg/apis/build/v1alpha1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we separate these first-party imports (anything from knative/build) from the vendored imports from other packages?

@@ -92,8 +92,7 @@ func IsTimeout(status *v1alpha1.BuildStatus, buildTimeout string) bool {
return false
}
}

return time.Since(status.StartTime.Time).Seconds() > timeout.Seconds()
return time.Since(status.CreationTime.Time).Seconds() > timeout.Seconds()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Time spent scheduling the pod shouldn't count toward the build's timeout, should this consider status.StartTime instead of status.CreationTime ?

Unless we were thinking this would be a useful way to kill a build that was unschedulable, in which case we might need to introduce two separate timeouts.

In GCB we only enforce timeouts once the build starts, since it's not the user's fault if it takes us a long time to start their build. We've considered enforcing a timeout on queued time, but I don't think any user has asked for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't enforce timeout for allocating resources then build will remain in pending state forever. I was planning to follow up with a PR to introduce another timeout for allocating resources(queue timeout). Probably default this timeout to 10min and user doesn't need to worry about using this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, if that's on your radar for future changes that sounds good to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created: #325 to follow up with changes discussed in this PR.

test/columns.txt Outdated
@@ -1,2 +1,2 @@
NAME BUILDER TYPE STATUS START END
.metadata.name .status.builder .status.conditions[*].state .status.conditions[*].status .status.startTime .status.completionTime
NAME BUILDER TYPE STATUS CREATED START END
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add columns to show startup duration and build execution duration? I don't know how powerful this template language is, if it's hard then don't worry about it, we can try to add it later.

Also, we can remove the BUILDER columns now that there's only one builder implementation.

Copy link
Contributor Author

@shashwathi shashwathi Sep 4, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the build has started but pod hasn't been scheduled then is it okay to have negative duration? Might be better off calculating that in status if we need that info.

Also, we can remove the BUILDER columns now that there's only one builder implementation.

Done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should avoid a negative duration, since that would be pretty hard to interpret. I'm fine putting this off for some future change where we add startup and build duration to the status and make it easy to display here. Don't worry about it for now.

- Re-arrange imports
- Remove builder from custom columns txt
@knative-metrics-robot
Copy link

The following is the coverage report on pkg/.
Say /test pull-knative-build-go-coverage to run the coverage report again

File Old Coverage New Coverage Delta
pkg/builder/cluster/builder.go 78.6% 78.4% -0.3
pkg/builder/nop/builder.go 79.2% 80.0% 0.8

Copy link
Member

@imjasonh imjasonh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@knative-prow-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ImJasonH, shashwathi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@shashwathi
Copy link
Contributor Author

/test pull-knative-build-integration-tests

2 similar comments
@shashwathi
Copy link
Contributor Author

/test pull-knative-build-integration-tests

@shashwathi
Copy link
Contributor Author

/test pull-knative-build-integration-tests

@knative-prow-robot knative-prow-robot merged commit 66a754c into knative:master Sep 4, 2018
@shashwathi shashwathi deleted the add-pod-start-time branch September 4, 2018 21:16
vdemeester pushed a commit to vdemeester/knative-build that referenced this pull request Apr 3, 2019
* Rename startTime -> creationTime

* add build actual start time to status

* Add comments and fix tests

* Refactor start time function

* Fix error msgs

* Update test setup

- Set creation time to now()-buffer to trigger timeout when updating build status

* Remove unnecessary test setup for timeout

* Add example with bad node selector

* Address comments

- Re-arrange imports
- Remove builder from custom columns txt
vdemeester pushed a commit to vdemeester/knative-build that referenced this pull request Apr 3, 2019
* Rename startTime -> creationTime

* add build actual start time to status

* Add comments and fix tests

* Refactor start time function

* Fix error msgs

* Update test setup

- Set creation time to now()-buffer to trigger timeout when updating build status

* Remove unnecessary test setup for timeout

* Add example with bad node selector

* Address comments

- Re-arrange imports
- Remove builder from custom columns txt
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants