Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing build fix #2406

Merged

Conversation

mik-dass
Copy link
Contributor

What kind of PR is this?

/kind flake

What does does this PR do / why we need it:

Which issue(s) this PR fixes:

Fixes #1981

How to test changes / Special notes to the reviewer:

@mik-dass mik-dass added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. Required by Prow. label Nov 21, 2019
@openshift-ci-robot openshift-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. Required by Prow. flake Categorizes issue or PR as related to a flaky test. size/M labels Nov 21, 2019
@mik-dass
Copy link
Contributor Author

[odo]  ✗  waited 4m0s but couldn't find running pod matching selector: 'deploymentconfig=backend-app'

/retest

@mik-dass
Copy link
Contributor Author

Container setup exited with code 1, reason Error

/retest

@mik-dass
Copy link
Contributor Author

error: could not run steps: step integration-e2e-benchmark failed: template pod "integration-e2e-benchmark" failed: the pod ci-op-hgs9sfdi/integration-e2e-benchmark failed after 44m39s (failed containers: setup, test): ContainerFailed one or more containers exited

/retest

@mik-dass
Copy link
Contributor Author

mik-dass commented Nov 26, 2019

error: could not run steps: step [release:latest] failed: release "release-latest" failed: the pod ci-op-v48j03kz/release-latest failed after 47s (failed containers: release): ContainerFailed one or more containers exited

/retest

@mik-dass
Copy link
Contributor Author

mik-dass commented Dec 5, 2019

error: could not run steps: step integration-e2e-benchmark failed: template pod "integration-e2e-benchmark" failed: the pod ci-op-3q88t7ms/integration-e2e-benchmark failed after 11m46s (failed containers: setup, test): ContainerFailed one or more containers exited

/retest

@mik-dass
Copy link
Contributor Author

mik-dass commented Dec 9, 2019

/test v4.3-integration-e2e-benchmark

@mik-dass mik-dass changed the title [WIP] Testing build fix Testing build fix Dec 10, 2019
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. Required by Prow. label Dec 10, 2019
@mik-dass mik-dass removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. Required by Prow. label Dec 10, 2019
@@ -1682,23 +1683,36 @@ func (c *Client) WaitForBuildToFinish(buildName string) error {
return errors.Wrapf(err, "unable to watch build")
}
defer w.Stop()
timeout := time.After(5 * time.Minute)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be a constant? (high above in occlient.go)

@@ -1672,7 +1672,8 @@ func (c *Client) StartBuild(name string) (string, error) {
}

// WaitForBuildToFinish block and waits for build to finish. Returns error if build failed or was canceled.
func (c *Client) WaitForBuildToFinish(buildName string) error {
func (c *Client) WaitForBuildToFinish(buildName string, stdout io.Writer) error {
following := false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need more description on this, bit confused why we're setting following to false then true when going through the channel loop.

if err != nil {
return err
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of this above looks good!

@@ -1845,6 +1859,7 @@ func (c *Client) FollowBuildLog(buildName string, stdout io.Writer) error {
}

rd, err := c.buildClient.RESTClient().Get().
Timeout(5*time.Minute).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be a constant

@mik-dass
Copy link
Contributor Author

@cdrage Fixed

Copy link
Member

@dharmit dharmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH, I don't understand the code in this PR. 😞

func (c *Client) WaitForBuildToFinish(buildName string) error {
func (c *Client) WaitForBuildToFinish(buildName string, stdout io.Writer) error {
// following indicates if we have already setup the following logic
following := false
glog.V(4).Infof("Waiting for %s build to finish", buildName)

w, err := c.buildClient.Builds(c.Namespace).Watch(metav1.ListOptions{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is not touched in the proposed PR but can we add a comment that describes what this all about? At least I didn't understand a thing here. 😞

@@ -1682,23 +1685,37 @@ func (c *Client) WaitForBuildToFinish(buildName string) error {
return errors.Wrapf(err, "unable to watch build")
}
defer w.Stop()
timeout := time.After(OcBuildTimeout)
for {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't understand what's going on in this for loop either. 😞

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can help here a bit. So here first thing to observe is the select statement on a quick TLDR: select statements are like channel collectors - listen on multiple channels at once, more here https://tour.golang.org/concurrency/5.

now the Build().Watch() returns a channel which returns messages on as per the pod status. Now the change that has been done here is
we wait for the pod to start before we start following the logs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as for the need for the following, The w.ResultChan() can send multiple messages of type BuildPhaseRunning. But we dont have to follow logs multiple times, just the first time should be good enough. Hence this flag.
Think of it as a singleton

@mik-dass
Copy link
Contributor Author

@dharmit Updated with more comments

@mik-dass
Copy link
Contributor Author

[odo]  ✗  invalid configuration: [context was not found for specified context: tmeibqyxnu/api-ci-op-t9t1htld-00a90-origin-ci-int-aws-dev-rhcloud-com:6443/developer, cluster has no server defined]
[odo] Please login to your server: 

/retest

@girishramnani
Copy link
Contributor

/approve

@openshift-ci-robot
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: girishramnani

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. label Dec 27, 2019
@kadel
Copy link
Member

kadel commented Jan 31, 2020

/test all

@kadel
Copy link
Member

kadel commented Jan 31, 2020

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. Required by Prow. label Jan 31, 2020
@openshift-merge-robot openshift-merge-robot merged commit ec73cab into redhat-developer:master Jan 31, 2020
@mik-dass mik-dass deleted the build_timeout_fix branch February 3, 2020 06:08
@rm3l rm3l added the estimated-size/M (10-20) Rough sizing for Epics. About 1 sprint of work for one person label Jun 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. Required by Prow. estimated-size/M (10-20) Rough sizing for Epics. About 1 sprint of work for one person flake Categorizes issue or PR as related to a flaky test. lgtm Indicates that a PR is ready to be merged. Required by Prow.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Flake] Failed to update config to git component deployed
8 participants