Run os_must_gather only in case of failure #457

Merged

Contributor

@dasm dasm commented Aug 16, 2023

Run os_must_gather only when a failure occurred during the "run" step in Zuul.
This shortens the post-run job by about 450-600s and also saves disk space.
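
A minimal sketch of the idea, not the actual diff: the role is included only when a flag says the run step failed. The variable name `cifmw_run_failed` is hypothetical; this page does not show which condition the change really uses.

```yaml
# Sketch only: `cifmw_run_failed` is a hypothetical flag, not taken from this PR.
- name: Gather logs with os_must_gather only after a failed run
  ansible.builtin.include_role:
    name: os_must_gather
  when: cifmw_run_failed | default(false) | bool
```

With `default(false)`, the task is skipped whenever the flag is unset, so a passing run pays no log-gathering cost.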

As a pull request owner and reviewers, I checked that:

  • Appropriate testing is done and actually running
  • Appropriate documentation exists and/or is up-to-date:
    • README in the role
    • Content of the docs/source is reflecting the changes

@dasm dasm requested a review from pablintino August 16, 2023 21:07
@dasm dasm requested review from cjeanner and removed request for Sandeepyadav93 and psathyan August 16, 2023 21:07
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/548f96a749844e1fbb477dac6250fc2a

✔️ noop SUCCESS in 0s
✔️ ci-framework-crc-podified-galera-deployment SUCCESS in 37m 14s
ci-framework-crc-podified-edpm-baremetal FAILURE in 13m 39s
ci-framework-crc-podified-edpm-deployment FAILURE in 20m 45s
✔️ cifmw-end-to-end-nobuild-tagged SUCCESS in 35m 12s

ci/playbooks/edpm/run.yml (outdated, resolved)
ci/playbooks/edpm/run.yml (outdated, resolved)
ci_framework/playbooks/99-logs.yml (outdated, resolved)
@pablintino
Collaborator

Just asking, there is no zuul fact set that tells us the result of the job?

@cjeanner
Collaborator

Just asking, there is no zuul fact set that tells us the result of the job?

I'd rather avoid that; too Zuul-centric IMHO.
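
For context on the question: Zuul passes a `zuul_success` variable to post-run playbooks, meant to be used with the `bool` filter. A rough sketch of what relying on it could look like follows. It is not what this PR does, and it only works when the playbook runs under Zuul, which is the coupling objected to above.

```yaml
# Sketch only: depends on Zuul's `zuul_success`, which is set for post-run playbooks.
# Outside Zuul the variable is undefined, so default(true) makes the task skip.
- name: Gather logs only when the earlier playbooks failed
  ansible.builtin.include_role:
    name: os_must_gather
  when: not (zuul_success | default(true) | bool)
```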

@dasm dasm force-pushed the log_gathering branch 2 times, most recently from d40ca54 to 3d9e882 August 17, 2023 20:34
@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/7f905f3b8aea42908404287b302c0cdb

✔️ noop SUCCESS in 0s
✔️ ci-framework-crc-podified-galera-deployment SUCCESS in 36m 54s
ci-framework-crc-podified-edpm-baremetal FAILURE in 11m 28s
✔️ ci-framework-crc-podified-edpm-deployment SUCCESS in 52m 42s
✔️ cifmw-end-to-end-nobuild-tagged SUCCESS in 39m 22s

deploy-edpm.yml (resolved)
  rescue:
    - name: Get CRC logs if os_must_gather failed
      ansible.builtin.import_tasks: crc.yml
  always:
Contributor Author

IIUC, @cjeanner, you want to skip all log collection on a successful run.
I created a block around os_must_gather and crc.yml, together with edpm.yml. I hope this will save us disk space and execution time as well.

Collaborator

We probably don't need the always block for the edpm.yml inclusion.

But that's mostly cosmetic at this point. Should do the trick.

Contributor Author

If I didn't add the always block, it couldn't be "disabled" on a successful test run.

Collaborator

hmm yep, by adding the same when condition :).
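
To make the structure discussed in this thread concrete, here is a rough sketch of a block with rescue and always sections. The file names crc.yml and edpm.yml and the rescue task name come from the thread; the flag `cifmw_run_failed` and the other task names are hypothetical. As far as I know, a `when` set on the block is inherited by the tasks in its rescue and always sections too, which is why placing edpm.yml under always lets the same condition switch it off.

```yaml
# Sketch only: `cifmw_run_failed` is a hypothetical flag, not taken from this PR.
- name: Collect logs only when the run failed
  when: cifmw_run_failed | default(false) | bool
  block:
    - name: Run os_must_gather
      ansible.builtin.include_role:
        name: os_must_gather
  rescue:
    - name: Get CRC logs if os_must_gather failed
      ansible.builtin.import_tasks: crc.yml
  always:
    - name: Get EDPM logs
      ansible.builtin.import_tasks: edpm.yml
```

The alternative mentioned above would be a standalone edpm.yml task carrying the same `when` condition, with no always section.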

@raukadah
Contributor

recheck

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/f872fc24777c4069bb83497e11c8399d

✔️ noop SUCCESS in 0s
✔️ ci-framework-crc-podified-galera-deployment SUCCESS in 29m 30s
ci-framework-crc-podified-edpm-baremetal FAILURE in 1h 12m 25s
ci-framework-crc-podified-edpm-deployment FAILURE in 44m 37s
✔️ cifmw-doc SUCCESS in 2m 17s
✔️ cifmw-end-to-end-nobuild-tagged SUCCESS in 35m 26s
✔️ cifmw-molecule-artifacts SUCCESS in 5m 11s

@dasm
Contributor Author

dasm commented Aug 22, 2023

@cjeanner can you give me another round of review?

Collaborator

@cjeanner cjeanner left a comment

/approve

@openshift-ci
Contributor

openshift-ci bot commented Aug 23, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cjeanner

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@pablintino
Collaborator

/lgtm

@softwarefactory-project-zuul

Build failed (check pipeline). Post recheck (without leading slash)
to rerun all jobs. Make sure the failure cause has been resolved before
you rerun jobs.

https://review.rdoproject.org/zuul/buildset/85b52f2e5ed148e9bb3f85d9b67ac113

✔️ noop SUCCESS in 0s
✔️ ci-framework-crc-podified-galera-deployment SUCCESS in 38m 46s
ci-framework-crc-podified-edpm-baremetal FAILURE in 30m 35s
✔️ ci-framework-crc-podified-edpm-deployment SUCCESS in 1h 07m 21s
✔️ cifmw-doc SUCCESS in 1m 45s
✔️ cifmw-end-to-end-nobuild-tagged SUCCESS in 39m 57s
✔️ cifmw-molecule-artifacts SUCCESS in 4m 50s

@dasm
Contributor Author

dasm commented Aug 24, 2023

recheck

@openshift-merge-robot openshift-merge-robot merged commit 54d80e3 into openstack-k8s-operators:main Aug 24, 2023
2 checks passed
raukadah added a commit that referenced this pull request Aug 25, 2023
#457
enables running the os_must_gather role in case of failure but also
adds a rescue task to run crc and edpm log collection.

The os_must_gather role runs the must_gather tool to do the log
collection and, if must_gather fails, it also rescues the operation
to collect more logs on the node. The task associated with calling
the os_must_gather role is always skipped if the job
passes, so we have no log collection for edpm and podified.

Developers have no logs to verify or check a few things in passed
jobs.

This PR fixes that by running the crc task file for a passing
job and the os_must_gather role for a failed job. We should
always run the edpm task files to collect edpm logs.

Signed-off-by: Chandan Kumar <raukadah@gmail.com>
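
The behaviour this commit message describes could look roughly like the sketch below. It is an illustration under assumptions, not the commit's actual content: `cifmw_run_failed` is a hypothetical flag, and the file names crc.yml and edpm.yml are taken from the review thread above.

```yaml
# Sketch only: `cifmw_run_failed` is a hypothetical flag, not taken from the commit.
- name: Run os_must_gather when the job failed
  ansible.builtin.include_role:
    name: os_must_gather
  when: cifmw_run_failed | default(false) | bool

- name: Get CRC logs when the job passed
  ansible.builtin.import_tasks: crc.yml
  when: not (cifmw_run_failed | default(false) | bool)

# No condition here: EDPM logs are collected for passing and failing jobs alike.
- name: Always get EDPM logs
  ansible.builtin.import_tasks: edpm.yml
```

Keeping the EDPM task outside any conditional block is what guarantees logs for passing jobs.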
@raukadah raukadah mentioned this pull request Aug 25, 2023
1 task