Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TRT-1702: record risk analysis results in BQ #28926

Merged
merged 2 commits into from
Jul 23, 2024

Conversation

sosiouxme
Copy link
Member

@sosiouxme sosiouxme commented Jul 8, 2024

This hooks into the existing RA process to write autodl artifacts resulting in both a log of any attempts to request RA (and how they went) in one table as well as another table with the results themselves from RA for later aggregated analysis.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 8, 2024
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jul 8, 2024

@sosiouxme: This pull request references TRT-1702 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Jul 8, 2024

@sosiouxme: This pull request references TRT-1702 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.17.0" version, but no target version was set.

In response to this:

This hooks into the existing RA process to write autodl artifacts resulting in both a log of any attempts to request RA (and how they went) in one table as well as another table with the results themselves from RA for later aggregated analysis.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from p0lyn0mial and soltysh July 8, 2024 12:59
@openshift-ci openshift-ci bot added the vendor-update Touching vendor dir or related files label Jul 8, 2024
endTime := time.Now()
duration := endTime.Sub(startTime)
logrus.Infof("Call to sippy finished after: %s", duration)
cancelFn() // cancel the context timeout we just used.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like you call this via defer on line 189, is it intentional to call it again?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, looks like the same call was previously at line 121.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's how it was when i started, just moved slightly. i wasn't sure i understood the context well enough to change. i gather it's to clean up the timeout, and if it hasn't already timed out, keep it from firing in the middle of the rest of the code (but what would happen if it did?). the defer does seem superfluous as there's no way it could make it through an iteration without calling cancelFn() but... calling it twice seems unlikely to cause problems?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep caught that after my first comment. It is how it was so let's leave it.

Rows: []map[string]string{},
}
dataFile.Rows = append(dataFile.Rows, map[string]string{
"TestName": "", // no test name means overall risk
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally it is probably all the same but wondering if we should explicitly say 'Overall' here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a matter of taste, as long as you're confident there will never be a test named "Overall". I was considering separating these out with another boolean column, or even in a separate table from regular test results. I am open to any of these. To me, if there's to be a special value that says "not an individual test", it makes sense for it to be the empty string.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up separating these into two tables after all, since they had different enriched content.

pkg/riskanalysis/cmd.go Outdated Show resolved Hide resolved
@neisw
Copy link
Contributor

neisw commented Jul 17, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jul 17, 2024
Copy link
Contributor

openshift-ci bot commented Jul 17, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neisw, sosiouxme

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 17, 2024
@sosiouxme sosiouxme added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Jul 17, 2024
@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD a6249cb and 2 for PR HEAD 7ee51c3 in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD fe2c855 and 1 for PR HEAD 7ee51c3 in total

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 780ab35 and 0 for PR HEAD 7ee51c3 in total

Copy link
Contributor

openshift-ci bot commented Jul 21, 2024

@sosiouxme: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-openstack-ovn 7ee51c3 link false /test e2e-openstack-ovn
ci/prow/e2e-aws-ovn-single-node-upgrade 7ee51c3 link false /test e2e-aws-ovn-single-node-upgrade
ci/prow/e2e-aws-ovn-ipsec-serial 7ee51c3 link false /test e2e-aws-ovn-ipsec-serial

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot
Copy link

/hold

Revision 7ee51c3 was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 21, 2024
@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 7ee51c3

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-ipsec-serial High
[sig-arch] events should not repeat pathologically for ns/openshift-authentication-operator
This test has passed 100.00% of 52 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-serial' 'periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-serial'] in the last 14 days.
---
operator conditions authentication
This test has passed 100.00% of 52 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-serial' 'periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-serial'] in the last 14 days.
---
[bz-Monitoring] clusteroperator/monitoring should not change condition/Available
This test has passed 100.00% of 52 runs on jobs ['periodic-ci-openshift-release-master-nightly-4.17-e2e-aws-ovn-serial' 'periodic-ci-openshift-release-master-ci-4.17-e2e-aws-ovn-serial'] in the last 14 days.

Open Bugs
monitoring ClusterOperator should not blip Available=Unknown on client rate limiter

@sosiouxme
Copy link
Member Author

not sure what the bot is smoking, since blocking tests all passed. looks like it's trying to retest but there are none required? anyway i don't see a need for the hold.

the RA looks scary but there's a lot going on and these can't possibly be related to my change.

/hold cancel

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 22, 2024
@sosiouxme
Copy link
Member Author

hm. yeah. the bots seem to have mixed ideas about what tests are required.
/test e2e-metal-ipi-ovn-ipv6

@openshift-ci-robot
Copy link

/retest-required

Remaining retests: 0 against base HEAD 90338d5 and 2 for PR HEAD 7ee51c3 in total

@openshift-merge-bot openshift-merge-bot bot merged commit 946048d into openshift:master Jul 23, 2024
21 of 24 checks passed
@openshift-bot
Copy link
Contributor

[ART PR BUILD NOTIFIER]

Distgit: openshift-enterprise-tests
This PR has been included in build openshift-enterprise-tests-container-v4.17.0-202407230841.p0.g946048d.assembly.stream.el9.
All builds following this will include this PR.

@sosiouxme sosiouxme deleted the 20240705-ra-recording branch July 23, 2024 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants