Package-specific gating policies may not work correctly with critpath packages #4259

AdamWill · 2021-09-24T19:49:19Z

In https://pagure.io/fedora-ci/docs/pull-request/71#comment-157488 , @psss pointed out a problem with the approach I came up with to query different Greenwave "decision contexts" for critpath and non-critpath updates; it may cause problems with package-specific gating policies.

Based on how things previously worked, it's reasonable for a package-specific gating policy to specify the decision context "bodhi_update_push_testing" and expect the policy to always apply when an update containing the package is gated for testing (ditto for stable). But with my change, this is no longer true. If the update contains a critpath package, the package-specific gating policy will not be applied.

I checked Greenwave's API, and it does not allow for querying on multiple "decision contexts" in a single request. It also does not seem to allow for policies to 'include' other policies, which would be helpful here (I may submit this as a suggestion to Greenwave). So with current Greenwave behaviour, I can think of two fixes off the top of my head:

Have Bodhi query Greenwave for both relevant contexts on critpath updates, and combine the results
Require package-specific gating policies to specify both decision contexts

The first has the benefit of not requiring any changes or 'special knowledge' for package-specific gating policies. However, it would mean Bodhi would run more Greenwave queries for critpath updates, generating more load on Greenwave and possibly slowing down Bodhi performance. It also turns out to be a bit tricky to implement: specifically, it would require some re-engineering of how Update.get_test_gating_info() and the things that use it work (although, in fact, that method is already not really 'safe' for updates containing more subjects than the Greenwave limit).

The second means we don't actually have to change Bodhi's behaviour any further, but would require us to go through all existing package gating policies and adjust them for this, and update documentation to try and make sure future package-specific policies specify both contexts. It could be a bit more convenient if Greenwave supported wildcards for decision contexts as it does for product versions, but it currently does not.

I'm filing the issue as I'm not sure which approach will be best here overall. I don't want to put the work into 1) if we decide 2) would be the best bet. Or if someone can come up with a third approach I haven't thought of yet. So, what do people think? Thanks!

The text was updated successfully, but these errors were encountered:

AdamWill · 2021-09-24T19:50:42Z

@hluk for thoughts from a Greenwave perspective.

psss · 2021-09-27T08:54:41Z

From my point of view, the option 2 is not a good approach as it means that many (tens or even hundreds) package configurations would need to be updated now and also upon introducing any similar decision context change in the future. The gating configuration is already now quite user unfriendly, I'd say, so I would discourage to make the things even worse.

The essential point of the gating config for a component should be IMO specifying which test/plan (resultsdb-testcase) should be used for which gate (decision context). That's it. Anything else sounds to me as a duplication of information which should be stored somewhere else (e.g. whether or not the component is on the critical path).

In the bright future we imagine with @mvadkert and friends, enabling of the gating could be as simple as specifying the gate name in the tmt plan. This is how a simple plan could look like:

execute:
    script: ./run-the-test.sh
gate:
  - add-build-to-update

But these dreams will need some more time to implement. @mvadkert, I guess you might also want to share your thoughts here.

thrix · 2021-10-04T10:03:00Z

from my POW the solution 1. is a more clear one to the user, so I agree we should aim for that if possible. Moreover, the current behaviour was introduced quite silently and it was not at all clear the gating will break for users after this change.

AdamWill · 2021-10-04T15:44:08Z

@thrix well, I did detail this approach in a devel@ thread which was quite widely read/discussed. Indeed I hadn't thought of this consequence of the approach, though, so it was not explicitly listed.

AdamWill · 2021-10-12T21:25:53Z

@psss for the record:

The essential point of the gating config for a component should be IMO specifying which test/plan (resultsdb-testcase) should be used for which gate (decision context). That's it. Anything else sounds to me as a duplication of information which should be stored somewhere else (e.g. whether or not the component is on the critical path).

This wouldn't really be "storing information" about that in the per-package gating config, it would just be acknowledging the possibility that an update including the package could be on the critical path. It's actually impossible to 'know' this at that level, because critpath status is "infectious" - an entire update is or is not critpath. If package foo is not in the critpath, an update containing only foo is not critical path, but if a build of foo happened to be in an update that also contained, say, kernel, that update would be critpath, and Bodhi would query on the critpath decision context...

Anyway, I got a bit more time to think about this today, and unfortunately my "include" idea wouldn't really work, because it's not a policy that would need to include another policy, it's a decision context that would need to include a decision context, and that just wouldn't make a lot of sense in the overall design here. The decision contexts aren't defined 'independently' anywhere you could store that kind of information, they are just strings in the policies, I just can't envisage a way to stuff this concept into the approach that would make sense.

I'll try and find a few hours to work on the 'have Bodhi query on both contexts' approach if I can.

AdamWill · 2021-10-15T17:14:02Z

since several people have actually been bitten by this problem (and because I'm getting nowhere fast trying to fix bugs in KDE's package manager...) I'm kicking this one up the priority list today. I think I can do something that'll be OK for the 'query both contexts' approach in Bodhi, it'll need a single I think little-used API endpoint's return value to change. Just got to adjust all the tests, now.

As discussed in fedora-infra#4259 and also reported in https://pagure.io/fedora-ci/general/issue/263 , my previous change to have Bodhi query on a different Greenwave 'decision context' for critical path updates - allowing us to gate critpath updates on openQA test results - caused a problem with package-specific gating policies. The instructions and examples for writing these only include the original decision contexts (bodhi_update_push_testing and bodhi_update_push_stable), they do not also recommend including the _critpath contexts, and this is indeed how all real-world package-specific policies have been written. This has resulted in package-specific policies being effectively ignored when the package in question is a part of a critical path update, because Bodhi is querying on the _critpath context but the package-specific policy doesn't include it. To avoid this problem, this change makes Bodhi query greenwave on *both* relevant contexts for critical path updates. The logic for parsing the results doesn't need to change because we were already coping with multiple queries for the case where an update has too many subjects for a single query. This does change the behaviour of one method - Update.get_test_gating_info() - and one API endpoint - /updates/{id}/get-test-results . These now both return lists of Greenwave decision dicts, rather than a single Greenwave decision dict. There's no way I can see to maintain behaviour here, and the existing behaviour was already wrong for updates with lots of packages (this method and endpoint should have been adjusted when we initially implemented batched requests for such updates, but they were missed). I don't think the method or the endpoint were commonly used. Signed-off-by: Adam Williamson <awilliam@redhat.com>

As discussed in #4259 and also reported in https://pagure.io/fedora-ci/general/issue/263 , my previous change to have Bodhi query on a different Greenwave 'decision context' for critical path updates - allowing us to gate critpath updates on openQA test results - caused a problem with package-specific gating policies. The instructions and examples for writing these only include the original decision contexts (bodhi_update_push_testing and bodhi_update_push_stable), they do not also recommend including the _critpath contexts, and this is indeed how all real-world package-specific policies have been written. This has resulted in package-specific policies being effectively ignored when the package in question is a part of a critical path update, because Bodhi is querying on the _critpath context but the package-specific policy doesn't include it. To avoid this problem, this change makes Bodhi query greenwave on *both* relevant contexts for critical path updates. The logic for parsing the results doesn't need to change because we were already coping with multiple queries for the case where an update has too many subjects for a single query. This does change the behaviour of one method - Update.get_test_gating_info() - and one API endpoint - /updates/{id}/get-test-results . These now both return lists of Greenwave decision dicts, rather than a single Greenwave decision dict. There's no way I can see to maintain behaviour here, and the existing behaviour was already wrong for updates with lots of packages (this method and endpoint should have been adjusted when we initially implemented batched requests for such updates, but they were missed). I don't think the method or the endpoint were commonly used. Signed-off-by: Adam Williamson <awilliam@redhat.com>

AdamWill mentioned this issue Oct 16, 2021

Greenwave: query both contexts for critpath updates (#4259) #4266

Merged

mattiaverga linked a pull request Oct 16, 2021 that will close this issue

Greenwave: query both contexts for critpath updates (#4259) #4266

Merged

mergify bot closed this as completed in #4266 Oct 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Package-specific gating policies may not work correctly with critpath packages #4259

Package-specific gating policies may not work correctly with critpath packages #4259

AdamWill commented Sep 24, 2021

AdamWill commented Sep 24, 2021

psss commented Sep 27, 2021

thrix commented Oct 4, 2021

AdamWill commented Oct 4, 2021

AdamWill commented Oct 12, 2021 •

edited

Loading

AdamWill commented Oct 15, 2021

Package-specific gating policies may not work correctly with critpath packages #4259

Package-specific gating policies may not work correctly with critpath packages #4259

Comments

AdamWill commented Sep 24, 2021

AdamWill commented Sep 24, 2021

psss commented Sep 27, 2021

thrix commented Oct 4, 2021

AdamWill commented Oct 4, 2021

AdamWill commented Oct 12, 2021 • edited Loading

AdamWill commented Oct 15, 2021

AdamWill commented Oct 12, 2021 •

edited

Loading