Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Package-specific gating policies may not work correctly with critpath packages #4259

Closed
AdamWill opened this issue Sep 24, 2021 · 6 comments · Fixed by #4266
Closed

Package-specific gating policies may not work correctly with critpath packages #4259

AdamWill opened this issue Sep 24, 2021 · 6 comments · Fixed by #4266

Comments

@AdamWill
Copy link
Contributor

In https://pagure.io/fedora-ci/docs/pull-request/71#comment-157488 , @psss pointed out a problem with the approach I came up with to query different Greenwave "decision contexts" for critpath and non-critpath updates; it may cause problems with package-specific gating policies.

Based on how things previously worked, it's reasonable for a package-specific gating policy to specify the decision context "bodhi_update_push_testing" and expect the policy to always apply when an update containing the package is gated for testing (ditto for stable). But with my change, this is no longer true. If the update contains a critpath package, the package-specific gating policy will not be applied.

I checked Greenwave's API, and it does not allow for querying on multiple "decision contexts" in a single request. It also does not seem to allow for policies to 'include' other policies, which would be helpful here (I may submit this as a suggestion to Greenwave). So with current Greenwave behaviour, I can think of two fixes off the top of my head:

  1. Have Bodhi query Greenwave for both relevant contexts on critpath updates, and combine the results
  2. Require package-specific gating policies to specify both decision contexts

The first has the benefit of not requiring any changes or 'special knowledge' for package-specific gating policies. However, it would mean Bodhi would run more Greenwave queries for critpath updates, generating more load on Greenwave and possibly slowing down Bodhi performance. It also turns out to be a bit tricky to implement: specifically, it would require some re-engineering of how Update.get_test_gating_info() and the things that use it work (although, in fact, that method is already not really 'safe' for updates containing more subjects than the Greenwave limit).

The second means we don't actually have to change Bodhi's behaviour any further, but would require us to go through all existing package gating policies and adjust them for this, and update documentation to try and make sure future package-specific policies specify both contexts. It could be a bit more convenient if Greenwave supported wildcards for decision contexts as it does for product versions, but it currently does not.

I'm filing the issue as I'm not sure which approach will be best here overall. I don't want to put the work into 1) if we decide 2) would be the best bet. Or if someone can come up with a third approach I haven't thought of yet. So, what do people think? Thanks!

@AdamWill
Copy link
Contributor Author

@hluk for thoughts from a Greenwave perspective.

@psss
Copy link

psss commented Sep 27, 2021

From my point of view, the option 2 is not a good approach as it means that many (tens or even hundreds) package configurations would need to be updated now and also upon introducing any similar decision context change in the future. The gating configuration is already now quite user unfriendly, I'd say, so I would discourage to make the things even worse.

The essential point of the gating config for a component should be IMO specifying which test/plan (resultsdb-testcase) should be used for which gate (decision context). That's it. Anything else sounds to me as a duplication of information which should be stored somewhere else (e.g. whether or not the component is on the critical path).

In the bright future we imagine with @mvadkert and friends, enabling of the gating could be as simple as specifying the gate name in the tmt plan. This is how a simple plan could look like:

execute:
    script: ./run-the-test.sh
gate:
  - add-build-to-update

But these dreams will need some more time to implement. @mvadkert, I guess you might also want to share your thoughts here.

@thrix
Copy link

thrix commented Oct 4, 2021

from my POW the solution 1. is a more clear one to the user, so I agree we should aim for that if possible. Moreover, the current behaviour was introduced quite silently and it was not at all clear the gating will break for users after this change.

@AdamWill
Copy link
Contributor Author

AdamWill commented Oct 4, 2021

@thrix well, I did detail this approach in a devel@ thread which was quite widely read/discussed. Indeed I hadn't thought of this consequence of the approach, though, so it was not explicitly listed.

@AdamWill
Copy link
Contributor Author

AdamWill commented Oct 12, 2021

@psss for the record:

The essential point of the gating config for a component should be IMO specifying which test/plan (resultsdb-testcase) should be used for which gate (decision context). That's it. Anything else sounds to me as a duplication of information which should be stored somewhere else (e.g. whether or not the component is on the critical path).

This wouldn't really be "storing information" about that in the per-package gating config, it would just be acknowledging the possibility that an update including the package could be on the critical path. It's actually impossible to 'know' this at that level, because critpath status is "infectious" - an entire update is or is not critpath. If package foo is not in the critpath, an update containing only foo is not critical path, but if a build of foo happened to be in an update that also contained, say, kernel, that update would be critpath, and Bodhi would query on the critpath decision context...

Anyway, I got a bit more time to think about this today, and unfortunately my "include" idea wouldn't really work, because it's not a policy that would need to include another policy, it's a decision context that would need to include a decision context, and that just wouldn't make a lot of sense in the overall design here. The decision contexts aren't defined 'independently' anywhere you could store that kind of information, they are just strings in the policies, I just can't envisage a way to stuff this concept into the approach that would make sense.

I'll try and find a few hours to work on the 'have Bodhi query on both contexts' approach if I can.

@AdamWill
Copy link
Contributor Author

since several people have actually been bitten by this problem (and because I'm getting nowhere fast trying to fix bugs in KDE's package manager...) I'm kicking this one up the priority list today. I think I can do something that'll be OK for the 'query both contexts' approach in Bodhi, it'll need a single I think little-used API endpoint's return value to change. Just got to adjust all the tests, now.

AdamWill added a commit to AdamWill/bodhi that referenced this issue Oct 16, 2021
As discussed in fedora-infra#4259 and also reported in
https://pagure.io/fedora-ci/general/issue/263 , my previous
change to have Bodhi query on a different Greenwave 'decision
context' for critical path updates - allowing us to gate
critpath updates on openQA test results - caused a problem with
package-specific gating policies. The instructions and examples
for writing these only include the original decision contexts
(bodhi_update_push_testing and bodhi_update_push_stable), they
do not also recommend including the _critpath contexts, and
this is indeed how all real-world package-specific policies have
been written. This has resulted in package-specific policies
being effectively ignored when the package in question is a
part of a critical path update, because Bodhi is querying on
the _critpath context but the package-specific policy doesn't
include it.

To avoid this problem, this change makes Bodhi query greenwave
on *both* relevant contexts for critical path updates. The logic
for parsing the results doesn't need to change because we were
already coping with multiple queries for the case where an
update has too many subjects for a single query.

This does change the behaviour of one method -
Update.get_test_gating_info() - and one API endpoint -
/updates/{id}/get-test-results . These now both return lists
of Greenwave decision dicts, rather than a single Greenwave
decision dict. There's no way I can see to maintain behaviour
here, and the existing behaviour was already wrong for updates
with lots of packages (this method and endpoint should have
been adjusted when we initially implemented batched requests
for such updates, but they were missed). I don't think the method
or the endpoint were commonly used.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
AdamWill added a commit to AdamWill/bodhi that referenced this issue Oct 16, 2021
As discussed in fedora-infra#4259 and also reported in
https://pagure.io/fedora-ci/general/issue/263 , my previous
change to have Bodhi query on a different Greenwave 'decision
context' for critical path updates - allowing us to gate
critpath updates on openQA test results - caused a problem with
package-specific gating policies. The instructions and examples
for writing these only include the original decision contexts
(bodhi_update_push_testing and bodhi_update_push_stable), they
do not also recommend including the _critpath contexts, and
this is indeed how all real-world package-specific policies have
been written. This has resulted in package-specific policies
being effectively ignored when the package in question is a
part of a critical path update, because Bodhi is querying on
the _critpath context but the package-specific policy doesn't
include it.

To avoid this problem, this change makes Bodhi query greenwave
on *both* relevant contexts for critical path updates. The logic
for parsing the results doesn't need to change because we were
already coping with multiple queries for the case where an
update has too many subjects for a single query.

This does change the behaviour of one method -
Update.get_test_gating_info() - and one API endpoint -
/updates/{id}/get-test-results . These now both return lists
of Greenwave decision dicts, rather than a single Greenwave
decision dict. There's no way I can see to maintain behaviour
here, and the existing behaviour was already wrong for updates
with lots of packages (this method and endpoint should have
been adjusted when we initially implemented batched requests
for such updates, but they were missed). I don't think the method
or the endpoint were commonly used.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
@mergify mergify bot closed this as completed in #4266 Oct 18, 2021
mergify bot pushed a commit that referenced this issue Oct 18, 2021
As discussed in #4259 and also reported in
https://pagure.io/fedora-ci/general/issue/263 , my previous
change to have Bodhi query on a different Greenwave 'decision
context' for critical path updates - allowing us to gate
critpath updates on openQA test results - caused a problem with
package-specific gating policies. The instructions and examples
for writing these only include the original decision contexts
(bodhi_update_push_testing and bodhi_update_push_stable), they
do not also recommend including the _critpath contexts, and
this is indeed how all real-world package-specific policies have
been written. This has resulted in package-specific policies
being effectively ignored when the package in question is a
part of a critical path update, because Bodhi is querying on
the _critpath context but the package-specific policy doesn't
include it.

To avoid this problem, this change makes Bodhi query greenwave
on *both* relevant contexts for critical path updates. The logic
for parsing the results doesn't need to change because we were
already coping with multiple queries for the case where an
update has too many subjects for a single query.

This does change the behaviour of one method -
Update.get_test_gating_info() - and one API endpoint -
/updates/{id}/get-test-results . These now both return lists
of Greenwave decision dicts, rather than a single Greenwave
decision dict. There's no way I can see to maintain behaviour
here, and the existing behaviour was already wrong for updates
with lots of packages (this method and endpoint should have
been adjusted when we initially implemented batched requests
for such updates, but they were missed). I don't think the method
or the endpoint were commonly used.

Signed-off-by: Adam Williamson <awilliam@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants