Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: conditional_mark only considers the longest match in the yaml file even when skipping. #14698

Open
arista-nwolfe opened this issue Sep 23, 2024 · 8 comments
Assignees

Comments

@arista-nwolfe
Copy link
Contributor

Issue Description

As part of #11968 there was a fix in the function find_longest_matches to now correctly find only the longest matching entry in tests_mark_conditions.yaml.

However, this bugfix now exposes the issue that multiple length'd skip conditions only result in the longest skip occurring and causing tests to be run on undesired SKUs.

Results you see

E.G. looking specifically at generic_config_updater/test_ecn_config_update.py::test_ecn_config_updates

This test has 2 matching entries in tests_mark_conditions.yaml:

generic_config_updater:
  skip:
    reason: 'generic_config_updater is not a supported feature for T2'
    conditions:
      - "'t2' in topo_name"
generic_config_updater/test_ecn_config_update.py::test_ecn_config_updates:
  skip:
    reason: "This test is not run on this asic type, topology, or version currently"
    conditions_logical_operator: "OR"
    conditions:
      - "asic_type in ['cisco-8000']"
      - "topo_type in ['m0', 'mx']"
      - "release in ['202211']"

But now because of the fix we'll only use the latter rule because it's longer.
Meaning we no longer skip this test on T2 devices even though none of the tests in this generic_config_updater folder should run on T2.

In order to support this with the new "longest match wins" approach we'd need to replicate the 't2' in topo_name on every generic_config_updater rule.

Results you expected to see

generic_config_updater/test_ecn_config_update.py::test_ecn_config_updates should be skipped on T2 topologies.

Is it platform specific

generic

Relevant log output

No response

Output of show version

No response

Attach files (if any)

No response

@kenneth-arista
Copy link
Contributor

@arlakshm

@yutongzhang-microsoft yutongzhang-microsoft self-assigned this Sep 24, 2024
@yutongzhang-microsoft
Copy link
Contributor

Hi, @arista-nwolfe , this behavior is intentional and follows the longest matching rule. Our approach is as follows:

  1. Collect all marks along the matching path.
  2. If a mark is unique among the matches, it is applied.
  3. If not, the conditions from the longest matching path take precedence.

@arista-nwolfe
Copy link
Contributor Author

Hi, @arista-nwolfe , this behavior is intentional and follows the longest matching rule. Our approach is as follows:

  1. Collect all marks along the matching path.
  2. If a mark is unique among the matches, it is applied.
  3. If not, the conditions from the longest matching path take precedence.

I agree that this was a bug that was fixed. The problem is all the rules in the tests_mark_conditions.yaml were made with the assumption that this bug exists and the behavior is we consider all rules for a given test (not just the longest one).
So fixing this bug without fixing all the rules to no longer make this assumption will result in many tests incorrectly running on platforms/topologies which they shouldn't be running on.

@yutongzhang-microsoft
Copy link
Contributor

Hi, @arista-nwolfe , this behavior is intentional and follows the longest matching rule. Our approach is as follows:

  1. Collect all marks along the matching path.
  2. If a mark is unique among the matches, it is applied.
  3. If not, the conditions from the longest matching path take precedence.

I agree that this was a bug that was fixed. The problem is all the rules in the tests_mark_conditions.yaml were made with the assumption that this bug exists and the behavior is we consider all rules for a given test (not just the longest one). So fixing this bug without fixing all the rules to no longer make this assumption will result in many tests incorrectly running on platforms/topologies which they shouldn't be running on.

No, this is not a bug.

@kenneth-arista
Copy link
Contributor

The intent of #11968 was not to introduce bugs. However, there is now a negative interaction between the change and the existing behavior of conditional rules. On one hand a bug was fixed, but on the other hand it broke existing behavior that was built on top buggy code.

To make progress, we need to also update tests_mark_conditions.yaml before introducing incompatible behavior.

@arlakshm
Copy link
Contributor

@wangxin, @yxieca for viz... can we discuss in test subgroup on what the next steps to resolve this issue

@yutongzhang-microsoft
Copy link
Contributor

Hi, @arista-nwolfe, @kenneth-arista, sorry for the inconvenience.

There was an incorrect behavior in finding the matching entries, deviating from our intended logic before. And in #11968, Arista fixed this issue, which align with our original expectations.

However, this bugfix will cause some inconsistency behavior with previous logic(unexpected because of the bug). To keep consistency, I raised PR #14912 and it will be cherry picked into all branches so the behaviour will be align with previous.

And also, we have the new logic of conditional mark #14395, #14515, we will also cherry pick these two PRs into all branches. After all PRs being cherry picked, the behavior will align with our expectations and previous.

@arista-nwolfe
Copy link
Contributor Author

arista-nwolfe commented Oct 10, 2024

Hi, @arista-nwolfe, @kenneth-arista, sorry for the inconvenience.

There was an incorrect behavior in finding the matching entries, deviating from our intended logic before. And in #11968, Arista fixed this issue, which align with our original expectations.

However, this bugfix will cause some inconsistency behavior with previous logic(unexpected because of the bug). To keep consistency, I raised PR #14912 and it will be cherry picked into all branches so the behaviour will be align with previous.

And also, we have the new logic of conditional mark #14395, #14515, we will also cherry pick these two PRs into all branches. After all PRs being cherry picked, the behavior will align with our expectations and previous.

Sorry @yutongzhang-microsoft I didn't see this in time, I left a comment in the submitted review: #14395 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: No status
Development

No branches or pull requests

4 participants