WIP: system test parallelization: two-pass approach #23275

edsantiago · 2024-07-15T12:16:39Z

Split system tests into two: those that can be run in
parallel, and those that can't. Run tests in two passes.
This requires eliminating the per-test leak check and
teardown. I think that's okay.
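
For orientation, the two passes could look roughly like the sketch below, which only prints the commands instead of running them. It assumes tests are selected by a `ci:parallel` tag through bats's `--filter-tags` option and parallelized with `--jobs` (which relies on an external tool such as GNU parallel). The function name, job count, and test directory are placeholders, and the invocation actually used by this PR may differ.

```shell
# Hypothetical dry run (not the PR's actual runner): print the two bats
# invocations instead of executing them.
#   $1 = directory containing .bats files (placeholder)
#   $2 = number of parallel jobs (placeholder)
show_two_pass_plan() {
    plan_testdir="$1"
    plan_jobs="$2"
    # Pass 1: only tests tagged ci:parallel, run concurrently.
    echo "bats --jobs $plan_jobs --filter-tags ci:parallel $plan_testdir"
    # Pass 2: everything else, run serially ('!' negates a tag).
    echo "bats --filter-tags '!ci:parallel' $plan_testdir"
}
```

Calling `show_two_pass_plan test/system 4` prints one command per pass. Because the serial pass only starts after the parallel pass has finished, cleanup and leak checking would have to happen per pass rather than per test.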

Tests that can run in parallel:

- use unique container/pod/volume/network names
  - bonus: added a way to track names to their test,
    so the leak test at end can be useful
- do not run 'podman rm -a' or 'rmi -a'
- do not run 'podman ps/images' and expect precise output

Signed-off-by: Ed Santiago <santiago@redhat.com>
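
As a rough illustration of the unique-naming requirement, the sketch below builds a name from an object type, a test number, and a random suffix, so that a leaked object can be traced back to the test that created it. The function names `unique_name` and `test_of_name`, their arguments, and the name layout are assumptions made for this example (loosely modeled on names such as `m_t114-lgdlrt8i` seen in a log later in this thread). They are not taken from the PR's code.

```shell
# Hypothetical helper (not the PR's actual code): build an object name that
# embeds the number of the test that created it, plus a random suffix.
#   $1 = one-letter object type, e.g. c=container, p=pod, v=volume, n=network
#   $2 = test number
unique_name() {
    # 4 random bytes rendered as 8 hex digits; collisions are unlikely,
    # not impossible.
    un_suffix=$(od -An -N4 -tx1 /dev/urandom | tr -d ' \n')
    printf '%s_t%s-%s\n' "$1" "$2" "$un_suffix"
}

# Hypothetical reverse lookup: print the test number embedded in a name,
# or return 1 if the name does not follow the assumed layout.
test_of_name() {
    case "$1" in
        [a-z]_t[0-9]*-*) ;;      # crude shape check only
        *) return 1 ;;
    esac
    ton_rest=${1#?_t}                    # strip "<type>_t"
    printf '%s\n' "${ton_rest%%-*}"      # keep what precedes the first '-'
}
```

With names of this shape, a leak check that runs once at the end of a pass can report which test left an object behind, for example `test_of_name "$(unique_name c 114)"` prints `114`.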

None

openshift-ci · 2024-07-15T12:16:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: edsantiago

Needs approval from an approver in each of these files:

~~OWNERS~~ [edsantiago]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

edsantiago · 2024-07-15T12:17:49Z

THIS IS NOT EVEN CLOSE TO DREAMING ABOUT MERGING!

@Luap99 I think this approach holds promise. I would like to spend some time pursuing it. Before I do so, WDYT?

Luap99 · 2024-07-15T13:52:53Z

> @Luap99 I think this approach holds promise. I would like to spend some time pursuing it. Before I do so, WDYT?

Just some quick thoughts, will add more once I am back.

Syntax-wise this seems better, as there were many files I could not run in parallel because of one or two tests. Assuming this now runs things in parallel across files as well, it should utilize the CPU better.

However I do not see how this addresses the functional issues from my PR.
How are we going to debug flakes? There is nothing in the logs for timings, etc. It is practically impossible to correlate problematic test interactions (this can be done well in e2e tests, as we have a full log with timings).
Reviewing tests for possible conflicts will be hard and we will fail from time to time causing extra flakes.

I think there are nice gains here, but honestly I am no longer sure that the ongoing maintenance will not cause too much work for all maintainers.

edsantiago · 2024-07-16T19:06:49Z

Okay..... I'm really favorably impressed with this approach. The two-pass requirement sucks, and debugging failures is really hard, but I think the benefits (so far) are outweighing those negatives. Running lots and lots of different tests in parallel, not just from one file, is finding a lot of bugs.

CI is likely to fail because of #23282. This is still very much a WIP. My plan is to break out much of the safename work, commit that separately in individual reviewable PRs, in order to minimize the changes in this one.

packit-as-a-service · 2024-07-16T19:33:34Z

Cockpit tests failed for commit e79fca4. @martinpitt, @jelly, @mvollmer please check.

Luap99 · 2024-07-17T16:20:20Z

@edsantiago OK, let's do this then. I will try to fix all the related podman bugs that you reported in the next few days.

Luap99 · 2024-07-18T11:37:01Z

re tag name:
I would prefer parallel over para as this makes it more clear to readers. And I don't see a problem if the tag name is a bit longer.

edsantiago · 2024-07-18T11:46:01Z

Full name: my concern is typos. I know that we'll get occasional "parralel" or "parrallel" misspellings and those are hard to catch in review. I've been letting my brain think about this in the background and still haven't come up with any ideas.

The other consideration is a string that's easily greppable in source code and command-line history. ^Rpara (for rerunning tests) is pretty useless. Maybe ci:parallel and just try really hard to catch typos in review?

Luap99 · 2024-07-18T12:25:14Z

> Maybe ci:parallel and just try really hard to catch typos in review?

ci:parallel SGTM. Another reason for something like codespell to be part of the actual CI checks.
I am not too concerned about typos; it is not like they would break anything. Also, most people will likely copy the tag from another test and not really think about it too much anyway.
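
One way to catch misspelled tags mechanically, rather than in review, is a small lint step. The sketch below assumes tags are written on `# bats test_tags=` or `# bats file_tags=` comment lines, and flags any tag that starts with `ci:par` but is not exactly `ci:parallel`. The function name `lint_parallel_tags` and the matching rule are illustrative assumptions, not part of this PR, and misspellings that break the `ci:par` prefix would slip through.

```shell
# Hypothetical lint: print file:line for every bats tag line that carries a
# "ci:par..." tag other than the exact string "ci:parallel".
# Returns 1 if any suspicious tag is found, 0 otherwise.
lint_parallel_tags() {
    awk '
        /^[ \t]*#[ \t]*bats[ \t]+(test|file)_tags=/ {
            line = $0
            sub(/^[^=]*=/, "", line)       # keep only the tag list
            sub(/^[ \t]+/, "", line)       # trim leading whitespace
            sub(/[ \t\r]+$/, "", line)     # trim trailing whitespace
            n = split(line, tags, /[ \t]*,[ \t]*/)
            for (i = 1; i <= n; i++) {
                if (tags[i] ~ /^ci:par/ && tags[i] != "ci:parallel") {
                    printf "%s:%d: suspicious tag \"%s\"\n", FILENAME, FNR, tags[i]
                    bad = 1
                }
            }
        }
        END { exit (bad + 0) }
    ' "$@"
}
```

A CI job could call it as `lint_parallel_tags test/system/*.bats` and fail when the function returns nonzero.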

Luap99 · 2024-07-18T12:27:35Z

In general, it would be good to get some docs in test/system/README.md that describe how this parallel mode works and which tests can/cannot run in parallel (--all, --latest, output checks like empty podman ps output, etc.).

Luap99 · 2024-07-18T12:31:29Z

Also another flake I saw locally.

   [14:18:02.492492892] $ /home/pholzing/go/src/github.com/containers/podman/bin/podman __completeNoDesc  system connection remove arg
   [14:18:02.522931040] m_t114-lgdlrt8i
   m_t114-lgdlrt8i-root
   :4
   Completion ended with directive: ShellCompDirectiveNoFileComp
   #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
   #|     FAIL: Unexpected non-Debug output line: m_t114-lgdlrt8i
   #| expected: \[Debug\]
   #|   actual: m_t114-

I know what is wrong with that and will do another PR to fix that.

Luap99 · 2024-07-18T12:51:40Z

> Also another flake I saw locally.
>
>    [14:18:02.492492892] $ /home/pholzing/go/src/github.com/containers/podman/bin/podman __completeNoDesc  system connection remove arg
>    [14:18:02.522931040] m_t114-lgdlrt8i
>    m_t114-lgdlrt8i-root
>    :4
>    Completion ended with directive: ShellCompDirectiveNoFileComp
>    #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
>    #|     FAIL: Unexpected non-Debug output line: m_t114-lgdlrt8i
>    #| expected: \[Debug\]
>    #|   actual: m_t114-
>
> I know what is wrong with that and will do another PR to fix that.

Fix in #23326

packit-as-a-service · 2024-10-10T00:34:16Z

Cockpit tests failed for commit c313488. @martinpitt, @jelly, @mvollmer please check.

packit-as-a-service · 2024-10-10T01:19:43Z

Cockpit tests failed for commit 3b6a5c2. @martinpitt, @jelly, @mvollmer please check.

...try to trace them back to the culprit tests Signed-off-by: Ed Santiago <santiago@redhat.com>