Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[v5.2-rhel] Fix podman stop and podman run --rmi #24197

Conversation

TomSweeneyRedHat
Copy link
Member

@TomSweeneyRedHat TomSweeneyRedHat commented Oct 8, 2024

This started off as an attempt to make podman stop on a container started with --rm actually remove the container, instead of just cleaning it up and waiting for the cleanup process to finish the removal.

In the process, I realized that podman run --rmi was rather broken. It was only done as part of the Podman CLI, not the cleanup process (meaning it only worked with attached containers) and the way it was wired meant that I was fairly confident that it wouldn't work if I did a podman stop on an attached container run with --rmi. I rewired it to use the same mechanism that podman run --rm uses, so it should be a lot more durable now, and I also wired it into podman inspect so you can tell that a container will remove its image.

Tests have been added for the changes to podman run --rmi. No tests for stop on a run --rm container as that would be racy.

Fixes #22852
Fixes RHEL-39513

For this branch
Backport of: 458ba5a
Fixes: https://issues.redhat.com/browse/RHEL-61667

Does this PR introduce a user-facing change?

None

Copy link
Contributor

openshift-ci bot commented Oct 8, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: TomSweeneyRedHat

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 8, 2024
@github-actions github-actions bot added the kind/api-change Change to remote API; merits scrutiny label Oct 8, 2024
Copy link

Ephemeral COPR build failed. @containers/packit-build please check.

@giuseppe
Copy link
Member

giuseppe commented Oct 8, 2024

Could you add a reference to the commit that this is a backport of 458ba5a?

Have you used git cherry-pick to produce this patch? It would add the reference automatically if you specify -x.

@TomSweeneyRedHat
Copy link
Member Author

I did do a git cherry-pick 458ba5a8afedae15acfec0a1f64195f41d80edcf from the command line, but was unaware of the -x option. I will look into that and try to remember to use that going forward.

This started off as an attempt to make `podman stop` on a
container started with `--rm` actually remove the container,
instead of just cleaning it up and waiting for the cleanup
process to finish the removal.

In the process, I realized that `podman run --rmi` was rather
broken. It was only done as part of the Podman CLI, not the
cleanup process (meaning it only worked with attached containers)
and the way it was wired meant that I was fairly confident that
it wouldn't work if I did a `podman stop` on an attached
container run with `--rmi`. I rewired it to use the same
mechanism that `podman run --rm` uses, so it should be a lot more
durable now, and I also wired it into `podman inspect` so you can
tell that a container will remove its image.

Tests have been added for the changes to `podman run --rmi`. No
tests for `stop` on a `run --rm` container as that would be racy.

Fixes containers#22852
Fixes RHEL-39513

For this branch
Backport of: [458ba5a](containers@458ba5a)
Fixes: https://issues.redhat.com/browse/RHEL-61667

Signed-off-by: Matt Heon <mheon@redhat.com>
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
@TomSweeneyRedHat
Copy link
Member Author

@edsantiago or @mheon
I've reached the one hour of bashing my forehead. Do either of you know what's going south with the test? Snippet below. I'm not seeing a difference in the code changes between upstream and this cherry-pick. Especially given this is v5.2-rhel from main, I thought this would fly through.

[+0061s] not ok 60 [030] podman run --rmi in 2278ms
[+0061s] # (from function `bail-now' in file test/system/helpers.bash, line 184,
[+0061s] #  from function `die' in file test/system/helpers.bash, line 884,
[+0061s] #  from function `run_podman' in file test/system/helpers.bash, line 508,
[+0061s] #  in test file test/system/030-run.bats, line 207)
[+0061s] #   `run_podman run --rmi $NONLOCAL_IMAGE /bin/true' failed
[+0061s] #
[+0061s] # [12:31:11.655114770] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw image exists quay.io/libpod/testimage:00000004
[+0061s] # [12:31:11.698809195] [ rc=1 (expected) ]
[+0061s] #
[+0061s] # [12:31:11.727133002] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw run --name /c_t60-nmrszo3h quay.io/libpod/testimage:00000004 /bin/true
[+0061s] # [12:31:12.321681255] Trying to pull quay.io/libpod/testimage:00000004...
[+0061s] # Getting image source signatures
[+0061s] # Copying blob sha256:cecc78ee407586f0c9eec844d60b8d40698a35e8d3d59f11a45a6b19a21fb7f0
[+0061s] # Copying config sha256:560956ac186f909c45731174e2d2b62180b638aa44e16739d02f0fb39f7b4ffb
[+0061s] # Writing manifest to image destination
[+0061s] #
[+0061s] # [12:31:12.333349194] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw image exists quay.io/libpod/testimage:00000004
[+0061s] #
[+0061s] # [12:31:12.393528259] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw run --rmi --rm quay.io/libpod/testimage:00000004 /bin/true
[+0061s] # [12:31:12.893309398] time="2024-10-10T12:31:12-05:00" level=warning msg="image used by 0cf44d70159b738a005a2ae09c9baa6c388e05ad9aab3a54c1cb5a26e2616054: image is in use by a container: consider listing external containers and force-removing image"
[+0061s] #
[+0061s] # [12:31:12.919668571] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw image exists quay.io/libpod/testimage:00000004
[+0061s] #
[+0061s] # [12:31:12.971773221] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw rm /c_t60-nmrszo3h
[+0061s] # [12:31:13.026270551] c_t60-nmrszo3h
[+0061s] #
[+0061s] # [12:31:13.037488360] # /var/tmp/go/src/github.com/containers/podman/bin/podman-remote --url unix:///tmp/CI_gD8q/podman_tmp_XSKw run --rmi quay.io/libpod/testimage:00000004 /bin/true
[+0061s] # [12:31:13.508448488] time="2024-10-10T12:31:13-05:00" level=warning msg="quay.io/libpod/testimage:00000004: image not known"
[+0061s] # #/vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
[+0061s] # #| FAIL: Command succeeded, but issued unexpected warnings
[+0061s] # #\^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[+0061s] # # [teardown]

@edsantiago
Copy link
Member

It's only on remote. My first suspicion was that there's another commit missing, and 80639df looked like a possible candidate, but I can't cherrypick it and am running into diminishing returns.

@edsantiago
Copy link
Member

8e78028 silences the warning. I don't know if that's the right solution, but it's good enough for me.

Since commit 458ba5a the cleanup process now removes the image as
well, thus the removal is racy and it will cause an error here.

The code tried to ignore the error with errors.Is() but this never works
across the remote API. However the API already has a ignore option so
juts use that and fix the error message so that we can easily find the
root cause and I do not have to guess where the log was written.

Fixes containers#23719

Signed-off-by: Paul Holzinger <pholzing@redhat.com>
(cherry picked from commit 8e78028)
Signed-off-by: tomsweeneyredhat <tsweeney@redhat.com>
@TomSweeneyRedHat
Copy link
Member Author

Thx @edsantiago , I've added that commit and 🤞

@TomSweeneyRedHat
Copy link
Member Author

Great Call @edsantiago !

Happy green test buttons. This is for a RHEL 9.5 ZeroDay fix, can I get a few eyeballs on this please?

@mheon
Copy link
Member

mheon commented Oct 14, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 14, 2024
@openshift-merge-bot openshift-merge-bot bot merged commit c03b5f3 into containers:v5.2-rhel Oct 14, 2024
55 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/api-change Change to remote API; merits scrutiny lgtm Indicates that a PR is ready to be merged. release-note-none
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants