Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rest api calls ran immediately after events might lose data #10812

Closed
KKoukiou opened this issue Jun 29, 2021 · 12 comments · Fixed by #10823
Closed

rest api calls ran immediately after events might lose data #10812

KKoukiou opened this issue Jun 29, 2021 · 12 comments · Fixed by #10823
Assignees
Labels
In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@KKoukiou
Copy link
Contributor

KKoukiou commented Jun 29, 2021

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
A recent RHEL-9 test image refresh seems to break one test from cockpit-podman. Within that refresh podman changed:

  • from: 3.1.0-0.15.el9
  • to: 3.3.0-0.6.el9

The failure in question is happening with the following test scenario:

  • test pulls an image
  • UI should update with the new image

Since the update mentioned above the new image does not appear in the UI however. Unfortunately this is not a deterministic failure, it fails however quite persistently, like 80% of the time.

In cockpit-podman the REST API is used, and it listens to events, (in this case for the image pull) in order to fully re-fetch the list of images once this event is received.

From the systemctl status podman, I can see the REST API calls done after the pull event is received.
From there is clear that, the images listing happening directly after the 'pull' event is received (something like: curl -s --unix-socket /run/podman/podman.sock http://d/v1.12/libpod/images/json?) does not include the new image.

I tried the past 2hours to find a better CLI reproducer but I did not succeed.

Output of podman version:

3.3.0-0.6.el9

Do you have any idea what might have caused that race? I am happy to test any patches in our VM images, and I am also happy to show someone how to run cockpit-podman test locally, in order to reproduce this issue.

I took a look in the shortlog, this commit raised my attention:

commit 4f4a440afd521913745c314f2bebd0da45d9668e
Merge: 9cc609360 9822c3309
Author: OpenShift Merge Robot <openshift-merge-robot@users.noreply.github.com>
Date:   Thu May 27 19:42:04 2021 +0200

    Merge pull request #10489 from vrothberg/fix-data-race
    
    create libimage-events channel in main routine

Any ideas are welcome!

@jwhonce
Copy link
Member

jwhonce commented Jun 29, 2021

@vrothberg PTAL

@vrothberg
Copy link
Member

Thanks for reaching out. I wonder what 3.3.0-0.6.el9 is supposed to be? Podman 3.3.0 doesn't exist. It looks more like a package version.

@KKoukiou, can you paste the full output of podman version?

@vrothberg
Copy link
Member

Do I understand the issue correctly?

  • pull foo
  • receive image event
  • list images (no foo listed?)

@KKoukiou
Copy link
Contributor Author

For your interest I was able to reproduce from the CLI:

One one terminal tab I run:

[root@localhost ~]# stdbuf -oL curl -s -XGET --unix-socket /run/podman/podman.sock http://d/v1.12/libpod/events | xargs -n1 -d '\n' sh -ec 'curl -s -XGET --unix-socket /run/podman/podman.sock http://d/v1.12/libpod/images/json'

On another terminal tab I run:

[root@localhost ~]# curl -s -XPOST --unix-socket /run/podman/podman.sock http://d/v1.12/libpod/images/pull?reference=localhost:5000/my-busybox
{"stream":"Trying to pull localhost:5000/my-busybox:latest...\n"}
{"stream":"Getting image source signatures\n"}
{"stream":"Copying blob sha256:70ce06140e31e6a661d10c1857bd19494627a430a4424936aff6c8ebc4e86be8\n"}
{"stream":"Copying config sha256:f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f\n"}
{"stream":"Writing manifest to image destination\n"}
{"stream":"Storing signatures\n"}
{"images":["f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f"],"id":"f0b02e9d092d905d0d87a8455a1ae3e9bb47b4aa3dc125125ca5cd10d6441c9f"}

Since the image pull will generate a pull event the image list handler from the event listening command will run. The output it generated is:

[{"Id":"961769676411f082461f9ef46626dd7a2d1e2b2a38e6a44364bcbecf51e66dd4","ParentId":"","RepoTags":["quay.io/libpod/alpine:latest"],"RepoDigests":["sha256:fa93b01658e3a5a1686dc3ae55f170d8de487006fb53a28efcd12ab0710a2e5f","sha256:634a8f35b5f16dcf4aaa0822adc0b1964bb786fca12f6831de8ddc45e5986a00"],"Created":1566332395,"Size":5847966,"SharedSize":0,"VirtualSize":5847966,"Labels":null,"Containers":0,"Names":["quay.io/libpod/alpine:latest"],"Digest":"sha256:fa93b01658e3a5a1686dc3ae55f170d8de487006fb53a28efcd12ab0710a2e5f","History":["quay.io/libpod/alpine:latest"]},{"Id":"2d4f4b5309b1e41b4f83ae59b44df6d673ef44433c734b14c1c103ebca82c116","ParentId":"","RepoTags":["quay.io/cockpit/registry:2"],"RepoDigests":["sha256:eb072440e6939e2982739a7b64c16f6b37f2caf52cd39db6bd504d8f7505cbda"],"Created":1592504405,"Size":26788530,"SharedSize":0,"VirtualSize":26788530,"Labels":null,"Containers":2,"Names":["quay.io/cockpit/registry:2"],"Digest":"sha256:eb072440e6939e2982739a7b64c16f6b37f2caf52cd39db6bd504d8f7505cbda","History":["quay.io/cockpit/registry:2"]}]

Not that the newly generated image from image pull it not part of the images listed!
If I run it again of course the image appears in the output.

@KKoukiou
Copy link
Contributor Author

Do I understand the issue correctly?

* pull foo

* receive image event

* list images (no foo listed?)

Exactly, see the CLI reproducer above.

@KKoukiou
Copy link
Contributor Author

Thanks for reaching out. I wonder what 3.3.0-0.6.el9 is supposed to be? Podman 3.3.0 doesn't exist. It looks more like a package version.

@KKoukiou, can you paste the full output of podman version?

I was also suprised since I did not see such tag:
Here is the output:

podman version 3.3.0-dev

@KKoukiou
Copy link
Contributor Author

KKoukiou commented Jun 29, 2021

Thanks for reaching out. I wonder what 3.3.0-0.6.el9 is supposed to be? Podman 3.3.0 doesn't exist. It looks more like a package version.

@KKoukiou, can you paste the full output of podman version?

According to the changelog it is:

* Tue Jun 15 2021 Jindrich Novy <jnovy@redhat.com> - 3.3.0-0.6
- update to the latest content of https://github.com/containers/podman/tree/master
  (https://github.com/containers/podman/commit/e2f51ee)

https://gitlab.com/redhat/centos-stream/rpms/podman/-/blob/c9s/podman.spec#L406

@vrothberg
Copy link
Member

Thanks, @KKoukiou. Could the issue be caused by commit 86610c7? Could try reproducing after a git reset --hard 86610c7852ee~?

@vrothberg
Copy link
Member

Did the cockpit tests pass with Podman v3.2?

@KKoukiou
Copy link
Contributor Author

Did the cockpit tests pass with Podman v3.2?

There are no podman-3-2.0-0.*el9 builds available on brew so I can't verify that. If you want to me to check a specific rpm feel free to send me a link, I will on #podman IRC channel on libera today (kkoukiou nick)

@vrothberg
Copy link
Member

Okay, I see the problem now (I caused it). I will wrap up a fix.

vrothberg added a commit to vrothberg/common that referenced this issue Jun 30, 2021
Some users rely on events being written *after* the operation ran.
Hence, defer all event writes.

Context: containers/podman/issues/10812
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@vrothberg
Copy link
Member

containers/common#655

@vrothberg vrothberg self-assigned this Jun 30, 2021
@vrothberg vrothberg added the In Progress This issue is actively being worked by the assignee, please do not work on this at this time. label Jun 30, 2021
vrothberg added a commit to vrothberg/libpod that referenced this issue Jun 30, 2021
Make sure that image events are written *after* execution.

Fixes: containers#10812
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/libpod that referenced this issue Jun 30, 2021
Make sure that image events are written *after* execution.

Fixes: containers#10812
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
vrothberg added a commit to vrothberg/common that referenced this issue Jul 1, 2021
Some users rely on events being written *after* the operation ran.
Hence, defer all event writes.

Context: containers/podman/issues/10812
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
rugk pushed a commit to rugk/podman that referenced this issue Jul 9, 2021
Make sure that image events are written *after* execution.

Fixes: containers#10812
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 21, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 21, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
In Progress This issue is actively being worked by the assignee, please do not work on this at this time. kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants