Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

APIv2: Events streaming behaves weirdly with socket activation #5599

Closed
marusak opened this issue Mar 24, 2020 · 4 comments · Fixed by #5777
Closed

APIv2: Events streaming behaves weirdly with socket activation #5599

marusak opened this issue Mar 24, 2020 · 4 comments · Fixed by #5777
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.

Comments

@marusak
Copy link
Contributor

marusak commented Mar 24, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind bug

Description
This is collection of observations. I want to point out some finding. Also it does not work greatly with cockpit http client and I am not sure yet if this is problem on cockpit or podman side. I hope that we actually may figure out what is happening and see, what side needs fixing.

I am using current master build. Related #5158

  1. Enable and start podman.socket (systemctl enable --now podman.socket)
  2. Notice that podman.service is inactive (dead) (this is of course expected and correct)
  3. Start listening on events. sudo curl --unix-socket /run/podman/podman.sock http://d/v1.24/events
  4. Notice that podman.service is activating (start). ((1) Never running though - this is one interesting observation)
  5. Wait for a bit to make sure it keeps streaming. After minute or two do some action and see it correctly logs it. (like podman run fedora and there are 6 events (from create to cleanup)) That is all correct.
  6. Stop the events streaming (by ctrl+C).
  7. Notice that podman.service never dies (2). It stays in activating (start) forever. (Well, at least for around 10minutes, not sure how this works, maybe there is some exponential backof?). I would guess that when no one is listening, the service should die.

Now, when I try to do the same with cockpit http client. (3)
Points 1. and 2. are the same. I create a new connection to /run/podman/podman.sock and call /events endpoint and attach streaming callback. It works for 5 seconds, and then it disconnects. It does not recognize that there is open connection. Looking into systemctl status podman I see it is in activating (start) for those 5 seconds and then goes back to inactive (dead). Streaming with Cockpit works fine when the podman system service does not die (started with -t 0).
I am not sure if this is problem in cockpit http client and in podman not recognizing connection.

If it does not ring any bells (mostly to @jwhonce), than I try to create some simple reproducer with Cockpit. Or if there is some way how I can provide some logs, please let me know.

EDIT: A few more info, how cockpit actually works:

  1. It creates cockpit.channel with options: {"method":"GET","path":"/v1.12/events?","superuser":true,"payload":"http-stream2","unix":"/run/podman/podman.sock"}. http-stream2 documentation.
@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Mar 24, 2020
@marusak
Copy link
Contributor Author

marusak commented Mar 26, 2020

och, so after some fun debugging I found reproducer that does not need cockpit in any way.

  1. systemctl enable --now podman.socket
  2. make sure that podman.service is inactive
  3. create executable file:
 $ cat podman_rep.sh 
sudo curl --unix-socket /run/podman/podman.sock http://d/v1.24/info
sudo curl --unix-socket /run/podman/podman.sock http://d/v1.24/libpod/images/json &
sudo curl --unix-socket /run/podman/podman.sock http://d/v1.24/events
  1. /sudo ./podman_rep.sh

See that the last command won't stay around and dies in 5 seconds. If you comment out the first or the middle command, it stays around forever and when you do some podman action it will log events.
(Every time you start it make sure that podman.service is not running).

I was only able to reproduce it that one command finished, then another started and before it finishes we start to listen on events. (Thats why the middle command is run in background so I can start to listen on events before it lists all images - it is great if you have a few images locally so it is not too fast:) )

@marusak
Copy link
Contributor Author

marusak commented Mar 26, 2020

@jwhonce or @baude can you please try this if it reproduces for you as well or if it is a very specific timing issue that is happening on my machine. Thanks!

@marusak
Copy link
Contributor Author

marusak commented Apr 8, 2020

@jwhonce were you able to reproduce this or find any problem that could cause this behaviour?

@jwhonce
Copy link
Member

jwhonce commented Apr 8, 2020

@marusak I was able to reproduce the issue with your snippet. I'm debugging issue now. This is one of those infamous "That should never happen issues." The debugging output shows the server exiting with a live connection. I'll be sure and tag this issue in the PR with the fix.

jwhonce added a commit to jwhonce/podman that referenced this issue Apr 9, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
jwhonce added a commit to jwhonce/podman that referenced this issue Apr 9, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
jwhonce added a commit to jwhonce/podman that referenced this issue Apr 9, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
jwhonce added a commit to jwhonce/podman that referenced this issue Apr 9, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
jwhonce added a commit to jwhonce/podman that referenced this issue Apr 13, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
snj33v pushed a commit to snj33v/libpod that referenced this issue May 31, 2020
* Move connection tracking into APIServer using ConnState()
* Remove Connection counters from CLI code
* Update events handler to support client not closing connection
* Improve logging messages

Fixes containers#5599

Signed-off-by: Jhon Honce <jhonce@redhat.com>
@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 23, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 23, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
kind/bug Categorizes issue or PR as related to a bug. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants