
i3 freezes when ipc client becomes unresponsive #2280

Closed
cornerman opened this issue Apr 6, 2016 · 15 comments
Labels
4.12 bug reproducible A bug that has been reviewed and confirmed by a project contributor

Comments

@cornerman
Contributor

Output of i3 --moreversion 2>&- || i3 --version:

Binary i3 version: 4.12-21-g66d9c98 (2016-04-04, branch "next") © 2009 Michael Stapelberg and contributors
Running i3 version: 4.12-21-g66d9c98 (2016-04-04, branch "next") (pid 16730)
Loaded i3 config: /home/cornerman/.i3/config (Last modified: Wed 06 Apr 2016 01:41:05 PM CEST, 32 seconds ago)

The i3 binary you just called: /usr/bin/i3
The i3 binary you are running: i3

URL to a logfile as per http://i3wm.org/docs/debugging.html:

http://logs.i3wm.org/logs/5728116278296576.bz2

What I did:

Subscribe to an IPC event from within the statusline generator. Then switch to fullscreen mode or hide i3bar, so that the bar is hidden and the status command receives a SIGSTOP. Afterwards, trigger the subscribed IPC event a few times.

What I saw:

i3 freezes while trying to send the IPC event to the stopped client, which no longer reads new events. Here, write() returns EAGAIN because it would block on the full socket, so i3 ends up in an infinite retry loop.
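The failure mode can be reproduced locally without i3. The sketch below (not i3's actual code) shows a non-blocking write to a socket whose peer never reads: the kernel buffer fills and the write fails with EAGAIN, which is exactly the error a naive retry loop would spin on forever.

```python
import socket, errno

# Minimal sketch of the failure mode: the peer socket `b` stands in for the
# SIGSTOPped client that never reads, and `a` for i3's side of the IPC socket.
a, b = socket.socketpair()
a.setblocking(False)

sent = 0
err = None
try:
    while True:
        sent += a.send(b"x" * 4096)  # peer `b` never reads, so the buffer fills
except BlockingIOError as e:
    err = e.errno                    # EAGAIN (== EWOULDBLOCK on Linux)
a.close()
b.close()
```

A loop that retries the send on EAGAIN instead of giving up never terminates, which matches the observed freeze.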

What I expected instead:

i3 should keep running and should not depend on the health of its IPC clients. If sending an event fails, it should be ignored. Even though one could argue that the statusline generator might not be the right place to subscribe to events, the same freeze happens for any client that becomes unresponsive.

@i3bot i3bot added the 4.12 label Apr 6, 2016
@Airblader
Member

This is a duplicate of #1876, but we now have an actual case where this causes problems. And it sounds like a reasonable case to me. @stapelberg What do you think?

@Airblader Airblader added the bug label Apr 6, 2016
@stapelberg
Member

One solution might be to re-architect the status line generator in such a way that the part which receives IPC events from i3 is in a separate process that is never SIGSTOP'ed. I’d recommend that for now, simply because it’ll work right away.
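This recommended split can be sketched as follows (a hypothetical architecture for illustration, not i3bar's real code): a small child process does nothing but drain IPC events, so the writer side, standing in for i3, never blocks even while the main status-generator process is SIGSTOPped.

```python
import os, socket

# `i3_side` stands in for i3's end of the IPC socket, `reader_side` for the
# dedicated event-draining process that is never SIGSTOPped.
i3_side, reader_side = socket.socketpair()
pid = os.fork()
if pid == 0:
    # Child: the never-stopped reader; drains events until EOF, then exits.
    i3_side.close()
    while reader_side.recv(4096):
        pass
    os._exit(0)

reader_side.close()
for _ in range(500):                  # far more data than one socket buffer
    i3_side.sendall(b"event" * 1024)  # completes because the child drains it
i3_side.close()
os.waitpid(pid, 0)
writer_finished = True
```

Because the drain loop lives in its own process, SIGSTOPping the generator (the parent here) would no longer back-pressure the socket.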

With regards to addressing this in i3, I’m not sure what all the possible ways to address the issue are. Some thoughts, feedback welcome:

  • Just stopping the write in the middle of a message (and messages do span multiple write() calls, at least on OpenBSD) does not sound like a good strategy, as clients are likely not prepared to disregard incomplete messages and the protocol was not designed to be self-synchronizing.
  • Shutting down the socket to allow i3 itself to make progress might not be a great idea either. Long-running tools such as time-tracking applications which log window titles/focus switch events might just be temporarily slow, but catch up once the scheduler on the loaded machine gets around to scheduling the time-tracking application again.
  • Buffering the data in i3 might be feasible but only seems like a stop-gap to me. It doesn’t make sense to have the buffer grow without bounds (might trigger the OOM killer) and when the buffer is full, we have the same problem as before.

Did I miss an option?

@cornerman
Contributor Author

Yeah, fixing the statusline generator in this manner is definitely the way to go, as the event processing should not be stopped. Still, i3 should stay responsive.

You are right, the first two options would be pretty hard on the client. However, I think buffering all new events for each client would be overkill. How about only buffering the part of the last event message which could not be written completely? So, whenever an event should be dispatched to the client, we first try to send the rest of its incomplete message and only then proceed with sending new events. In this way, the client might miss some messages, but I guess that is okay.
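This proposal can be sketched as follows (names invented for illustration): per client, buffer only the unwritten tail of the last message; while that tail is pending, drop new events instead of queueing them.

```python
class Client:
    def __init__(self):
        self.pending = b""  # unwritten remainder of the last message

def dispatch(client, message, write):
    """write(data) -> number of bytes accepted (a non-blocking send)."""
    if client.pending:
        client.pending = client.pending[write(client.pending):]
        if client.pending:
            return False        # still blocked: drop this new event
    n = write(message)
    client.pending = message[n:]
    return True                 # event accepted; any unsent tail is kept

# Simulate a client whose socket accepts only 4 bytes per write.
c = Client()
ok1 = dispatch(c, b"event-one", lambda d: min(4, len(d)))  # partial write
ok2 = dispatch(c, b"event-two", lambda d: min(4, len(d)))  # dropped
```

The buffer is bounded by one message, so there is no unbounded growth; the cost is that events arriving while a tail is pending are lost.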

@stapelberg
Member

> How about only buffering the part of the last event message which could not be written completely? So, whenever an event should be dispatched to the client, we first try to send the rest of its incomplete message and only then proceed with sending new events. In this way, the client might miss some messages, but I guess that is okay.

I agree that this suggestion is better than the points I brought up earlier. But, the crucial detail is dropping messages silently: as I mentioned above, clients which rely on getting all messages (think time trackers) would be broken by such a change.

@cornerman
Contributor Author

True, but to me, dropping messages seems like a better option than blocking the whole WM. Maybe one could add a message signaling that events were missed, to make clients aware. But I am not sure whether this really helps.

Would it make sense to store the most recent message for each event type? So, the client at least knows the current state, which might help in some cases.
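The "newest message per event type" idea can be sketched like this (hypothetical names): older payloads of the same type are superseded, so an unblocked client catches up to the current state instead of replaying history.

```python
latest = {}

def queue_event(event_type, payload):
    latest[event_type] = payload      # replaces any older payload of this type

def flush(write):
    """Deliver the newest payload of each event type, then forget it."""
    for etype in list(latest):
        write(etype, latest.pop(etype))

queue_event("workspace", b"ws:1")
queue_event("workspace", b"ws:2")     # supersedes ws:1
queue_event("window", b"focus")
delivered = []
flush(lambda t, p: delivered.append((t, p)))
```

Memory use is bounded by the number of event types, but clients that need every intermediate event (such as the time trackers mentioned above) would still miss messages.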

@Airblader
Member

Can't we move the writing part of the IPC implementation into a separate process?

@cornerman
Contributor Author

But wouldn't this lead to a similar problem? Either block all IPC writing or buffer the failed messages, which would ultimately lead to dropping messages.

@Airblader
Member

We would be blocking, yes, but we wouldn't block the window manager.

Blocking seems reasonable to me. The client is stopped, after all. If the client wants to do something despite being blocked, it should read messages in a separate process as well.

@cornerman
Contributor Author

But there might be more clients, which would also be blocked.

@Airblader
Member

Hm. Yeah, valid point.

@Airblader Airblader added the reproducible A bug that has been reviewed and confirmed by a project contributor label Apr 7, 2016
@stapelberg
Member

I think there’s just no good option here. We’d need an option which is clearly better in order to justify changing the current behavior.

I’d recommend either not SIGSTOPping the process, or implementing a different policy for handling messages in a separate process which is not SIGSTOPped (think a proxy for i3ipc).
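The proxy policy can be sketched in a few lines (a hypothetical illustration, not an existing tool): the proxy, which is never SIGSTOPped, subscribes to i3 itself and applies its own policy toward the slow client. Here that policy is a bounded backlog that evicts the oldest events, so i3 never sees the client's slowness.

```python
from collections import deque

backlog = deque(maxlen=3)             # bound is arbitrary for this sketch

for event in ["e1", "e2", "e3", "e4", "e5"]:  # events arrive, client stalled
    backlog.append(event)             # deque silently evicts the oldest entry

delivered = list(backlog)             # what the client sees once it resumes
```

The key property is that the drop policy lives in the proxy, so neither i3 nor other IPC clients are affected by one unresponsive consumer.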

@pizdjuk

pizdjuk commented Jan 30, 2021

I think I ran into this issue using altdesktop/i3ipc-python#174

The bug is triggered a few times a day.

@orestisfl
Member

What's your i3 version?

@pizdjuk

pizdjuk commented Jan 30, 2021

i3 version 4.17.1 (2019-08-30)

@orestisfl
Member

orestisfl commented Jan 31, 2021 via email


6 participants