I'm using the `--redis_api` option to publish messages at high volume. I found that at about 5k messages a second on average, the `centrifugo.api` queue in redis just grew and grew, even though my 2 centrifugo instances had virtually no CPU usage. See this graph:

The problem is that centrifugo's publish API isn't batched. For each message we must make at least 2 complete round trips to redis, for `publish` and then the `addHistory` call, before we can move on to the next.
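To make that cost concrete, here is a rough redigo sketch of what handling a single queued publish looks like when there is only one consumer. This is not centrifugo's actual code: `publishOne` and the history key name are made up, and the real `addHistory` presumably also trims and expires the list. The point is only that the round trips happen sequentially before the next command can be popped.

```go
package sketch

import "github.com/gomodule/redigo/redis"

// publishOne mimics the per-message work described above when a single
// consumer goroutine handles the API queue: a PUBLISH round trip, then a
// separate history write, strictly one after the other, so throughput is
// bounded by Redis round-trip latency rather than by CPU.
func publishOne(conn redis.Conn, channel string, payload []byte) error {
	// Round trip 1: fan the message out to subscribers.
	if _, err := conn.Do("PUBLISH", channel, payload); err != nil {
		return err
	}
	// Round trip 2: append to the channel history (illustrative key name;
	// the real addHistory would also trim/expire the list).
	_, err := conn.Do("LPUSH", "centrifugo.history."+channel, payload)
	return err
}
```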
I managed to fix the problem in my case with a quick hack, which was just to put the receiving goroutine here: https://github.com/centrifugal/centrifugo/blob/master/libcentrifugo/engineredis.go#L156-L179 in a loop so there were 20 of them running. Now it keeps up fine with much more CPU used:

This works for me - now broadcasting an average of 5k messages a second for several days, with peaks of over 20k a second, and queue lengths rarely going above the single item that was just pushed.
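For reference, a minimal sketch of the shape of that hack, assuming redigo; `runAPIConsumer`, `apiQueue` and the other names are made up, not centrifugo's actual identifiers. The only point is running N blocking consumers on the shared `centrifugo.api` queue instead of one:

```go
package main

import (
	"log"

	"github.com/gomodule/redigo/redis"
)

const (
	apiQueue        = "centrifugo.api"
	numAPIConsumers = 20 // effectively 1 before the hack
)

// runAPIConsumer blocks on the shared API queue and handles commands one by one.
func runAPIConsumer(pool *redis.Pool, handle func(payload []byte)) {
	for {
		conn := pool.Get()
		// BLPOP returns [queue name, payload]; a timeout of 0 blocks forever.
		values, err := redis.ByteSlices(conn.Do("BLPOP", apiQueue, 0))
		conn.Close()
		if err != nil {
			log.Printf("api consumer error: %v", err)
			continue
		}
		if len(values) == 2 {
			handle(values[1])
		}
	}
}

func main() {
	pool := &redis.Pool{
		MaxIdle: numAPIConsumers,
		Dial:    func() (redis.Conn, error) { return redis.Dial("tcp", "127.0.0.1:6379") },
	}
	for i := 0; i < numAPIConsumers; i++ {
		go runAPIConsumer(pool, func(payload []byte) {
			// decode and execute the API command here
		})
	}
	select {} // block forever
}
```

Note that this is also where the ordering concern below comes from: with 20 consumers popping from the same queue, two publishes to the same channel can be handled out of order.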
Discussed this with @FZambia on gitter, and he was concerned that this breaks order of delivery for the single-node case.
So we came up with the following proposal to fix this issue cleanly without a full batching refactor (for now):
- Add a new option `redis_api_num_pub_chans`, which is default 1.
- Start one receiving goroutine per publish queue in `initalizeApi`, each processing its commands synchronously and in order.
- Keep the `centrifugo.api` queue as it is for any non-publish use-case that might exist.
- Push publish commands onto the `centrifugo.api.pub.[number]` queues instead (see the sketch below).
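As an illustration only, the publishing side could look roughly like this (names such as `pubQueueFor` are made up, and the shard-by-channel-hash choice is my assumption for keeping per-channel ordering, not something decided above). The engine side would then run one consumer goroutine per `centrifugo.api.pub.[number]` queue, each draining its own queue in order:

```go
package sketch

import (
	"fmt"
	"hash/fnv"

	"github.com/gomodule/redigo/redis"
)

const numPubChans = 4 // would come from the proposed --redis_api_num_pub_chans option

// pubQueueFor maps a channel name to one of the sharded publish queues so that
// all publishes to a given channel land on the same queue (and therefore stay
// in order, since each queue has exactly one consumer goroutine).
func pubQueueFor(channel string) string {
	h := fnv.New32a()
	h.Write([]byte(channel))
	return fmt.Sprintf("centrifugo.api.pub.%d", h.Sum32()%numPubChans)
}

// publish pushes an API command payload onto the shard for its channel; the
// untouched centrifugo.api queue stays available for non-publish commands.
func publish(conn redis.Conn, channel string, payload []byte) error {
	_, err := conn.Do("RPUSH", pubQueueFor(channel), payload)
	return err
}
```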
I intend to have a PR for this built tomorrow.