fix: changed internal architecture to avoid high memory usage #33
base: master
Conversation
Signed-off-by: Gabriele Baldoni <gabriele.baldoni@gmail.com>
So from what I gather from a high-level architecture point of view, this polls the queue at a configured interval and sends all available data to Loki. Can you explain why you chose this design? To my understanding, this also means that if more than 512 log messages (the channel capacity) are produced between polls, the excess messages are dropped.

Previously, the background task sent off logs immediately when they were produced, if nothing else was currently being sent, or immediately after the current send finished. This means that the data was stored in the channel only briefly.

The PR (currently) doesn't do error checking of the HTTPS request, drops outstanding messages when the quit message is received, and removes the exponential backoff, all of which can be fixed once we figure out the correct architecture. I'm going to try to figure out why you see the increased memory usage with the old design…
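The polling design under discussion can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the real task uses tokio and flume, while std's channel and `thread::sleep` are used here so the sketch is self-contained, and the `Msg`/`drain_loop` names are invented.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical message type: either a log line or the quit signal
// mentioned in the discussion.
enum Msg {
    Log(String),
    Quit,
}

// Sketch of the polling design: sleep for the configured period, then
// drain the whole queue and ship everything found as one batch.
fn drain_loop(rx: mpsc::Receiver<Msg>, period: Duration) -> Vec<Vec<String>> {
    let mut batches = Vec::new();
    loop {
        thread::sleep(period);
        let mut batch = Vec::new();
        while let Ok(msg) = rx.try_recv() {
            match msg {
                Msg::Log(line) => batch.push(line),
                // Returning here drops anything queued after the quit
                // message, one of the open issues called out in review.
                Msg::Quit => {
                    if !batch.is_empty() {
                        batches.push(batch);
                    }
                    return batches;
                }
            }
        }
        if !batch.is_empty() {
            // In the real task this is where the batch would be POSTed
            // to Loki; here we just collect it for inspection.
            batches.push(batch);
        }
    }
}
```

Note that with this shape, anything produced faster than the queue can absorb between two ticks has nowhere to go, which is the 512-message concern above.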
(Sorry, I seem to have forgotten to press the send button.)
Yes, we can log that they are dropped, but not much more than that. Both the channel size and the sleep period can be configured, so it's up to the user to tune them for their use case.
If there is an HTTP error and we keep the messages, memory will grow, so sooner or later they need to be dropped; better to drop them immediately, and an error log can be emitted.
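The drop-and-log policy described here can be sketched with a bounded queue. This is a hypothetical illustration assuming a flume-style bounded channel, with std's `sync_channel` standing in and an invented `enqueue` helper that is not part of either crate:

```rust
use std::sync::mpsc;

// Hypothetical producer-side helper: try to enqueue a log line without
// blocking; if the bounded queue is full, drop the line and count it so
// an error log can be emitted instead of buffering.
fn enqueue(tx: &mpsc::SyncSender<String>, line: String, dropped: &mut u64) {
    if let Err(mpsc::TrySendError::Full(_)) = tx.try_send(line) {
        // Keeping the message here would let memory grow without bound
        // while the endpoint is erroring, so it is dropped immediately.
        *dropped += 1;
    }
}
```

The design trade-off is explicit: back-pressure is converted into loss, and the `dropped` counter is the only trace of it.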
Well, it seems that messages were never dropped even after they were sent; not sure why, to be honest. If you are able to fix the memory issues with the current (main) architecture, I'm fine with testing it.
Basically, I'm interested in why you chose this particular design, to understand whether it's worthwhile changing to it. E.g. I wouldn't change the design solely to fix a bug in the old implementation; I'd prefer to fix the implementation instead.
It's more straightforward compared to the current (main) design; that's the only reason.
Based on the findings from #9.
RAM usage (10 minutes, same load on the process, same log level):

before: [screenshot]
logs on Loki: [screenshot]

after: [screenshot]
logs on Loki: [screenshot]

It's around 1/10th with this patch, for the same amount of data pushed to Loki.
- I replaced `tokio::mpsc` with `flume` because the latter does not take `&mut self` in `try_recv()`.
- The backoff time is configurable; the channel size is configurable too, still via the builder.
- Introduced `task::spawn(task.start())` instead of `task::spawn(task)`, which is an API-breaking change, but I think I can make it compatible; it's just a matter of playing a bit with the builder and a `BackgroudTaskFuture` to deal with it.
- I can provide macOS Instruments traces if needed; I did not upload them as they are 100s of MBs.