The log converter logic in pkg/stanza to convert from a stanza entry to a plog Log has a large memory footprint at scale.
From some pprof heap dumps, it appears the largest contributor is the copying of plog entries in the aggregationLoop here. I also observed that if the aggregationLoop gets blocked pushing its logs to the flushChan here, two copies of all the plogs currently in workerItems exist in memory at the same time.
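To illustrate the double-memory effect, here is a minimal stdlib-only sketch of the pattern described above. The types `logRecord`, `workerChan`, and `flushChan` are hypothetical stand-ins for the real stanza/pdata types, not the actual converter code:

```go
package main

// logRecord stands in for a converted plog record (hypothetical type,
// not the real pdata API).
type logRecord struct{ body string }

// aggregationLoop sketches the problematic pattern: workerItems are
// copied into a fresh batch before being pushed to flushChan. While the
// send blocks, both workerItems and the copied batch stay live, roughly
// doubling resident memory for all in-flight logs.
func aggregationLoop(workerChan <-chan []logRecord, flushChan chan<- []logRecord) {
	for workerItems := range workerChan {
		batch := make([]logRecord, len(workerItems))
		copy(batch, workerItems) // analogous to the copyTo hotspot in the heap profile
		flushChan <- batch       // if this blocks, two copies are alive at once
	}
}
```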
I wrote out a possible solution that combines the workerLoops and the aggregationLoop so that the stanza entries are converted directly into the final plog payload sent to the flushChan. This eliminates the need to aggregate plogs by resource in a separate goroutine. I ran a test against a file of 1 million logs, sending to GCP for 5 minutes, with two versions of the collector: the v0.70.0 release and a build with my condensed loop change. I've included the process memory graphs below, collected by the hostmetrics receiver.
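The condensed design can be sketched as follows. This is an illustrative stdlib-only outline, not the actual proposed patch; `entry` and `logRecord` are hypothetical stand-ins for stanza's `entry.Entry` and a converted plog record:

```go
package main

// entry and logRecord are hypothetical stand-ins for stanza's
// entry.Entry and a converted plog record.
type entry struct{ body string }
type logRecord struct{ body string }

// workerLoop sketches the condensed design: each worker converts its
// entries straight into the final payload and sends it to flushChan
// itself, so no separate aggregation goroutine ever makes a second
// copy of the batch.
func workerLoop(entries <-chan []entry, flushChan chan<- []logRecord) {
	for batch := range entries {
		out := make([]logRecord, 0, len(batch))
		for _, e := range batch {
			out = append(out, logRecord{body: e.body}) // convert in place
		}
		flushChan <- out // single allocation per batch, no intermediate copy
	}
}
```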
The condensed loop change uses significantly less memory over the same period of time. I've also included the pprof SVG images that were collected at the end of each run. These show that the copyTo hotspot has been eliminated and less memory is used.
Given these performance increases I'd like to propose opening a PR with the condensed loop change.
Here is the first pass of the code I wrote to condense the loops.
pprof images:
Collector v0.70.0:
Condensed loop:
I'm very much in favor of removing complexity from the converter.
It's not clear to me that we really should be batching within the package anyway. We removed all user-facing configuration before declaring filelog as beta. #16452 (comment) is potentially another indication that we would be better off emitting immediately.
In any case, I'd be happy to review a PR for your changes.