
[processor/interval] Persistence backed interval processor with support for multiple aggregation intervals #33949

lahsivjar opened this issue Jul 8, 2024 · 13 comments

@lahsivjar
Member

lahsivjar commented Jul 8, 2024

Component(s)

processor/interval

Is your feature request related to a problem? Please describe.

The current version of the interval processor performs in-memory aggregation for a single configured interval (more than one interval can be supported, at extra overhead, by adding new processor instances). The in-memory aggregations can be lost on a crash and can also cause high memory usage for high-cardinality aggregations.

Describe the solution you'd like

I propose database-backed aggregation supporting more than one interval. I have a working PoC for this here; it is rough around the edges and will require some work before it can be contributed, but it demonstrates the proposed idea. The PoC uses pebble-db, a Log-Structured Merge (LSM) tree database, and provides the option to aggregate over multiple intervals.

The currently linked PoC uses only the interval as a key in the database, which requires loading the full aggregated interval in memory. However, we could use more granular keys, or even simple hash-based partitioning on resource IDs, to reduce memory requirements further.

Describe alternatives you've considered

No response

Additional context

I would love to hear what the maintainers of the interval processor think about the proposal. If the maintainers are on board with the idea, I can clean up the PoC, bring it to feature parity with the current interval processor, and create a PR.

@lahsivjar lahsivjar added enhancement New feature or request needs triage New item requiring triage labels Jul 8, 2024
@lahsivjar lahsivjar changed the title Persistence backed interval processor with support for multiple aggregation intervals [processor/interval] Persistence backed interval processor with support for multiple aggregation intervals Jul 8, 2024
Contributor

github-actions bot commented Jul 8, 2024

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@djaglowski
Member

We have an established notion of a storage extension which allows components to persist data via an interface. Users can then choose the implementation of the interface (i.e. the type of storage extension) that meets their needs.

The use case you're describing seems reasonable, but can we please first consider whether it could work with a storage extension? There are several examples of components using the extension; this might be a relatively straightforward one.

@lahsivjar
Member Author

lahsivjar commented Jul 8, 2024

@djaglowski Thanks for looking over the issue.

We have an established notion of a storage extension which allows components to persist data via an interface

The current storage extensions would not be efficient for aggregations. The best we could do with the current storage extension would be to store each resource as a key-value pair and perform update-on-write manually: when a write arrives for an existing key, we would read the stored data, decode it, merge in the new value, and write it back to the DB. Pebble, on the other hand, can defer merges (aggregations) to its periodic compactions.

I think if we implement pebble as a new storage extension, we can use it for aggregations efficiently. WDYT, does that make sense?

@djaglowski
Member

I'm not opposed to a new pebble extension (though I don't have capacity to help make it happen). What I'm most curious about is whether the storage extension really needs to be used so aggressively. In other words, can we maintain aggregations in memory and periodically write the overall state to the storage extension (e.g. once per second)?

@lahsivjar
Member Author

lahsivjar commented Jul 9, 2024

I'm not opposed to a new pebble extension (though I don't have capacity to help make it happen).

Got it. I can do the heavy lifting here once we have a general direction, if that would work for you.

What I'm most curious about is whether the storage extension really needs to be used so aggressively. In other words, can we maintain aggregations in memory and periodically write the overall state to the storage extension (e.g. once per second)?

No, we don't need to use it so aggressively as to hit the database for each value we receive. In my PoC for using pebble for aggregations, I used a size-based batch buffer of 16 MB (ref) to keep the data in memory before flushing it to the actual DB. In addition, before harvesting an interval once it matures, we would flush all in-memory data to the DB.

One of the advantages of using something like pebble over a generic key-value DB is that it makes writes fairly straightforward, without much computational overhead for merging on the write path.

@RichieSams
Contributor

What I'm most curious about is whether the storage extension really needs to be used so aggressively. In other words, can we maintain aggregations in memory and periodically write the overall state to the storage extension (e.g. once per second)?

I had the same thought. Would it be fine to only store to the DB every X seconds or so, so we don't have to suffer the read, deserialize, write round trip for every operation?

@djaglowski
Member

I'm not opposed to a new pebble extension (though I don't have capacity to help make it happen).

Got it. I can do the heavy lifting here once we have a general direction, if that would work for you.

To clarify, I don't have capacity to sponsor (review and ultimately be responsible for) another component. It's still worth proposing formally though since you may find another sponsor. Otherwise you can always host it outside of contrib and people can pull it into their builds.

@lahsivjar
Member Author

I had the same thought. Would it be fine to only store to the DB every X seconds or so, so we don't have to suffer the read, deserialize, write round trip for every operation?

Hmm, I understand the point now. In the PoC I posted, I am still serializing the data and writing it to an in-memory batch; however, we can avoid the serialization/deserialization cost by performing some merges in memory and serializing only the partial aggregate. This makes sense to me and we could definitely do it.

To clarify, I don't have capacity to sponsor (review and ultimately be responsible for) another component. It's still worth proposing formally though since you may find another sponsor

Got it. Thanks for the clarification.

@lahsivjar
Member Author

so we don't have to suffer the read, deserialize, write roundtrip for every operation

A clarification on this: with pebble, in the hot path (the ConsumeMetrics step) we would not read and deserialize, but rather just perform serialization followed by a write. The new metric would be written as a merge operation. The deserialize, actual merge, and write would be performed during compaction or when we harvest the aggregated data.

@lahsivjar
Member Author

An update: I am working on building a pebble-based storage extension as per @djaglowski's earlier suggestion. The existing storage extension API is basic, and using it directly would hurt performance because it has: a) no support for range keys, and b) keys that are always strings. I will know more once I have made some progress with the code.

Contributor

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@github-actions github-actions bot added the Stale label Sep 10, 2024
@lahsivjar
Member Author

I have been away from this topic for a bit, but I am working on it again. Looking into the storage extension, I found that its current interface lacks support for the use case of aggregating with pebble. If we were to use the storage extension in its current state (with simple get/put/batch support), performance would be quite poor, as we require range operations on keys.
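The range-operation point above can be illustrated without pebble. With an ordered key space (as in an LSM store), harvesting one interval is a single scan over a key prefix, whereas a plain get/put interface with opaque string keys would have to track every key out of band. This stdlib-only sketch stands in for a prefix iterator; the key layout is an illustrative assumption:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// scanPrefix returns all keys sharing a prefix. An LSM store keeps keys
// sorted on disk, so this is a cheap range scan there; here we sort a
// slice to simulate that ordering.
func scanPrefix(keys []string, prefix string) []string {
	sort.Strings(keys)
	start := sort.SearchStrings(keys, prefix)
	var out []string
	for _, k := range keys[start:] {
		if !strings.HasPrefix(k, prefix) {
			break
		}
		out = append(out, k)
	}
	return out
}

func main() {
	keys := []string{"60/svc-a", "300/svc-a", "60/svc-b", "300/svc-b"}
	// Harvesting the 60s interval touches only keys under the "60/" prefix.
	fmt.Println(scanPrefix(keys, "60/"))
}
```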

I am currently working on benchmarks comparing the pebble approach with the current interval processor, to get some data.

@dashpole dashpole added needs triage New item requiring triage and removed needs triage New item requiring triage labels Oct 9, 2024
@atoulme atoulme removed the needs triage New item requiring triage label Oct 12, 2024
@github-actions github-actions bot added the Stale label Dec 11, 2024