
Mimir query engine: split LimitingPool #8909

Merged · 5 commits merged into main on Aug 6, 2024

Conversation

@charleskorn (Contributor) commented on Aug 6, 2024

What this PR does

This PR is a refactoring of LimitingPool, which was introduced in #8270.

LimitingPool previously served two purposes:

  • tracking the current estimated memory consumption of a query and enforcing the max memory consumption limit
  • pooling slices of objects

However, there's no need to couple these two things together, and in fact, doing so makes future changes (e.g. incorporating in-memory chunk bytes into the memory limit) more difficult.

This PR splits LimitingPool into two parts: MemoryConsumptionTracker and a number of global LimitingBucketedPools.

Unfortunately this introduced an import cycle between the types and pooling packages, so I've opted to merge them into the types package.

Performance is unchanged within the margin of error of our benchmarks.
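
As a rough sketch of the resulting shape (the method names, signatures, and pool interface below are illustrative assumptions, not the exact code from this PR):

package types

import "fmt"

// MemoryConsumptionTracker tracks the current estimated memory consumption of
// a single query and enforces the per-query limit. One instance exists per query.
type MemoryConsumptionTracker struct {
    maxEstimatedMemoryConsumptionBytes     uint64 // 0 means no limit
    CurrentEstimatedMemoryConsumptionBytes uint64
}

// IncreaseMemoryConsumption adds b bytes to the running estimate, failing if
// that would exceed the limit. (Illustrative signature.)
func (t *MemoryConsumptionTracker) IncreaseMemoryConsumption(b uint64) error {
    if t.maxEstimatedMemoryConsumptionBytes > 0 &&
        t.CurrentEstimatedMemoryConsumptionBytes+b > t.maxEstimatedMemoryConsumptionBytes {
        return fmt.Errorf("the query exceeded the maximum allowed estimated amount of memory consumed by a single query (limit: %d bytes)", t.maxEstimatedMemoryConsumptionBytes)
    }
    t.CurrentEstimatedMemoryConsumptionBytes += b
    return nil
}

// DecreaseMemoryConsumption removes b bytes from the running estimate.
func (t *MemoryConsumptionTracker) DecreaseMemoryConsumption(b uint64) {
    t.CurrentEstimatedMemoryConsumptionBytes -= b
}

// bucketedPool is a stand-in for the underlying size-bucketed slice pool.
type bucketedPool[S any] interface {
    Get(size int) S
    Put(s S)
}

// LimitingBucketedPool wraps a shared, query-agnostic pool and accounts every
// Get and Put against the tracker of the calling query.
type LimitingBucketedPool[S ~[]E, E any] struct {
    inner       bucketedPool[S]
    elementSize uint64 // estimated bytes per element, e.g. FPointSize
}

func (p *LimitingBucketedPool[S, E]) Get(size int, tracker *MemoryConsumptionTracker) (S, error) {
    s := p.inner.Get(size)
    if err := tracker.IncreaseMemoryConsumption(uint64(cap(s)) * p.elementSize); err != nil {
        p.inner.Put(s) // hand the slice straight back before reporting the error
        return nil, err
    }
    return s, nil
}

func (p *LimitingBucketedPool[S, E]) Put(s S, tracker *MemoryConsumptionTracker) {
    tracker.DecreaseMemoryConsumption(uint64(cap(s)) * p.elementSize)
    p.inner.Put(s)
}

Because all per-query state now lives in the tracker, the pools themselves carry no per-query state and can be shared globally across queries.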

Which issue(s) this PR fixes or relates to

(none)

Checklist

  • Tests updated.
  • [n/a] Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • [n/a] about-versioning.md updated with experimental features.

@charleskorn charleskorn marked this pull request as ready for review August 6, 2024 01:57
@charleskorn charleskorn requested a review from a team as a code owner August 6, 2024 01:57
@jhesketh (Contributor) left a comment

Mostly concerned about the use of globals; otherwise LGTM.

We could perhaps keep some kind of LimitingPool that allows us to reach into the MemoryConsumptionTracker for chunks etc. in the future.

The thread below is attached to this test excerpt:

_, err = p.Get(4, tracker) // this Get would push the query past the limit
expectedError := fmt.Sprintf("the query exceeded the maximum allowed estimated amount of memory consumed by a single query (limit: %d bytes) (err-mimir-max-estimated-memory-consumption-per-query)", limit)
require.ErrorContains(t, err, expectedError)
require.Equal(t, 8*FPointSize, tracker.CurrentEstimatedMemoryConsumptionBytes) // consumption is unchanged by the rejected Get
@jhesketh commented:

Should we reach into the pool and check the capacity of the slices here? Even though we reject the Get due to the limit, we still end up performing a Get and Put against the pool, so it's possible the slice has grown.

@charleskorn commented:

Not sure I'm following, sorry: are you suggesting we should check that the slice we took from the pool was returned before we returned the error?

@jhesketh commented:

Yes, although I'm not entirely sure there's a trivial way to do that.
Additionally, it's interesting to confirm that the capacity grew even though we didn't use it.

@charleskorn commented:

> Yes, although I'm not entirely sure there's a trivial way to do that.

One way to do it would be to introduce an interface for BucketedPool so we can mock it out in the test and confirm that the slice is returned, but that would introduce a performance overhead in the production code which I'd prefer to avoid.

The consequences of not returning the slice to the pool are relatively minor, especially given we don't expect to hit the limit too often.
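
For illustration, the kind of mock being discussed might look roughly like this (hypothetical names and types; none of this is code from the PR):

package types

// slicePool is a hypothetical interface abstracting the bucketed pool so a
// test double can be substituted; this indirection in the hot path is the
// performance overhead mentioned above.
type slicePool[S any] interface {
    Get(size int) S
    Put(s S)
}

// trackingPool is a hypothetical test double: it wraps a real pool and counts
// Gets and Puts, so a test could assert gets == puts (i.e. the slice taken
// during a rejected Get was returned before the error was surfaced).
type trackingPool[S any] struct {
    inner      slicePool[S]
    gets, puts int
}

func (p *trackingPool[S]) Get(size int) S {
    p.gets++
    return p.inner.Get(size)
}

func (p *trackingPool[S]) Put(s S) {
    p.puts++
    p.inner.Put(s)
}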

> Additionally it's interesting to confirm that the capacity grew even though we didn't use it.

Which capacity are you referring to here?

@jhesketh commented:

> Which capacity are you referring to here?

Unless I'm misunderstanding, the capacity of the underlying slice will be increased by the Get, but then not used because it is Put back. (I'm not sure there is much value in checking this here, tbh.)

@charleskorn commented:

Get creates a new slice (or takes one from the pool) with a particular capacity, and Put puts that slice back in the pool. If something else later needs a slice of the same capacity as the discarded slice, it'll be given that slice, unless GC collects it first.
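
A toy illustration of that round trip, using a plain sync.Pool rather than the actual BucketedPool:

package main

import (
    "fmt"
    "sync"
)

func main() {
    // A pooled slice keeps its capacity across a Put/Get round trip.
    p := sync.Pool{New: func() any { return make([]float64, 0, 4) }}

    s := p.Get().([]float64)
    fmt.Println(cap(s)) // 4
    p.Put(s[:0])        // discarded without being used; the capacity is retained

    s2 := p.Get().([]float64) // may be the same backing array, unless GC collected it first
    fmt.Println(cap(s2))      // 4
}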

pkg/streamingpromql/types/limiting_pool.go (review thread resolved)
@charleskorn charleskorn force-pushed the charleskorn/split-limitingpool branch from 1efc81b to 3575015 on August 6, 2024 06:33
@charleskorn charleskorn force-pushed the charleskorn/split-limitingpool branch from 3575015 to 22025b6 on August 6, 2024 06:34
@charleskorn charleskorn merged commit 4db1fd2 into main Aug 6, 2024
29 checks passed
@charleskorn charleskorn deleted the charleskorn/split-limitingpool branch August 6, 2024 06:52