LFC: using rolling hyperloglog for correct working set estimation #7466
See also: the implementation that proxy uses, here: https://github.com/neondatabase/neon/blob/e69ff3fc00ab8be31e8f69eb3726da1b83d84180/libs/metrics/src/hll.rs (cc @conradludgate)
I need to update my implementation to use rolling. Currently it's sample+clear, but rolling would protect from lost samples, although it smears the results so they have less temporal resolution.
Here is a paper about rolling-window HLL: https://hal.science/hal-00465313/document
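For reference, here is a minimal sketch of the sliding-window HyperLogLog described in that paper: instead of one register per bucket, each bucket keeps a "list of possible future maxima" (LPFM) of (timestamp, rank) pairs, so the cardinality over any recent window can be answered after the fact. This is illustrative only -- the bucket count, hashing, and estimator details are assumptions, not the LFC's or proxy's actual code:

```python
import hashlib
import math

P = 8                              # bits used to select a bucket
M = 1 << P                         # 256 buckets; std. error ~ 1.04/sqrt(M) ~ 6.5%
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard HLL bias-correction constant

def _hash64(item: str) -> int:
    # Any uniform 64-bit hash works; SHA-256 is used here for determinism.
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")

def _rank(w: int) -> int:
    # Position of the least-significant set bit (1-based): a geometric
    # random variable, as HyperLogLog requires.
    return (w & -w).bit_length() if w else 64 - P

class SlidingHLL:
    def __init__(self, max_window: float):
        self.max_window = max_window
        # One LPFM per bucket: (timestamp, rank) pairs with timestamps
        # increasing and ranks strictly decreasing.
        self.lpfm = [[] for _ in range(M)]

    def add(self, item: str, now: float) -> None:
        h = _hash64(item)
        bucket, w = h & (M - 1), h >> P
        r = _rank(w)
        # Prune entries that can never again be a window maximum: anything
        # older than the new entry with rank <= r, and anything that has
        # aged out of the maximum window.
        self.lpfm[bucket] = [(t, q) for t, q in self.lpfm[bucket]
                             if q > r and now - t <= self.max_window]
        self.lpfm[bucket].append((now, r))

    def estimate(self, window: float, now: float) -> float:
        # Recover the per-bucket maximum rank over the requested window,
        # then apply the usual HLL estimator.
        cutoff = now - window
        regs = [max((q for t, q in lst if t >= cutoff), default=0)
                for lst in self.lpfm]
        e = ALPHA * M * M / sum(2.0 ** -r for r in regs)
        zeros = regs.count(0)
        if e <= 2.5 * M and zeros:
            e = M * math.log(M / zeros)   # small-range (linear counting) correction
        return e
```

With this structure, a query for "unique pages in the last N seconds" is just `estimate(N, now)` for any N up to the configured maximum window, which is what makes the single-structure, multiple-windows design possible.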
I've chatted yesterday with @knizhnik about this -- he is now working on #7288 / https://github.com/neondatabase/cloud/issues/14202, and can take a look at this after (~ next week)
I will look for sliding-hyperloglog implementations. But first I want to clarify how we are going to use this rolling estimation of working set size. Do we need to know how many unique pages were accessed during, say, the last hour? And how can the answer to this question be interpreted? Assume that the compute has performed N get_page requests during the last hour and ~M of them are unique. Does it mean that the working set size is M? Actually not: if we set the LFC size to M, there is absolutely no guarantee that we can avoid most of the requests to the PS. My original assumption is that working set size estimation is needed to answer the question of whether we can fit the whole working set in the LFC or not. For most of our projects this is true. But certainly there can be projects whose working set does not fit on the local disk, no matter how large it is and how much space we can provide for the LFC.
Just want to explain my point of view more precisely.
The only case where estimation of working set size can help is choosing the LFC policy: should we try to keep the whole WS in the LFC, or, if that is not possible, should we just limit the size of the LFC depending on the number of tenants per node and other factors?
@knizhnik I think I understand where you're coming from -- IMO the ideas you mention are out of scope for the time being. Specifically:
This would be a substantial effort under the current architecture. Right now, we rely on LFC size being proportional to CU in order to keep disk usage low, and don't have the kind of node-level involvement that would make this more feasible. To provide further context:
Broadly, we plan to incorporate it into the scaling algorithm. We haven't yet decided what exact usage will look like -- expect both (a) an RFC about it, and (b) testing to refine it.

After considering this for a bit: we could expose the raw values of the buckets and let the autoscaler-agent aggregate them into a working set size estimation over an arbitrary time window (within however long it's been looking at the metrics). This is roughly how proxy's HLL-based metrics work. I think that approach will not work as well here -- it leaves the autoscaler-agent (or some other component?) responsible for storing the metrics, which could be an issue for intervals longer than a few minutes. The lack of pre-aggregation would also make observability for these estimates more difficult (both in internal dashboards and in customer-facing metrics).

So: IMO the LFC should expose metrics that are similar in spirit to load average -- basically, working set size over a small number of intervals of your choice. To keep it simple, we can start with just one (e.g., past 5 minutes?) and add more as we see fit, based on testing.
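For contrast, the "raw buckets" approach described above works because HyperLogLog sketches are mergeable: an external component could scrape one register snapshot per time slice and take the register-wise maximum to estimate the union of the working sets across any retained slices. A hedged Python sketch of that property (bucket count, hashing, and names are illustrative, not proxy's actual code):

```python
import hashlib
import math

P = 8                              # bits used to select a bucket
M = 1 << P                         # 256 buckets
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard HLL bias-correction constant

def registers(items):
    """Plain HLL registers for one time slice (e.g. one minute of page accesses)."""
    regs = [0] * M
    for item in items:
        h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
        bucket, w = h & (M - 1), h >> P
        rank = (w & -w).bit_length() if w else 64 - P
        regs[bucket] = max(regs[bucket], rank)
    return regs

def merge(snapshots):
    """Register-wise max merges sketches: the result estimates the union cardinality."""
    merged = [0] * M
    for regs in snapshots:
        merged = [max(a, b) for a, b in zip(merged, regs)]
    return merged

def estimate(regs):
    e = ALPHA * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if e <= 2.5 * M and zeros:
        e = M * math.log(M / zeros)   # small-range (linear counting) correction
    return e
```

The catch, as noted above, is that the scraper must retain one snapshot per slice for as long as the largest window it wants to answer for -- which is exactly the storage burden that pre-aggregated, load-average-style metrics avoid.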
Do we need an arbitrary window of time, or is an open-ended window from some point in time good enough?
## Problem

See #7466

## Summary of changes

Implement the algorithm described in https://hal.science/hal-00465313/document

A new GUC is added: `neon.wss_max_duration`, which specifies the size of the sliding window (in seconds); the default value is 1 hour. It is possible to request an estimation of the working set size (within this window) using the new function `approximate_working_set_size_seconds`. The old function `approximate_working_set_size` is preserved for backward compatibility, but its scope is also limited by `neon.wss_max_duration`. The version of the Neon extension is changed to 1.4.

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat the commit message to not include the above checklist.

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Matthias van de Meent <matthias@neon.tech>
In general, rename:

- lfc_approximate_working_set_size to
- lfc_approximate_working_set_size_seconds

For the "main" metrics that are actually scraped and used internally, the old one is just marked as deprecated. For the "autoscaling" metrics, we're not currently using the old one, so we can get away with just replacing it.

Also, for the user-visible metrics we'll only store & expose a few different time windows, to avoid making the UI overly busy or bloating our internal metrics storage. But for the autoscaling-related scraper, we aren't storing the metrics, and it's useful to be able to programmatically operate on the trendline of how WSS increases (or doesn't!) with window size. So there, we can just output datapoints for each minute.

Part of neondatabase/autoscaling#872. See also #7466.
Now that #8068 has been merged, is this complete?
No, we need to wait until it is deployed, wait one week, and then I will create another PR changing the default version to 1.4.
The current LFC uses HyperLogLog to estimate the working set size, which does not correctly deal with deletes and does not capture the working set size used within a window of time. The working set estimation could be made more accurate by using an algorithm such as rolling HyperLogLog.
(discussed briefly at the compute offsite @knizhnik, and cc @sharnoff)