LFC: using rolling hyperloglog for correct working set estimation #7466
See also: the implementation that proxy uses, here: https://github.com/neondatabase/neon/blob/e69ff3fc00ab8be31e8f69eb3726da1b83d84180/libs/metrics/src/hll.rs (cc @conradludgate)
I need to update my implementation to use rolling. Currently it's sample+clear, but rolling would protect from lost samples, although it smears the results so they have less temporal resolution.
Here is a paper about rolling-window HLL: https://hal.science/hal-00465313/document
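For reference, here is a minimal sketch of the sliding-window HyperLogLog described in that paper: instead of one register per bucket, each bucket keeps a "list of possible future maxima" (LPFM) of (timestamp, rank) pairs, so the cardinality over any recent window can be answered after the fact. This is illustrative only -- the bucket count, hashing, and estimator details are assumptions, not the LFC's or proxy's actual code:

```python
import hashlib
import math

P = 8                              # bits used to select a bucket
M = 1 << P                         # 256 buckets; std. error ~ 1.04/sqrt(M) ~ 6.5%
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard HLL bias-correction constant

def _hash64(item: str) -> int:
    # Any uniform 64-bit hash works; SHA-256 is used here for determinism.
    return int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")

def _rank(w: int) -> int:
    # Position of the least-significant set bit (1-based): a geometric
    # random variable, as HyperLogLog requires.
    return (w & -w).bit_length() if w else 64 - P

class SlidingHLL:
    def __init__(self, max_window: float):
        self.max_window = max_window
        # One LPFM per bucket: (timestamp, rank) pairs with timestamps
        # increasing and ranks strictly decreasing.
        self.lpfm = [[] for _ in range(M)]

    def add(self, item: str, now: float) -> None:
        h = _hash64(item)
        bucket, w = h & (M - 1), h >> P
        r = _rank(w)
        # Prune entries that can never again be a window maximum: anything
        # older than the new entry with rank <= r, and anything that has
        # aged out of the maximum window.
        self.lpfm[bucket] = [(t, q) for t, q in self.lpfm[bucket]
                             if q > r and now - t <= self.max_window]
        self.lpfm[bucket].append((now, r))

    def estimate(self, window: float, now: float) -> float:
        # Recover the per-bucket maximum rank over the requested window,
        # then apply the usual HLL estimator.
        cutoff = now - window
        regs = [max((q for t, q in lst if t >= cutoff), default=0)
                for lst in self.lpfm]
        e = ALPHA * M * M / sum(2.0 ** -r for r in regs)
        zeros = regs.count(0)
        if e <= 2.5 * M and zeros:
            e = M * math.log(M / zeros)   # small-range (linear counting) correction
        return e
```

With this structure, a query for "unique pages in the last N seconds" is just `estimate(N, now)` for any N up to the configured maximum window, which is what makes the single-structure, multiple-windows design possible.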
I've chatted yesterday with @knizhnik about this -- he is now working on #7288 / https://github.com/neondatabase/cloud/issues/14202, and can take a look at this after (~ next week)
I will look for sliding-hyperloglog implementations. But first I want to clarify how we are going to use this rolling estimation of working set size. Do we need to know how many unique pages were accessed during, say, the last hour? And how can the answer to this question be interpreted? Assume that the compute has performed N get_page requests during the last hour and ~M of them are unique. Does it mean that the working set size is M? Actually not: if we set the LFC size to M, there is absolutely no guarantee that we can avoid most of the requests to the PS. My original assumption is that working set size estimation is needed to answer the question of whether we can fit the whole working set in the LFC or not. For most of our projects this is true. But certainly there can be projects whose working set does not fit on the local disk, no matter how large it is and how much space we can provide for the LFC.
Just want to explain my point of view more precisely.
The only case where estimation of working set size can help is choosing the LFC policy: should we try to keep the whole WS in the LFC, or, if that is not possible, should we just limit the size of the LFC depending on the number of tenants per node and other factors?
@knizhnik I think I understand where you're coming from -- IMO the ideas you mention are out of scope for the time being. Specifically:
This would be a substantial effort under the current architecture. Right now, we rely on LFC size being proportional to CU in order to keep disk usage low, and don't have the kind of node-level involvement that would make this more feasible. To provide further context:
Broadly, we plan to incorporate it into the scaling algorithm. We haven't yet decided what exact usage will look like -- expect both (a) an RFC about it, and (b) testing to refine it.

After considering this for a bit: we could expose the raw values of the buckets and let the autoscaler-agent aggregate them into a working set size estimation over an arbitrary time window (within however long it's been looking at the metrics). This is roughly how proxy's HLL-based metrics work. I think that approach will not work as well here -- it leaves the autoscaler-agent (or some other component?) responsible for storing the metrics, which could be an issue for intervals longer than a few minutes. The lack of pre-aggregation would also make observability for these estimates more difficult (both in internal dashboards and in customer-facing metrics).

So: IMO the LFC should expose metrics that are similar in spirit to load average -- basically, working set size over a small number of intervals of your choice. To keep it simple, we can start with just one (e.g., past 5 minutes?) and add more as we see fit, based on testing.
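For contrast, the "raw buckets" approach described above works because HyperLogLog sketches are mergeable: an external component could scrape one register snapshot per time slice and take the register-wise maximum to estimate the union of the working sets across any retained slices. A hedged Python sketch of that property (bucket count, hashing, and names are illustrative, not proxy's actual code):

```python
import hashlib
import math

P = 8                              # bits used to select a bucket
M = 1 << P                         # 256 buckets
ALPHA = 0.7213 / (1 + 1.079 / M)   # standard HLL bias-correction constant

def registers(items):
    """Plain HLL registers for one time slice (e.g. one minute of page accesses)."""
    regs = [0] * M
    for item in items:
        h = int.from_bytes(hashlib.sha256(item.encode()).digest()[:8], "big")
        bucket, w = h & (M - 1), h >> P
        rank = (w & -w).bit_length() if w else 64 - P
        regs[bucket] = max(regs[bucket], rank)
    return regs

def merge(snapshots):
    """Register-wise max merges sketches: the result estimates the union cardinality."""
    merged = [0] * M
    for regs in snapshots:
        merged = [max(a, b) for a, b in zip(merged, regs)]
    return merged

def estimate(regs):
    e = ALPHA * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if e <= 2.5 * M and zeros:
        e = M * math.log(M / zeros)   # small-range (linear counting) correction
    return e
```

The catch, as noted above, is that the scraper must retain one snapshot per slice for as long as the largest window it wants to answer for -- which is exactly the storage burden that pre-aggregated, load-average-style metrics avoid.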
Do we need an arbitrary window of time, or is an open-ended window from some point in time good enough?
## Problem

See #7466

## Summary of changes

Implement the algorithm described in https://hal.science/hal-00465313/document

A new GUC is added: `neon.wss_max_duration`, which specifies the size of the sliding window (in seconds); the default value is 1 hour. It is possible to request an estimation of the working set size (within this window) using the new function `approximate_working_set_size_seconds`. The old function `approximate_working_set_size` is preserved for backward compatibility, but its scope is also limited by `neon.wss_max_duration`. The version of the Neon extension is changed to 1.4.

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? If so, did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat the commit message to not include the above checklist.

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Matthias van de Meent <matthias@neon.tech>
In general, rename:

- lfc_approximate_working_set_size to
- lfc_approximate_working_set_size_seconds

For the "main" metrics that are actually scraped and used internally, the old one is just marked as deprecated. For the "autoscaling" metrics, we're not currently using the old one, so we can get away with just replacing it.

Also, for the user-visible metrics we'll only store & expose a few different time windows, to avoid making the UI overly busy or bloating our internal metrics storage. But for the autoscaling-related scraper, we aren't storing the metrics, and it's useful to be able to programmatically operate on the trendline of how WSS increases (or doesn't!) with window size. So there, we can just output datapoints for each minute.

Part of neondatabase/autoscaling#872. See also #7466.
Now that #8068 has been merged, is this complete?
No, we need to wait until it is deployed, wait one week, and then I will create another PR changing the default version to 1.4.
The current LFC uses HyperLogLog to estimate the working set size, which does not correctly deal with deletes and does not capture the working set size used within a window of time. The working set estimation could be made more accurate by using an algorithm such as rolling HyperLogLog.
(discussed briefly at the compute offsite @knizhnik, and cc @sharnoff)