Ingester: fix wrong conversion of global into local limits #4238

duricanikolic · 2023-02-13T22:30:34Z

What this PR does

This PR fixes a bug introduced by issue #4208: conversion of global to local limits didn't take into account the actual number of ingesters in the zone of interest, but assumed that ingesters were equally distributed in each zone. As a direct consequence, scaling down ingesters in a certain zone caused the limits of that zone to be too low. This PR fixes that problem.
For example, let's assume the following:

global limit (GL) is 1M,
replication factor (RF) is 3,
zone count (ZC) is 3, for example zone-a, zone-b, zone-c
number of ingesters per zone (IA = IB = IC) is 10

With the previous approach local limit would be calculated, for all 3 zones, as

GL * RF / (IA + IB + IC) = 1M * 3 / 30 = 100K.

Similarly, the new approach would calculate the same result for each zone

GL * RF / ZC / IA = GL * RF / ZC / IB = GL * RF / ZC / IC = 100K,

Suppose that we now scale down ingesters in zone-a (IA) to 5. The previous approach would, again, assign the same local limit to all 3 zones to

GL * RF / (IA + IB + IC) = 1M * 3 / (5 + 10 + 10) = 120K,

which is wrong. The right limits are calculated by the new approach like this:

for zone-a: GL * RF / ZC / IA = 1M * 3 / 3 / 5 = 200K
for zone-b and zone-c: GL * RF / ZC / IB = GL * RF / ZC / IC = 1M * 3 / 3 / 10 = 100K .

Which issue(s) this PR fixes or relates to

Fixes #4208

Checklist

Tests updated
Documentation added
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

pracucci

Thanks for working on this! I left some initial comments, cause I think the logic is not correct in case of zone-aware replication enabled and shuffle sharding (shard size is set).

pkg/ingester/limiter.go

pkg/ingester/limiter_test.go

pkg/ingester/limiter.go

CHANGELOG.md

pracucci

The logic LGTM, thanks! I left few last nits 🙏 Since this logic is tricky and it's in the critical path, I would like another pair of 👀 : @pstibrany could you review it too, please?

CHANGELOG.md

pkg/ingester/limiter.go

duricanikolic · 2023-02-20T16:06:11Z

The logic LGTM, thanks! I left few last nits 🙏 Since this logic is tricky and it's in the critical path, I would like another pair of 👀 : @pstibrany could you review it too, please?

The logic is indeed tricky, and I tried to cover all possible cases with a separate test. If @pstibrany finds a case that is not covered by the logic, we need to add the corresponding test as well

pkg/ingester/limiter.go

pstibrany

Looks good, but I have a question about "healthiness", and suggestion for scale-up test cases.

pkg/ingester/limiter.go

pstibrany · 2023-02-21T08:59:23Z

pkg/ingester/limiter_test.go

+			ringZonesCount:              3,
+			healthyIngestersInZoneCount: 2, // in this zone ingesters are scaled down from 3 to 2
+			shardSize:                   0,
+			expectedValue:               450, // (900 / 3 zones / 2 healthy ingesters per zone) * 3 replication factor


Shouldn't this depend on whether distributors are allowed to extend writes or not? If one ingester is restarting (ie. it is unhealthy), do we want to temporarily increase limits in other two ingesters in the zone?

I am wondering if we should use "healthiness" concept at all when computing limits. Whether ingester is healthy or not depends on the operation, but we use two distinct write operations. (Update: "healthiness" doesn't change between Write and WriteNoExtend, but in general it depends on the operation.)

I am not sure I can answer this question. We would need @pracucci 's support. When I added the HealthyInstancesInZoneCount() function to Lifecycler, I initially counted all instances, but then @pracucci suggested that we should count only the healthy ones, so I implemented it this way.

Let’s consider an example. User has 1M series limit. We have 3 zones, each with 10 ingesters, and we use RF=3. So each ingester has 100k limit.

During rollout, we stop 5 ingesters at once in single zone. So there will be only 5 healthy ingesters left in the zone.
At this point limits at each remaining ingester in the same zone is recomputed to allow 200k series.
Mimir now always uses WriteNoExtend, so if user wasn’t hitting the 100k limit yet, this is fine. But if user was hitting 100k limit before, now we temporarily allow twice as many series for the user. After rollout in the zone finishes, per-ingester limit will be set back to 100k, but series will be in memory already.

Situation is similar when shuffle sharding is in use for the user. Let’s say user has 24 ingesters in total (ie. 8 per zone). Limit per ingester if all ingesters are healhty will be 125k. During rollout we restart 5 ingesters in one zone, and only 5 will be healthy. Since 5 is less than 8 requested ingesters from the zone, temporary limit on each ingester will be 200k again.

Note that shuffle shard doesn’t take healthiness into consideration, so we’re mixing healthy with ALL ingesters when doing the check.

Given that we always use WriteNoExtend now, I am not sure why we only consider healthy ingesters when computing the limit.

Talked to Peter on Slack. I think the reason why we're not seeing it happening is because the ring client skips the entire zone if there are unhealthy instances, so we don’t write at all to ingesters in the zone we’re rollout out.

However, I'm 👍 with the change proposed by Peter, to count all instances not just healthy ones, given it looks more correct to me too.

@pstibrany @pracucci Does it mean we need to do the following things?

update dskit by replacing its current HealthyInstancesInZoneCount() by InstancesInZoneCount() and by updating its implementation

once it is merged, update this PR and use the new InstancesInZoneCount() method instead? Beside some syntactical changes (replacing "healthyIngesters" with "ingesters"), I don't see other things that should be changed here.

That looks correct to me.

I think you also need to add InstancesCount() in dskit to use when the zone-aware replication is disabled.

@pstibrany @pracucci Could you please review grafana/dskit#270?

pkg/ingester/limiter_test.go

pstibrany

Thanks!

pkg/ingester/limiter.go

pstibrany · 2023-02-21T14:46:36Z

This PR needs to be rebased before merging.

pracucci · 2023-02-22T06:46:29Z

@duricanikolic Look at the PR's diff. As you can see there are changes not related to this PR. I think you merged instead of rebase. You should rebase main. Since you have a bunch of commits, the easiest way to rebase reducing conflicts may be squash all your commits into 1 commit, and then git rebase main.

Ingester: fix wrong conversion of global into local limits Upgrade CHANGELOG.md and add TODO in go.mod as a reminder to set the right version of dskit Implementing review findings Implementing review findings Implementing review findings Implementing review findings Upgraded to the latest dskit

duricanikolic · 2023-02-22T07:27:25Z

@duricanikolic Look at the PR's diff. As you can see there are changes not related to this PR. I think you merged instead of rebase. You should rebase main. Since you have a bunch of commits, the easiest way to rebase reducing conflicts may be squash all your commits into 1 commit, and then git rebase main.

Hi @pracucci,
I have updated my PR accordingly

pracucci · 2023-02-22T08:09:06Z

pkg/ingester/limiter.go

+	// Global limit is equally distributed among all the active zones.
+	// The portion of global limit related to each zone is then equally distributed
+	// among all the ingesters belonging to that zone.
+	return ((globalLimit * l.replicationFactor) / zonesCount) / ingestersInZoneCount


Non credo sia corretto fare le operazioni di divisione in integer, perchè ad ogni operazione perdi precisione. Questo è il motivo per cui prima effettuavamo queste operazioni in floating point e poi convertivano in int il risultato.

Sorry for writing in italian. Let me translate it:

I don’t think it’s correct to do the division operations in integer, because with each operation you lose precision. This is why we previously performed these operations in floating point and then converted the result to int.

This was requested by @pstibrany. I will revert it so we ensure that we don't lose the precision

translation:

I don't think it's correct to do the division operations in integer, because with each operation you lose precision. This is why we first performed these operations in floating point and then converted the result to int.

I suggested to either simplify that expression and remove floats, or add a test to show why floats are required. While I agree about loss of precision, I doubt it will have any meaningful effect here in practice.

@duricanikolic I think Peter feels stronger than me on this, so I would get stick to what Peter suggested previously (int division). I would just add a comment explaining why it's not an issue ("Max difference between using floats or not is 1 / ingestersInZoneCount").

A question for both @pracucci and @pstibrany: If we have a global limit 1000, replication factor 2 and number of zones 3, and let's say 1 ingester in that zone, should we return 666 or 667 as the result (1000 * 2 / 3)?

Answer is: "doeasn't matter". There's no practical difference between the two. Which is Peter's point. I think int divisions are always floored (like casting float to int) so probably it's 666.

@duricanikolic I think Peter feels stronger than me on this, so I would get stick to what Peter suggested previously (int division).

No need to revert back.

A question for both @pracucci and @pstibrany: If we have a global limit 1000, replication factor 2 and number of zones 3, and let's say 1 ingester in that zone, should we return 666 or 667 as the result (1000 * 2 / 3)?

Both are fine. Returning ceil would benefit user -- allow them to write one extra series per ingester.

OK, so today I am going to test this PR on mimir-dev-09, and once it is done I will let you know, so we can merge this.

pracucci

LGTM! (modulo the nit on precision, sorry for the back and forth on this)

duricanikolic requested a review from a team as a code owner February 13, 2023 22:30

duricanikolic requested a review from pracucci February 14, 2023 17:49

pracucci reviewed Feb 20, 2023

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

duricanikolic force-pushed the yuri/issue-4208 branch from f5cdc41 to 1161274 Compare February 20, 2023 14:33

duricanikolic requested a review from pracucci February 20, 2023 14:36

pracucci approved these changes Feb 20, 2023

View reviewed changes

CHANGELOG.md Outdated Show resolved Hide resolved

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

pracucci reviewed Feb 20, 2023

View reviewed changes

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

pstibrany reviewed Feb 21, 2023

View reviewed changes

duricanikolic requested a review from pstibrany February 21, 2023 10:14

pstibrany approved these changes Feb 21, 2023

View reviewed changes

pkg/ingester/limiter.go Outdated Show resolved Hide resolved

duricanikolic requested a review from a team as a code owner February 21, 2023 15:16

duricanikolic added 3 commits February 22, 2023 08:22

Implementing review findings

02470a9

Fixing a failing test

dbfbf09

duricanikolic force-pushed the yuri/issue-4208 branch from 60c922c to dbfbf09 Compare February 22, 2023 07:26

pracucci reviewed Feb 22, 2023

View reviewed changes

Improve precision of local limit calculation

64014c7

pracucci approved these changes Feb 22, 2023

View reviewed changes

osg-grafana approved these changes Feb 22, 2023

View reviewed changes

pracucci merged commit bca6826 into main Feb 22, 2023

pracucci deleted the yuri/issue-4208 branch February 22, 2023 20:08

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Ingester: fix wrong conversion of global into local limits #4238

Ingester: fix wrong conversion of global into local limits #4238

duricanikolic commented Feb 13, 2023

pracucci left a comment

pracucci left a comment

duricanikolic commented Feb 20, 2023

pstibrany left a comment

pstibrany Feb 21, 2023 •

edited

Loading

duricanikolic Feb 21, 2023

pstibrany Feb 21, 2023 •

edited

Loading

pracucci Feb 21, 2023

duricanikolic Feb 21, 2023 •

edited

Loading

pstibrany Feb 21, 2023

pracucci Feb 21, 2023

duricanikolic Feb 21, 2023

pstibrany left a comment

pstibrany commented Feb 21, 2023

pracucci commented Feb 22, 2023

duricanikolic commented Feb 22, 2023 •

edited

Loading

pracucci Feb 22, 2023

pracucci Feb 22, 2023

duricanikolic Feb 22, 2023

pstibrany Feb 22, 2023

pstibrany Feb 22, 2023 •

edited

Loading

pracucci Feb 22, 2023

duricanikolic Feb 22, 2023

pracucci Feb 22, 2023

pstibrany Feb 22, 2023

duricanikolic Feb 22, 2023

pracucci left a comment

Ingester: fix wrong conversion of global into local limits #4238

Ingester: fix wrong conversion of global into local limits #4238

Conversation

duricanikolic commented Feb 13, 2023

What this PR does

Which issue(s) this PR fixes or relates to

Checklist

pracucci left a comment

Choose a reason for hiding this comment

pracucci left a comment

Choose a reason for hiding this comment

duricanikolic commented Feb 20, 2023

pstibrany left a comment

Choose a reason for hiding this comment

pstibrany Feb 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pstibrany Feb 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

duricanikolic Feb 21, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pstibrany left a comment

Choose a reason for hiding this comment

pstibrany commented Feb 21, 2023

pracucci commented Feb 22, 2023

duricanikolic commented Feb 22, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pstibrany Feb 22, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

pracucci left a comment

Choose a reason for hiding this comment

pstibrany Feb 21, 2023 •

edited

Loading

pstibrany Feb 21, 2023 •

edited

Loading

duricanikolic Feb 21, 2023 •

edited

Loading

duricanikolic commented Feb 22, 2023 •

edited

Loading

pstibrany Feb 22, 2023 •

edited

Loading