edf: adding a construction option with a given number of pre-picks #31592

adisuissa · 2024-01-02T22:11:48Z

Commit Message: edf: adding a construction option with a given number of pre-picks
Additional Description:
This is a follow up on #29953, with some insights about an additional scenario that has a similar problem.
The EDF scheduler is used in 2 levels of Zone-Aware LB policies:

1. For choosing the locality to use (here for healthy hosts). Note that, as opposed to the problem in (2), this also impacts equal-weighted localities.
2. For choosing the host within the chosen locality (here).

Prior suggestions (WRSQ, #29953) focused on fixing level (2), and required converting the weights to integer-precision before performing the LB policy.
While this may work for (2), assuming one can convert the weights to integers by multiplying by some factor and truncating, it is more challenging for (1): the "effective locality weight" depends on the over-provisioning factor and on the ratio of available hosts to all hosts. An additional benefit of the current approach is that it can also work with slow-start.
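To illustrate why integer conversion is awkward for (1), here is a minimal sketch of an effective locality weight computation. The function name, the parameter names, and the exact formula (the configured weight scaled by the availability ratio times the over-provisioning factor, capped at the configured weight) are assumptions made for illustration; they are not copied from Envoy's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: scales a configured locality weight by the share of
// available hosts, boosted by the over-provisioning factor (given in percent,
// so 140 means 1.4x) and capped at the full configured weight.
double effectiveLocalityWeight(uint32_t locality_weight, uint32_t available_hosts,
                               uint32_t total_hosts, uint32_t overprovisioning_factor_pct) {
  assert(available_hosts <= total_hosts);
  if (total_hosts == 0) {
    return 0.0;
  }
  const double availability = static_cast<double>(available_hosts) / total_hosts;
  const double ratio = std::min(1.0, availability * overprovisioning_factor_pct / 100.0);
  return locality_weight * ratio;
}
```

Under these assumptions, a locality with weight 10, one of three hosts available, and a factor of 140 gets a fractional weight of roughly 4.67, which no fixed multiplier turns into an integer for every availability ratio.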

Thus, this PR suggests a different mechanism to "perform" some random number of picks when creating the EDF scheduler. The creation process is split into two steps:

1. Estimate a lower bound on the number of times each entry will be picked, and initialize the internal EDF priority queue accordingly.
2. Perform up to N "pickAndAdd" operations, where N is the number of localities/hosts.

Note that this approach is ~equal to performing some P picks from the scheduler where P >> N (~equal up to double-precision computation differences and the order of entries with the same weight).
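The two steps above can be sketched roughly as follows. This is a simplified, hypothetical scheduler over entry indices, not the code in this PR: `TinyEdf`, `emulatedPicks`, and the use of `picks / weights_sum` as the virtual time are assumptions chosen to keep the example short, and it omits the weight callbacks and the handling of degenerate weights (all weights are assumed to be positive).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical, simplified EDF scheduler over entry indices (not Envoy's class).
class TinyEdf {
public:
  // Builds a scheduler whose state approximates one that already served `picks` picks.
  static TinyEdf createWithPicks(const std::vector<double>& weights, uint64_t picks) {
    TinyEdf edf(weights);
    if (weights.empty()) {
      return edf;
    }
    const double weights_sum = std::accumulate(weights.begin(), weights.end(), 0.0);
    assert(weights_sum > 0.0);
    // Virtual time at which about `picks` deadlines have expired in total.
    const double virtual_time = picks / weights_sum;
    uint64_t picks_so_far = 0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
      // Step 1: entry i has deadlines at 1/w, 2/w, ...; count those up to virtual_time.
      const uint64_t emulated = static_cast<uint64_t>(std::floor(virtual_time * weights[i]));
      picks_so_far += emulated;
      edf.queue_.push({(emulated + 1) / weights[i], i});
    }
    edf.emulated_picks_ = picks_so_far;
    // Step 2: perform the few remaining picks one after the other.
    for (; picks_so_far < picks; ++picks_so_far) {
      edf.pickAndAdd();
    }
    return edf;
  }

  // Pops the entry with the earliest deadline and re-adds it 1/weight later.
  std::size_t pickAndAdd() {
    assert(!queue_.empty());
    const Entry top = queue_.top();
    queue_.pop();
    queue_.push({top.deadline + 1.0 / weights_[top.index], top.index});
    return top.index;
  }

  // Number of picks that step 1 accounted for without touching the queue.
  uint64_t emulatedPicks() const { return emulated_picks_; }

private:
  struct Entry {
    double deadline;
    std::size_t index;
  };
  struct Later {
    // Earliest deadline ends up on top; ties go to the lower index.
    bool operator()(const Entry& a, const Entry& b) const {
      return a.deadline != b.deadline ? a.deadline > b.deadline : a.index > b.index;
    }
  };

  explicit TinyEdf(std::vector<double> weights) : weights_(std::move(weights)) {}

  std::vector<double> weights_;
  std::priority_queue<Entry, std::vector<Entry>, Later> queue_;
  uint64_t emulated_picks_ = 0;
};
```

For example, with weights {1, 3} and 10 picks, the virtual time is 10 / 4 = 2.5, so step 1 accounts for floor(2.5) + floor(7.5) = 9 picks and step 2 performs the single remaining one.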

After this PR, the next thing is to plumb the PRNG (or a random value) into the locality-scheduler creation process, and fix the call site for (1).

Here's a short doc providing the rationale behind this work.

Risk Level: low - code not used by Envoy's main codebase
Testing: Added tests to validate equivalence with prior approach.
Docs Changes: N/A
Release Notes:N/A - will be updated when the
Platform Specific Features: N/A

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

adisuissa · 2024-01-02T22:54:43Z

cc'ing all the interested stake-holders for the prior fix for comments/suggestions on this new approach
@tonya11en @wbpcode @nezdolik @htuch

adisuissa · 2024-01-03T21:43:02Z

Here's a short doc providing the rationale behind this work.

tonya11en · 2024-01-04T01:30:28Z

Thanks for the doc. This is one of those PRs I'll need to sit down and work through an example before grokking. I've got a block reserved for this tomorrow morning, so give me at least until then.

tonya11en · 2024-01-04T19:42:24Z

Alright, spent some time with it and think it's really clever. It makes sense how you optimize the rotation by skipping to some virtual time in the future and burning through whatever remainder is left.

I hacked together a "first pick" test similar to the ones done for the WRSQ approach and noticed that the selection probabilities are not quite what I expected. For example, with 2 hosts with weights {25, 75}, the observed probabilities are off and don't converge on the right answer as I increase the number of iterations in the test:

hosts: {1337, 1338}
weights: {25, 75}

[ RUN      ] EdfSchedulerTest.EdfHax
Testing first picks. iters=1000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 21.1%
	expected: 25%
	relative_error: -15.6%
1338:
	observed: 78.9%
	expected: 75%
	relative_error: 5.2%

Testing first picks. iters=10000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 20.36%
	expected: 25%
	relative_error: -18.56%
1338:
	observed: 79.64%
	expected: 75%
	relative_error: 6.18667%

Testing first picks. iters=100000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 20.351%
	expected: 25%
	relative_error: -18.596%
1338:
	observed: 79.649%
	expected: 75%
	relative_error: 6.19867%

Testing first picks. iters=1000000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 20.2892%
	expected: 25%
	relative_error: -18.8432%
1338:
	observed: 79.7108%
	expected: 75%
	relative_error: 6.28107%

The unrotated scheduler exhibits the behavior that motivated this change: it chooses the same element every time. What I find confusing is that the relative error doesn't shrink for the scheduler that does the rotations as the number of iterations grows.

I tried this with weights {1, 99} as well and see the same behavior:

[ RUN      ] EdfSchedulerTest.EdfHax
Testing first picks. iters=1000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 0.9%
	expected: 1%
	relative_error: -10%
1338:
	observed: 99.1%
	expected: 99%
	relative_error: 0.10101%

Testing first picks. iters=10000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 0.9%
	expected: 1%
	relative_error: -10%
1338:
	observed: 99.1%
	expected: 99%
	relative_error: 0.10101%

Testing first picks. iters=100000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 0.648%
	expected: 1%
	relative_error: -35.2%
1338:
	observed: 99.352%
	expected: 99%
	relative_error: 0.355556%

Testing first picks. iters=1000000
unrotated EDF:
1338: 100%

rotated EDF:
1337:
	observed: 0.6105%
	expected: 1%
	relative_error: -38.95%
1338:
	observed: 99.3895%
	expected: 99%
	relative_error: 0.393434%

Any ideas about what's going on here? I can look at this some more tomorrow if not. It's possible I'm doing something weird here, so if you want to take a look at the patch it's here: tonya11en@bd34ce7
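For readers following the numbers above, here is a small sketch of how the observed, expected, and relative_error figures can be derived from raw first-pick counts. It is not the linked patch: `FirstPickStats` and `summarizeFirstPicks` are hypothetical names, and the relative error is assumed to be (observed - expected) / expected, which matches the values printed in the logs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

struct FirstPickStats {
  double observed;       // Fraction of iterations in which the host was picked first.
  double expected;       // Host weight divided by the sum of all weights.
  double relative_error; // (observed - expected) / expected.
};

// Hypothetical helper: turns per-host first-pick counts into the report fields.
std::vector<FirstPickStats> summarizeFirstPicks(const std::vector<uint64_t>& first_pick_counts,
                                                const std::vector<double>& weights) {
  assert(first_pick_counts.size() == weights.size());
  const uint64_t iterations =
      std::accumulate(first_pick_counts.begin(), first_pick_counts.end(), uint64_t{0});
  assert(iterations > 0);
  const double weights_sum = std::accumulate(weights.begin(), weights.end(), 0.0);

  std::vector<FirstPickStats> stats;
  for (std::size_t i = 0; i < weights.size(); ++i) {
    const double observed = static_cast<double>(first_pick_counts[i]) / iterations;
    const double expected = weights[i] / weights_sum;
    stats.push_back({observed, expected, (observed - expected) / expected});
  }
  return stats;
}
```

With counts {211, 789} over 1000 iterations and weights {25, 75}, this yields relative errors of -15.6% and 5.2%, the same figures as the first run above.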

adisuissa · 2024-01-05T13:44:57Z

That's a good observation, and indeed strange.
Let me take a deeper look and add some tests that validate the correct behavior.
/wait-any

adisuissa · 2024-01-08T14:22:02Z

> Any ideas about what's going on here? I can look at this some more tomorrow if not. It's possible I'm doing something weird here, so if you want to take a look at the patch it's here: tonya11en@bd34ce7

@tonya11en thanks again for providing this test.
The underlying issue was that when one weight is a multiple of another (say 25 and 75), the emulated picks can cause some imbalance on the first pick.
I've solved it by augmenting the weights used, adding an epsilon to each weight, and added a test based on the one you provided.

phlax · 2024-01-09T09:56:23Z

/wait-any for further review

adisuissa · 2024-01-09T14:09:01Z

@phlax IIUC adding "wait-any" prevents the reviewers from receiving PR updates on the slack channel.
I think it should only be added when waiting for the PR author to comment, no?

phlax · 2024-01-09T14:12:07Z

@adisuissa you are probably right - i tend to use it when it's waiting for any comment rather than a code change

(fwiw i think we sorely need wait-review functionality)

tonya11en

GJ on fixing the first-pick skew! That was a weird one.

LGTM, modulo the minor comments I left.

tonya11en · 2024-01-09T18:37:08Z

source/common/upstream/edf_scheduler.h

+    EDF_TRACE("Emulated {} picks in init step, {} picks remaining for one after the other step",
+              picks_so_far, picks - picks_so_far);
+    for (; picks_so_far < picks; ++picks_so_far) {
+      scheduler.pickAndAdd(aug_calculate_weight);


Should you be passing in aug_calculate_weight or calculate_weight here? You've already perturbed the weights, so at this point you should use the original weight calculation, right?

tonya11en · 2024-01-09T18:44:30Z

test/common/upstream/edf_scheduler_test.cc

+  // should be used. If the number of weights is large, the number of iterations
+  // should be larger than 10000.
+  constexpr uint64_t iterations = 100000;


I'd make sure this is enough to avoid flakiness by running the test multiple times (I think --runs_per_test=1000). Or just set this to 1e6 and don't worry about it.

I think that having a single test that has many random values makes more sense in this case, because at the end of test, it compares the expected and observed values.

nezdolik

Thank you, this is great! Left some nits.

adisuissa · 2024-01-10T16:27:35Z

This is ready for another round of reviews, thanks!

nezdolik

Few more nits and ready to ship :)

adisuissa · 2024-01-12T13:57:32Z

Assigning senior maintainers to review.
This is still not plumbed in, but should be reviewed for potential impact in the future.
/assign @htuch @wbpcode

htuch

This is really clever. I want to make sure we can preserve the full reasoning for future readers as algorithms like this need a fast ramp for the uninitiated.

htuch · 2024-01-15T02:58:31Z

source/common/upstream/edf_scheduler.h

+                        }));
+
+    // Nothing to do if there are no entries.
+    if (entries.size() == 0) {


Can you refresh my memory on whether we already have an optimization here if all weights are equal somewhere earlier in the call chain?

adisuissa

I'll add the requested test.

adisuissa · 2024-01-16T04:01:34Z

source/common/upstream/edf_scheduler.h

+                        }));
+
+    // Nothing to do if there are no entries.
+    if (entries.size() == 0) {


The typical answer is "it depends":

In host selection: the case where all weights are equal is handled differently.

In locality selection: this case is not handled.

htuch · 2024-01-25T23:25:41Z

test/common/upstream/edf_scheduler_test.cc

+// equal to the weights. Trying the case of 2 weights between 0 to 100, in steps
+// of 0.001. This test takes too long, and therefore it is disabled by default.
+// If the EDF scheduler is enable, it can be manually executed.
+TEST_P(EdfSchedulerSpecialTest, DISABLED_ExhustiveValidator) {


Nit: Exhustive => Exhaustive

htuch

LGTM, thanks!

edf: adding a construction option with a given number of pre-picks

4cd9656

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

adisuissa assigned tonya11en Jan 3, 2024

repokitteh-read-only bot added the waiting:any label Jan 5, 2024

adisuissa added 2 commits January 8, 2024 14:17

augment weights calculation (add epsilon) for emulated picks creation

af8e578

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

Merge remote-tracking branch 'upstream/main' into fix_locality_schedu…

45bcf49

…ler_init Signed-off-by: Adi Suissa-Peleg <adip@google.com>

repokitteh-read-only bot removed the waiting:any label Jan 8, 2024

spelling

1410cca

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

repokitteh-read-only bot added the waiting:any label Jan 9, 2024

repokitteh-read-only bot removed the waiting:any label Jan 9, 2024

tonya11en approved these changes Jan 9, 2024

nezdolik reviewed Jan 9, 2024

adisuissa added 2 commits January 10, 2024 16:16

comments

1972625

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

more comments

e66b0c6

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

phlax assigned nezdolik Jan 11, 2024

nezdolik reviewed Jan 12, 2024

adisuissa added 3 commits January 12, 2024 13:45

spelling

18ae3c8

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

Merge remote-tracking branch 'upstream/main' into fix_locality_schedu…

c9f701d

…ler_init Signed-off-by: Adi Suissa-Peleg <adip@google.com>

addressing comments

1886d95

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

repokitteh-read-only bot assigned htuch Jan 12, 2024

repokitteh-read-only bot assigned wbpcode Jan 12, 2024

htuch reviewed Jan 15, 2024

adisuissa added 2 commits January 16, 2024 04:09

comments

3460a6b

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

Merge remote-tracking branch 'upstream/main' into fix_locality_schedu…

b12af26

…ler_init Signed-off-by: Adi Suissa-Peleg <adip@google.com>

adisuissa commented Jan 16, 2024

spelling

865f65f

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

nezdolik previously approved these changes Jan 18, 2024

validating

19cbe56

Signed-off-by: Adi Suissa-Peleg <adip@google.com>

adisuissa dismissed nezdolik’s stale review via 19cbe56 January 23, 2024 14:16

htuch reviewed Jan 25, 2024

htuch approved these changes Jan 25, 2024

htuch merged commit 45ab9cf into envoyproxy:main Jan 25, 2024
54 checks passed

This was referenced Jan 26, 2024

LB: introduce randomization in locality LB scheduler initialization #32075

Merged

lb-edf: fix lb initialization to choose from the correct set of weighted hosts #29953

Closed

adisuissa mentioned this pull request Feb 6, 2024

LB: fix randomization in host LB scheduler initialization #32233

Merged
