Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort By Counters or Lists during GameServerAllocation 2716 #3091

Merged
merged 4 commits into from
Apr 20, 2023

Conversation

igooch
Copy link
Collaborator

@igooch igooch commented Apr 12, 2023

What type of PR is this?
/kind feature

What this PR does / Why we need it:

  • Adds Priorities "Counter" and "List" that allow Game Servers to be sorted by any Counters or Lists for allocation in a Descending or Ascending manner, where Descending (3, 2, 1, 0, nil) is the default.
  • The Counter or List sort sorts by the first Priority in the Priorities list. Any subsequent Priority(s) is only used as a tie-breaker in the order in which they are in Priorities list.
  • Sorting by Counters or Lists at Allocation happens only as a tie-breaker for equal game servers after preferring game servers for state Allocated, being a node that is most allocated, and PlayerTracking if that feature is enabled.
  • Updates the final tie-breaker for sorting to sort by Game Server name instead of NodeName.

Which issue(s) this PR fixes:
Working on #2716

Special notes for your reviewer:

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 08c46189-2374-438e-afaa-35de0518b5da

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.31.0-db9fff6-amd64

pkg/apis/allocation/v1/gameserverallocation.go Outdated Show resolved Hide resolved
pkg/apis/allocation/v1/gameserverallocation.go Outdated Show resolved Hide resolved
pkg/gameserverallocations/allocation_cache.go Outdated Show resolved Hide resolved
pkg/gameserverallocations/allocation_cache.go Outdated Show resolved Hide resolved
pkg/gameserverallocations/allocation_cache.go Outdated Show resolved Hide resolved
pkg/gameserverallocations/allocator.go Outdated Show resolved Hide resolved
@@ -40,6 +40,7 @@ func findGameServerForAllocation(gsa *allocationv1.GameServerAllocation, list []
var loop func(list []*agonesv1.GameServer, f func(i int, gs *agonesv1.GameServer))

// packed is forward looping, distributed is random looping
// TODO: Discuss if we change findGameServerForAllocation so all strategies use forward looping for the the pre-sorted list for each strategy once FeatureCountsAndLists gate is graduated.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the new version, I think we can throw most of this complexity away, and do simple search and filter -- we don't need this fancy looping construct anymore, since it will be pre-sorted / pre-randomised.

(Also can then also do a duplicated function as well for counts and lists)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we have three paths A) Sort by Allocated and then loop through list. B) Sort by Allocated, then randomly shuffle list, and then loop through randomized list. C) Sort by CountsAndLists and then loop through list.

Where/when should I be randomizing? Should I make a 4th path D) Sort by CountsAndLists, then randomly shuffle list, and then loop through randomized list? Or are you suggesting a different set of paths?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought is this - for this part of the code -- the list should already be appropriately sorted when it is passed to it. So it should be a for .. range operation.

Then for each strategy:

Distributed

  1. Randomise first (so that will give us a random distribution within each sorted Allocated/Ready section of the list we have)
  2. Sort by:
  3. Allocated
  4. Count/List as appropriate

Question on the above: Does K8s List already give us a generally random order of GameServers, such that we don't need to bother shuffling it? I'm actually not sure! May be worth checking.

Packed
Sort by:

  1. Allocated First
  2. Packed Strategy (basically what we have now)
  3. Tie break with counts and lists

I hope that makes things clearer?

Copy link
Collaborator Author

@igooch igooch Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why randomize before sorting? I thought we had the entire list of Ready/Allocated game servers here, and not a subset?

If we're keeping our current definitions of Packed and Distributed, then it seems to make more sense to more or less keep the flow as-is:

  1. Sort by Allocated
    Tie break sort by number of Ready servers on the Node
    Tie break sort by Counts and Lists
    Tie break sort by Game Server Name <- Separate discussion on this as it currently tie breaks on NodeName
  2. If Distributed, randomize list
  3. Loop through list to find matching Game Servers (original sorted list for Packed, randomized list for Distributed)

A possible alternative would to first check for matches before sorting the Game Servers, and only sort for the Packed case:

  1. Filter List of Game Servers for Matches
  2. Return unsorted list of Matches for Distributed (not completely random, could be randomized before returning)
  3. Sort List (same as above) and return sorted list of Matches for Packed

Another possible alternative would to first check for matches before sorting the Game Servers, and then sort for both cases then randomize for distributed:

  1. Filter List of Game Servers for Matches
  2. Sort List (same as above)
    Return sorted list of Matches for Packed
    Randomize then return list of Matches for Distributed

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One benefit to do sorting before matching is that we can return fast when we find the first match. If we do matching first, we have to go though all the GameServers to find out the ones that match, then perform sorting for them, and then return the first one. Though Mathematically speaking there is no difference on the time complexity (they both are O(nlogn) for sorting and O(n) for matching).

I kind of agree on Mark, the sorting/randomization should happen in ListSortedGameServers. This gives us a central place to maintain different sorting logic (random is also a kind of sorting) for different scheduling strategy, and use the findGameServerForAllocation purely to pick the first GameServer that match the selectors from the sorted GameServer list. What I'm not quite convinced is whether we should apply the sorting based on Allocated and Count/List after the randomize for distributed strategy. Right now, for distributed strategy, it looks like the although we perform the sorting based on Allocated and player tracking in ListSortedGameServers, we shuffle the list eventually in findGameServerForAllocation. I.e. it's a totally randomize without any sorting. So instead of "Randomize first -> Sort by: Allocated and Count/List as appropriate", maybe just a randomize in ListSortedGameServers is good enough for distributed strategy. This also fit our definition of Distributed strategy better. ("Distributed" is aimed at static Kubernetes clusters, wherein we want to distribute resources across the entire cluster)

Another thought may outside of the scope. Can we be more smart for Distributed strategy? For example, Instead of using randomize to achieve distributed, can we sort on the GameServer count per node and pick the one node with least GameServers on it? I know a lot of other complexity will involve for the other allocation criteria, but just want to call out this basic thoughts which can make the distributed allocation deterministic. Distributed should not equal to Random, and Random sounds like a different placement/allocation strategy.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the above being said, I think the current findGameServerForAllocation totally works fine. We can leave all the refactoring outside this PR.

@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch 2 times, most recently from f2f7fd7 to ca0fce5 Compare April 14, 2023 21:27
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 09c52bde-1174-4ff8-8fb3-fd750e99cda5

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.31.0-ca0fce5-amd64

})
}

if !(p.Order == Ascending || p.Order == Descending || p.Order == "") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming Order will be defaulted, so checking for "" is probably not something we need to do? (or want to if we have defaulting)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't done any defaulting. Where in the code should I do defaulting for this?

pkg/apis/allocation/v1/gameserverallocation.go Outdated Show resolved Hide resolved
pkg/apis/allocation/v1/gameserverallocation.go Outdated Show resolved Hide resolved
pkg/gameserverallocations/allocation_cache.go Show resolved Hide resolved
pkg/gameserverallocations/allocation_cache.go Outdated Show resolved Hide resolved
pkg/apis/allocation/v1/gameserverallocation.go Outdated Show resolved Hide resolved
@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch 2 times, most recently from f267cb1 to 8418791 Compare April 17, 2023 17:17
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 5b55855c-a9ec-464c-818a-f783ef3a664a

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 08c0fe60-14a7-4e24-95fd-9bd91c156e3b

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from 8418791 to e3a669f Compare April 17, 2023 17:36
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 8f81d742-e8e7-46be-80aa-64408fcc408a

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.31.0-e3a669f-amd64

@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from e3a669f to 924c6f1 Compare April 18, 2023 16:42
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 2b31ade1-b986-4ffc-a90c-1076463eeb31

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from 924c6f1 to 8895853 Compare April 18, 2023 17:06
// finally sort lexicographically, so we have a stable order
return gs1.Status.NodeName < gs2.Status.NodeName
return gs1.GetObjectMeta().GetName() < gs2.GetObjectMeta().GetName()
Copy link
Collaborator Author

@igooch igooch Apr 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markmandel I changed this to the ObjectMeta.Name instead of the Status.NodeName because the NodeName is not unique, so this didn't create a stable order. (I discovered this during test TestAllocationCacheListSortedGameServers as gs2 and gs4 are the same when comparing by NodeName.)

This change assumes that both 1) all Game Servers have ObjectMeta.Name, and 2) All ObjectMeta.Name are unique. Are these correct assumptions? If those assumptions are not correct, then this change would also not create a stable order. Would it be better to use the ObjectMeta.UID? Or would it be better to leave things as-is? Or should we leave it out for the sake of not making changes that are not behind a feature gate?

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: fd8bdc9a-849d-423b-857d-eebda4397fc9

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.31.0-8895853-amd64

@igooch igooch changed the title Arbitrary counts lists 2716 Sort By Counters and Lists during GameServerAllocation 2716 Apr 19, 2023
@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from 4864d08 to ac45472 Compare April 19, 2023 16:40
@igooch igooch marked this pull request as ready for review April 19, 2023 16:40
@igooch igooch changed the title Sort By Counters and Lists during GameServerAllocation 2716 Sort By Counters or Lists during GameServerAllocation 2716 Apr 19, 2023
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: ccc80805-9b8d-48e1-aa47-8c916fa1dadf

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from ac45472 to 6ca5667 Compare April 19, 2023 17:01
@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: 7157a452-c1e2-4d0b-9bc8-5068ad327887

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 48802fd3-b1aa-475d-8698-0d958111f929

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.32.0-6ca5667-amd64

Copy link
Collaborator

@gongmax gongmax left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add unit test coverage for the new validate logic added in gameserverallocation.go?

// finally sort lexicographically, so we have a stable order
return gs1.Status.NodeName < gs2.Status.NodeName
return gs1.GetObjectMeta().GetName() < gs2.GetObjectMeta().GetName()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -40,6 +40,7 @@ func findGameServerForAllocation(gsa *allocationv1.GameServerAllocation, list []
var loop func(list []*agonesv1.GameServer, f func(i int, gs *agonesv1.GameServer))

// packed is forward looping, distributed is random looping
// TODO: Discuss if we change findGameServerForAllocation so all strategies use forward looping for the the pre-sorted list for each strategy once FeatureCountsAndLists gate is graduated.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One benefit to do sorting before matching is that we can return fast when we find the first match. If we do matching first, we have to go though all the GameServers to find out the ones that match, then perform sorting for them, and then return the first one. Though Mathematically speaking there is no difference on the time complexity (they both are O(nlogn) for sorting and O(n) for matching).

I kind of agree on Mark, the sorting/randomization should happen in ListSortedGameServers. This gives us a central place to maintain different sorting logic (random is also a kind of sorting) for different scheduling strategy, and use the findGameServerForAllocation purely to pick the first GameServer that match the selectors from the sorted GameServer list. What I'm not quite convinced is whether we should apply the sorting based on Allocated and Count/List after the randomize for distributed strategy. Right now, for distributed strategy, it looks like the although we perform the sorting based on Allocated and player tracking in ListSortedGameServers, we shuffle the list eventually in findGameServerForAllocation. I.e. it's a totally randomize without any sorting. So instead of "Randomize first -> Sort by: Allocated and Count/List as appropriate", maybe just a randomize in ListSortedGameServers is good enough for distributed strategy. This also fit our definition of Distributed strategy better. ("Distributed" is aimed at static Kubernetes clusters, wherein we want to distribute resources across the entire cluster)

Another thought may outside of the scope. Can we be more smart for Distributed strategy? For example, Instead of using randomize to achieve distributed, can we sort on the GameServer count per node and pick the one node with least GameServers on it? I know a lot of other complexity will involve for the other allocation criteria, but just want to call out this basic thoughts which can make the distributed allocation deterministic. Distributed should not equal to Random, and Random sounds like a different placement/allocation strategy.

pkg/gameserverallocations/allocator.go Outdated Show resolved Hide resolved
@igooch igooch force-pushed the arbitrary-counts-lists-2716 branch from 6ca5667 to e8d646b Compare April 19, 2023 21:27
@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 381d855b-36ed-4196-8c03-ee407227bcae

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.32.0-e8d646b-amd64

@agones-bot
Copy link
Collaborator

Build Failed 😱

Build Id: cea29962-d9b3-40b6-ad40-a43196ea4e1f

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Copy link
Collaborator

Build Succeeded 👏

Build Id: 930ad8cc-c190-42ef-8d4d-930430cc2f6d

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3091/head:pr_3091 && git checkout pr_3091
  • helm install agones ./install/helm/agones --namespace agones-system --agones.image.release=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.32.0-264e9cd-amd64

Copy link
Member

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good set of initial changes 👍🏻 LGTM!

// finally sort lexicographically, so we have a stable order
return gs1.Status.NodeName < gs2.Status.NodeName
return gs1.GetObjectMeta().GetName() < gs2.GetObjectMeta().GetName()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those assumptions are correct - so this sounds like a good change 👍🏻

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gongmax, igooch, markmandel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@markmandel markmandel merged commit 5911f0c into googleforgames:main Apr 20, 2023
@Kalaiselvi84 Kalaiselvi84 added the kind/feature New features for Agones label May 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants