Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement Phoenix Network Coordinates #854

Merged
merged 14 commits into from
Feb 29, 2024
Merged

Implement Phoenix Network Coordinates #854

merged 14 commits into from
Feb 29, 2024

Conversation

XAMPPRocky
Copy link
Collaborator

@XAMPPRocky XAMPPRocky commented Nov 15, 2023

This commit uses the previous commit of datacentre discovery with an
implementation of Phoenix Network Coordinates1, to build a
decentralised network coordinate map of all the datacentres that the
proxies know about, using that information, we expose a new HTTP server
at the same QCMP port that a client can use to query what datacentres
are the closest to the proxy.

For identifying datacentres we use ICAO codes, this allows us to
colnsildate multiple logical datacentres into a single location, for
example if you have multiple zones allocated in the same physical
datacentre, in order to reduce the amount of options a client has to
pick from in that case they're consolidated under their ICAO code.

This allows for clients to get their full latency to datacentres which
allows operators to a allocate datacentres dynamically, and in the
closest location available to players.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

1 similar comment
Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

5 similar comments
Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link
Contributor

@markmandel markmandel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome - passing tests!

Would love some docs - this looks very cool 😄

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

@XAMPPRocky
Copy link
Collaborator Author

Well, works as in, all tests pass locally for me, I have no idea why relay_routing fails in CI.

Seems like that is still spurious. I'll add you as the code owner, and add the docs for this feature, and then we can get it in.

@Jake-Shadle
Copy link
Collaborator

Yes, I just increased the timeout to get it to pass. Honestly I would like to find a path to getting all tests to work without any timeouts via some kind of sidechannel only enabled when running tests because the amount of just "random" timeouts throughout the tests just feels very fragile.

@XAMPPRocky
Copy link
Collaborator Author

because the amount of just "random" timeouts throughout the tests just feels very fragile.

Agreed, though it would require a lot of refactoring that I'm not sure is worth it right now. Since it would require making a lot of initialisation and watch change notification channels and making those available to tests in a way that's also not hacky.

@markmandel
Copy link
Contributor

Rather than sleeps, one pattern I've always appreciated in Go was:
https://pkg.go.dev/github.com/stretchr/testify/assert#Eventually

Which is essentially an assertion that polls as a set interval until a timeout - and I've found it super handy for test cases where things are eventually consistent (and don't require the aforementioned refactoring).

I've never managed to find an equivalent in Rust, short a loop with a sleep in it and a continue on the condition check? Maybe that's just the best way?

XAMPPRocky and others added 14 commits February 29, 2024 09:32
This commit adds the concept of "datacenter" to our xDS protocol. Unlike
other resources, there's slightly different behaviour between proxies
to relays and relays to agents. When agent queries for a datacenter from
the agent, it returns its own QCMP port and ICAO location. The host then
adds the address of remote agent to identify the datacenter. The proxy
then queries for datacenters, it gets all of agents addresses, ports,
and ICAO codes.

This will be used with a future commit to measure the distance between
proxies and agents.
This commit uses the previous commit of datacentre discovery with an
implementation of Phoenix Network Coordinates[1], to build a
decentralised network coordinate map of all the datacentres that the
proxies know about, using that information, we expose a new HTTP server
at the same QCMP port that a client can use to query what datacentres
are the closest to the proxy.

For identifying datacentres we use ICAO codes, this allows us to
colnsildate multiple logical datacentres into a single location, for
example if you have multiple zones allocated in the same physical
datacentre, in order to reduce the amount of options a client has to
pick from in that case they're consolidated under their ICAO code.

This allows for clients to get their full latency to datacentres which
allows operators to a allocate datacentres dynamically, and in the
closest location available to players.

[1]: https://www.researchgate.net/publication/221198962_Phoenix_Towards_an_Accurate_Practical_and_Decentralized_Network_Coordinate_System
Copy link

This PR exceeds the recommended size of 1000 lines. Please make sure you are NOT addressing multiple issues with one PR. Note this PR might be rejected due to its size.

@quilkin-bot
Copy link
Collaborator

Build Failed 😭

Build Id: 1c410a77-71ec-41ad-b912-206d177686e5

Status: FAILURE

To get permission to view the Cloud Build view, join the quilkin-discuss Google Group.

Filter with the Git Commit 6b706e7 within us-docker.pkg.dev/quilkin/ci//quilkin to see if a development image is available from this build, and can be used for debugging purposes.

Development images are retained for at least 30 days.

@quilkin-bot
Copy link
Collaborator

Build Succeeded 🥳

Build Id: c6805cdf-9d46-46e0-8aef-44cb07288baa

The following development images have been built, and will exist for the next 30 days:

To build this version:

git fetch git@github.com:googleforgames/quilkin.git pull/854/head:pr_854 && git checkout pr_854
cargo build

@XAMPPRocky XAMPPRocky merged commit d22a444 into main Feb 29, 2024
5 checks passed
@markmandel markmandel deleted the ep/phoenix branch March 11, 2024 06:00
@markmandel markmandel added kind/feature New feature or request area/operations Installation, updating, metrics etc area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc labels Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/operations Installation, updating, metrics etc area/user-experience Pertaining to developers trying to use Quilkin, e.g. cli interface, configuration, etc kind/feature New feature or request size/xl
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants