-
Notifications
You must be signed in to change notification settings - Fork 831
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(scheduler): Controller to update the list of servers to scheduler on reconnect #5893
Conversation
replicas = 1 | ||
|
||
if grpcClient == nil { | ||
// we assume that all servers are in the same namespace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All servers associated with a given scheduler (connection) will be in the same namespace, right? As in, this is always a safe assumption to make under the current design, as far as I know. Useful to have the comment though in case we change things in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All servers associated with a given scheduler (connection) will be in the same namespace
yes
noSchedulerState bool | ||
} | ||
|
||
// note expected state is derived in the test, maybe we should be explictl about it in the future |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say this is ok. Having the expected state derived in the test here shows the constraint explicitly (correspondence between number of model replicas). Having the expected state declared would show the state explicitly but would "hide" the nature of the constraint -- which will be implemented manually and could be error prone.
I thus argue that in this case it might be better to have the expected state derived within the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
What this PR does / why we need it:
This change introduces the ability to send a list of servers from k8s
etcd
to the scheduler on connection. This is required to update the scheduler with the state of the work with regards to servers, which include also k8s metadata that is required for various logic in the scheduler.In this change we also allow
ServerNotify
to send the list of servers in one go, which will be required in subsequent changes that will follow up from this PR. In any case it is more efficient.Which issue(s) this PR fixes:
Fixes: