refactor: upgrade code refactor part 1 #287
Conversation
7ae3dba to 2849aea
// getClusterNodeVersion returns a node's "orchestrator:version" via Kubernetes API or VM tag.
func (uc *UpgradeCluster) getClusterNodeVersion(client armhelpers.KubernetesClient, name string, tags map[string]*string) string {
// getNodeVersion returns a node's current Kubernetes version via Kubernetes API or VM tag.
func (uc *UpgradeCluster) getNodeVersion(client armhelpers.KubernetesClient, name string, tags map[string]*string) string {
Can we reverse the precedence order here? Shouldn't we prefer to check the k8s API first, and then fall back to the tag?
This is the same behavior that was implemented in @mboersma's PR to stop requiring tags (Azure/acs-engine#4100), so I'll defer to him for choosing this approach over the other.
I think the idea was that we maintain existing behavior to limit risk and only use the alternative method if we can't find the tags.
If we do decide we want to reverse the order, I would argue that's a behavior change and it doesn't belong in this refactor PR (which is already dangerous enough as is).
I agree with Cécile's comment here. I think it's safer to leave it alone.
This way the existing behavior is still the default (use tags), and the k8s API is explicitly a fallback strategy. At the time, that was deemed to be less disruptive, but it has proved to be useful in failed upgrade cases: setting the tags is a safe way to "trick" the upgrade behavior to run again.
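To make the precedence being discussed concrete, here is a minimal sketch of the tag-first lookup with the k8s API as fallback. The nodeGetter interface and the "orchestrator" tag key are simplified assumptions for illustration, not the exact aks-engine types:

```go
package main

import (
	"fmt"
	"strings"
)

// nodeGetter is a stand-in for the Kubernetes client; only the piece this sketch needs is modeled.
type nodeGetter interface {
	GetNode(name string) (kubeletVersion string, err error)
}

// nodeVersion keeps the precedence discussed above: prefer the VM tag (existing
// behavior) and only fall back to the Kubernetes API when the tag is missing.
func nodeVersion(client nodeGetter, name string, tags map[string]*string) string {
	if v, ok := tags["orchestrator"]; ok && v != nil && *v != "" {
		return *v // e.g. "Kubernetes:1.9.7"
	}
	if client != nil {
		if kubeletVersion, err := client.GetNode(name); err == nil {
			return "Kubernetes:" + strings.TrimPrefix(kubeletVersion, "v")
		}
	}
	return ""
}

func main() {
	v := "Kubernetes:1.9.7"
	fmt.Println(nodeVersion(nil, "k8s-agentpool1-12345678-0", map[string]*string{"orchestrator": &v}))
}
```

Setting the tag is what makes the "trick the upgrade to run again" recovery path possible; the API fallback only kicks in when that tag is absent.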
Is this an appropriate PR to include consideration of the following changes:
Use cases:
@jackfrancis short answer is no. This is a refactor PR and the first of many. I tried to break down the changes so that they are easier to review/test. Adding new features is out of the scope of this PR. We can definitely keep that in the backlog as a future addition though.
To clarify for reviewers: there are many more things I want to change/improve in the upgrade code area, this is only the beginning. I am breaking down the changes into smaller units of code changes to make it more manageable to review and limit risk of breakage, since upgrade is a complicated/sensitive part of the code. Trying to change that, one step at a time :)
uc.agentPoolsToUpgrade = []string{}
log.Infoln(fmt.Sprintf("Gathering agent pool names..."))
uc.agentPoolsToUpgrade = make(map[string]bool)
uc.agentPoolsToUpgrade[kubernetesupgrade.MasterPoolName] = true
Is this additive? I don't see us adding the master in the previous agentPoolsToUpgrade string slice pattern.
Maybe I should rename this to poolsToUpgrade - the agentPoolsToUpgrade name doesn't really make sense.
for _, poolName := range agentPoolsToUpgrade {
    uc.AgentPoolsToUpgrade[poolName] = true
}
uc.AgentPoolsToUpgrade[MasterPoolName] = true
I think I see now what you did above.
Yeah, I just copied the existing behavior, but I don't really understand what AgentPoolsToUpgrade is used for yet.
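To spell out the behavior being copied, here is a small sketch of the new map-based set, assuming MasterPoolName is the constant shown in the diff (its literal value here is a guess):

```go
package main

import "fmt"

// MasterPoolName mirrors the constant referenced in the diff; the literal value is an assumption.
const MasterPoolName = "master"

// poolsToUpgrade builds the map-based set: every agent pool name plus the master
// pool, which the previous []string bookkeeping never recorded explicitly.
func poolsToUpgrade(agentPoolNames []string) map[string]bool {
	pools := make(map[string]bool)
	pools[MasterPoolName] = true
	for _, name := range agentPoolNames {
		pools[name] = true
	}
	return pools
}

func main() {
	fmt.Println(poolsToUpgrade([]string{"agentpool1", "agentpool2"}))
}
```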
Scenarios tested so far: ✅ Windows 1.8 cluster with VMAS
if client != nil {
    node, err := client.GetNode(name)
    if err == nil {
        return api.Kubernetes + ":" + strings.TrimPrefix(node.Status.NodeInfo.KubeletVersion, "v")
Here we return just the k8s version instead of kubernetes:v1.x.x - since this is in the kubernetesupgrade package, we only care about the version, not the orchestrator.
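A rough illustration of the point, not the PR's actual code: once the orchestrator prefix is dropped, the node's kubelet version can be compared directly against the target OrchestratorVersion.

```go
package main

import (
	"fmt"
	"strings"
)

// needsUpgrade shows why the bare version is enough inside kubernetesupgrade:
// upgrade decisions compare the node's kubelet version against the target
// OrchestratorVersion. Plain string comparison is a simplification; real code
// would parse the semver.
func needsUpgrade(kubeletVersion, targetVersion string) bool {
	current := strings.TrimPrefix(kubeletVersion, "v")
	return current != targetVersion
}

func main() {
	fmt.Println(needsUpgrade("v1.8.4", "1.9.7")) // true
	fmt.Println(needsUpgrade("v1.9.7", "1.9.7")) // false
}
```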
return uc.Translator.Errorf("Error while querying ARM for resources: %+v", err)
}
var upgrader UpgradeWorkFlow
upgradeVersion := uc.DataModel.Properties.OrchestratorProfile.OrchestratorVersion
uc.Logger.Infof("Upgrading to Kubernetes version %s\n", upgradeVersion)
switch {
case strings.HasPrefix(upgradeVersion, "1.6."):
if anyone knows a good reason why these separate upgraders are still needed, please speak up
👍 As far as I can tell, these structs provide no utility over the Upgrader base.
Maybe someone was anticipating we would need different behavior for each version upgrade? (Don't do that in advance because YAGNI.)
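If the version-specific upgraders really do add nothing, the switch could collapse to a single constructor. A sketch of that idea with simplified stand-in types (this is a suggestion, not what the PR does):

```go
package main

import "fmt"

// UpgradeWorkFlow and Upgrader are simplified stand-ins for the real interfaces.
type UpgradeWorkFlow interface {
	RunUpgrade() error
}

type Upgrader struct {
	targetVersion string
}

func (u *Upgrader) RunUpgrade() error {
	fmt.Println("upgrading cluster to", u.targetVersion)
	return nil
}

// newUpgrader replaces the strings.HasPrefix(upgradeVersion, "1.6.")-style switch:
// if the per-version structs add nothing, every supported version shares the same base workflow.
func newUpgrader(targetVersion string) UpgradeWorkFlow {
	return &Upgrader{targetVersion: targetVersion}
}

func main() {
	_ = newUpgrader("1.9.7").RunUpgrade()
}
```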
VMSS upgrade is failing; #298 might help.
Codecov Report
@@ Coverage Diff @@
## master #287 +/- ##
=========================================
+ Coverage 53.17% 53.2% +0.02%
=========================================
Files 95 95
Lines 14247 14216 -31
=========================================
- Hits 7576 7563 -13
+ Misses 6006 5988 -18
Partials 665 665
@@ -204,41 +164,37 @@ func (uc *UpgradeCluster) getClusterNodeStatus(subscriptionID uuid.UUID, az armh
}

for _, vm := range vmListPage.Values() {
    vmOrchestratorTypeAndVersion := uc.getClusterNodeVersion(kubeClient, *vm.Name, vm.Tags)
    if vmOrchestratorTypeAndVersion == "" {
        // Windows VMs contain a substring of the name suffix
What are the Windows-specific implications of the line(s) below?
strings.Contains(*(vm.Name), uc.NameSuffix[:4]+"k8s") is checking for VM naming specific to Windows instead of looking for the whole name suffix.
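For context, a minimal sketch of what that check does. The VM names here are made up, and the 4-character slice matches the snippet above:

```go
package main

import (
	"fmt"
	"strings"
)

// isWindowsAgentName mirrors the check being discussed: Windows VM names can only
// embed the first few characters of the cluster name suffix plus "k8s", while Linux
// agent names carry the whole suffix. The example names below are illustrative.
func isWindowsAgentName(vmName, nameSuffix string) bool {
	return strings.Contains(vmName, nameSuffix[:4]+"k8s")
}

func main() {
	fmt.Println(isWindowsAgentName("1234k8s9000", "12345678"))               // true
	fmt.Println(isWindowsAgentName("k8s-agentpool1-12345678-0", "12345678")) // false
}
```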
@@ -394,28 +345,3 @@ func (uc *UpgradeCluster) addVMToAgentPool(vm compute.VirtualMachine, isUpgradab

return nil
}
I actually kinda think we should keep this in, uncomment it, and standardize to persisting the template that we deploy to upgrade always.
But perhaps for a later exercise.
We already have logic for saving the modified template in upgrade.go (https://github.com/CecileRobertMichon/aks-engine/blob/0b7a9702300ed5d033874a6ced2d2a140867fe18/cmd/upgrade.go#L281), so this commented-out code is redundant.
I think we need to improve the logic for persisting templates and, instead of writing them to a file, save them in the RG or something like that. But, like you said, that's for a later exercise.
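As a rough sketch of the "persist the template" direction only (file names, the output path, and the helper are assumptions, not what cmd/upgrade.go actually does):

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
)

// saveUpgradeTemplate writes the generated ARM template and parameters under
// outputDir so a failed upgrade can be inspected or re-run later.
func saveUpgradeTemplate(outputDir string, templateJSON, parametersJSON []byte) error {
	if err := os.MkdirAll(outputDir, 0755); err != nil {
		return err
	}
	if err := os.WriteFile(filepath.Join(outputDir, "azuredeploy.json"), templateJSON, 0644); err != nil {
		return err
	}
	return os.WriteFile(filepath.Join(outputDir, "azuredeploy.parameters.json"), parametersJSON, 0644)
}

func main() {
	if err := saveUpgradeTemplate("_output/upgrade", []byte(`{}`), []byte(`{}`)); err != nil {
		log.Fatal(err)
	}
	fmt.Println("upgrade template persisted")
}
```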
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED.
This pull-request has been approved by: CecileRobertMichon, jackfrancis.
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
refactor: some more upgradeCluster() refactoring
refactor: getClusterNodeStatus
fix: s/nodes/VMs
fix: s/subset/substring
fix: ClusterTopology is in kubernetesupgrade package
refactor: simplify upgradable() func
refactor: move upgradecluster to instance literal
Revert "refactor: move upgradecluster to instance literal" (This reverts commit ac6aacd.)
refactor: fix upgradeCluster parameters in unit tests
fix import
remove import cycle
fix unit test error string
14519b2 to 5795eea
New changes are detected. LGTM label has been removed.
Reason for Change:
This is an initial refactor of the upgrade code. More to follow.
Issue Fixed:
Requirements:
Notes:
TODO: