[WIP] Add CAPI kubetest2 deployer #4041
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
happy to see this pr continuing. i am going to keep testing but i do still have a few open questions. see #3855 (review)
This should address some of #2826
As an example, to deploy with CAPD:

```
export KIND_EXPERIMENTAL_DOCKER_NETWORK=bridge
```
I think this can go away now that CAPD works on the kind network (it requires kind >= v0.8)
+1, i found i wasn't needing this env variable while testing
if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
	return err
}
args = []string{"--kubeconfig", kubeconfig, "wait", "--for=condition=Available", "--all", "--all-namespaces", "deployment", "--timeout=-1m"}
This seems a little bit fragile because it assumes the CNI is using deployments. What about waiting for the nodes to be ready instead?
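For illustration, here is a minimal sketch of what waiting on node readiness could look like, reusing the `kubeconfig`/`process.ExecJUnitContext` pattern from the surrounding deployer code; the 10-minute timeout is an arbitrary placeholder:

```go
// Sketch only: wait for every node in the workload cluster to become Ready,
// instead of assuming the CNI ships as a Deployment.
args = []string{
	"--kubeconfig", kubeconfig,
	"wait", "--for=condition=Ready", "node", "--all",
	"--timeout=10m",
}
if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
	return err
}
```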
}

println("Up: waiting for cluster to become ready\n")
args = []string{"wait", "--for=condition=Ready", "cluster/" + d.workloadClusterName, "--timeout=-1m"}
If the intention here is to check that the cluster is fully provisioned, we should also check that the machines are provisioned. This is tricky because it usually means waiting for the control plane provider to be ready and for the higher-level abstractions (MachineDeployments/MachinePools) to be ready.
A possible way forward might be to wait for the expected number of machines to be ready...
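As a rough sketch of that idea, the deployer could follow the `cluster/...` wait with a wait on the CAPI Machine objects backing it, selected via the standard `cluster.x-k8s.io/cluster-name` label; the 30-minute timeout is an assumption:

```go
// Sketch only: after the Cluster reports Ready, also wait for the Machines
// that belong to it (control plane and workers) to report Ready.
args = []string{
	"wait", "--for=condition=Ready", "machines",
	"--selector", "cluster.x-k8s.io/cluster-name=" + d.workloadClusterName,
	"--timeout=30m",
}
if err := process.ExecJUnitContext(ctx, "kubectl", args, os.Environ()); err != nil {
	return err
}
```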
args = []string{
	"config",
	"cluster", d.workloadClusterName,
	"--infrastructure", d.provider,
	"--kubernetes-version", d.kubernetesVersion,
	"--worker-machine-count", d.workerCount,
	"--control-plane-machine-count", d.controlPlaneCount,
	"--flavor", d.flavor,
}
Just for confirmation: this syntax assumes that all the flavors required for E2E should be part of the CAP* release artifact. Is this the desired behavior? (I'm wondering if this is an excessive burden for the CAP* maintainers, and also a limitation for the end users.)

	"--flavor", d.flavor,
}

clusterctl := exec.CommandContext(ctx, "clusterctl", args...)
Wondering if instead of shelling out we can use clusterctl as a library (directly or via the Cluster API E2E test framework) so we can have more control over this integration.
It would also remove the dependency on the clusterctl binary, which simplifies the test setup
}

func (d *deployer) DumpClusterLogs() error {
	panic("Not implemented")
It would be great if we could leverage the Cluster API E2E test framework to solve this problem...
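Until that integration exists, a minimal stopgap sketch could shell out to `kubectl cluster-info dump`; note that `d.kubeconfig` and `d.logsDir` below are hypothetical field names for the workload kubeconfig path and the kubetest2 artifacts directory:

```go
// Sketch only: dump nodes, pods, events, and container logs to the artifacts
// directory until Cluster API E2E framework log collection is wired in.
func (d *deployer) DumpClusterLogs() error {
	args := []string{
		"--kubeconfig", d.kubeconfig, // hypothetical field: workload cluster kubeconfig path
		"cluster-info", "dump",
		"--all-namespaces",
		"--output-directory", d.logsDir, // hypothetical field: kubetest2 artifacts dir
	}
	return process.ExecJUnitContext(context.Background(), "kubectl", args, os.Environ())
}
```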
// helper used to create & bind a flagset to the deployer
func bindFlags(d *deployer, flags *pflag.FlagSet) {
	flags.StringVar(
		&d.provider, "provider", "", "--provider flag for clusterctl",
Suggested change:
- &d.provider, "provider", "", "--provider flag for clusterctl",
+ &d.provider, "provider", "", "--infrastructure flag for clusterctl",
What about renaming the flag to `infrastructure-provider`, `cloud-provider`, or something more intuitive for people not aware of the Cluster API internals/terminology?
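If the rename lands, the binding could look roughly like this; the flag name and help text are only suggestions, not code from the PR:

```go
flags.StringVar(
	&d.provider, "infrastructure-provider", "",
	"name of the Cluster API infrastructure provider, passed to clusterctl as --infrastructure",
)
```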
		&d.workerCount, "worker-machine-count", "1", "--worker-machine-count flag for clusterctl",
	)
	flags.StringVar(
		&d.flavor, "flavor", "", "--flavor flag for clusterctl",
What about renaming the flag to `cluster-template`, `cluster-spec`, or something more intuitive for people not aware of the Cluster API internals/terminology?
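Analogously, a renamed flavor flag might be bound like this (again, name and help text are only suggestions):

```go
flags.StringVar(
	&d.flavor, "cluster-template", "",
	"cluster template (clusterctl flavor) to use, passed to clusterctl as --flavor",
)
```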
		&d.cniManifest, "cni-manifest", "", "automatically install this CNI manifest when the cluster becomes available",
	)
	flags.StringVar(
		&d.workloadClusterName, "workload-cluster-name", "capi-workload-cluster", "the workload cluster name",
Should we drop the `workload-` prefix, which is part of the CAPI jargon? Also, the default could be adapted to reflect the purpose of the cluster instead of focusing on the provisioning method (test something).
--image-name "kindest/node:$KUBE_GIT_VERSION" \
--config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /var/run/docker.sock
    containerPath: /var/run/docker.sock
I'm wondering if we can use the Cluster API E2E test framework instead of "sigs.k8s.io/kubetest2/kubetest2-kind/deployer", so we can hide all those details from the end-users.
The tricky part will be how to support `--build` with the different infrastructure providers...
/milestone v0.4.0
- cluster-api-admins
- cluster-api-maintainers
- benmoss
- elmiko
- randomvariable
Is this list up to date?
i feel like @benmoss may not want to stay on that list. i'm happy to be an approver, but i wonder if we don't have some other folks who would like to get more involved as well?
@randomvariable What's the status of this PR?
@@ -0,0 +1,26 @@
/*
Copyright 2019 The Kubernetes Authors.
Suggested change:
- Copyright 2019 The Kubernetes Authors.
+ Copyright 2021 The Kubernetes Authors.
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
Stale issues rot after 30d of inactivity. If this issue is safe to close now please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
Rotten issues close after 30d of inactivity. Send feedback to sig-contributor-experience at kubernetes/community.
@fejta-bot: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
What this PR does / why we need it:
Follow up from #3855 to bring this to completion.
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged): Fixes #4040