-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Improve managed node group bootstrap #1433
feat: Improve managed node group bootstrap #1433
Conversation
@stevehipwell I've never tried it this way, but it seems way cleaner :) Did you manage to test the extra args also (like taint etc) ? |
@ArchiFleKs I think native taint support has been merged ready to be released, which other arguments were you thinking of? I've tested an additional |
Yes you're right. Nice. |
@barryib could I get some feedback on this? |
@barryib could you take a look a this? |
@barryib I've added support for |
Any chance of having this release soon ? |
@ArchiFleKs I'll rebase this shortly. |
this looks be a breaking change as user_data is changing Probably we will need consolidate a breaking changes PRs and make some major release as there is currently a few in queue |
@daroga0002 I'm not sure about where you're seeing the breaking change? I'd classify this as a feature update as the public API is being added to but no existing behaviour will change. |
maybe I put this in bad words, behaviour will not change so it will not break, but it will cause that everybody using module and updating it will be impacted as launch templates will be updated. Does you maybe tested this how it will impact cluster created before this change and running apply after those changes? Does node groups will be recreated or this will just silently update a launch template? |
@daroga0002 the launch templates will change, but that's not a breaking change in relation to SemVer and the module. I'd expect the launch templates to change on a major or minor module upgrade and if I didn't want additional churn I'd just delay taking the changes until the AMI was updated. |
This PR is very useful to use new max pod feature in VPC CNI 1.9 |
@ArchiFleKs that's one of the things I'm already using this pattern for, but it's not only useful for the new prefix behaviour but also the existing custom networking. |
Been using it for a couple of weeks with a fork on ARM micro with new network prefix and containerd, working like a charm, would love to see this merged |
@@ -9,8 +9,10 @@ data "cloudinit_config" "workers_userdata" { | |||
content_type = "text/x-shellscript" | |||
content = templatefile("${path.module}/templates/userdata.sh.tpl", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this userdata keeps coming up often, maybe its time to stop adding more options here and just have a generic "bring your own userdata script" option. so either you can use the standard userdata script provided here which only supports pre_userdata
and kubelet_extra_args
, or you can provide your own userdata script that it will use instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daroga0002 thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exactly was thinking about same, we have already few PRs which are changing/modifying. I think maybe we should:
- make some basic template
- allow overwrite whole user_data (leaving templating to users outside module)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryantbiggs & @daroga0002 I disagree, this PR solves a lot of the need for regular changes. It allows important parameters to be captured as strongly typed variables (container_runtime
, use_max_pods
&kubelet_extra_args
) and enables any additional userdata via pre_userdata
which can now also interact with the AMI boostrap.sh script by adding variables to the new /etc/profile.d/bootstrap.sh file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stevehipwell it solves the here and now for yourself and maybe a handful of others, but it doesn't solve for tomorrow when someone else wants to add a new piece of functionality, and the next user, and the next ... etc.
this approach is common amongst our modules - we try to provide a good, default "out of box" experience that will hopefully cover the 80% of use cases, but the number of possible combinations and customizations can explode very quickly to meet individual use cases. we see this most often with IAM roles and policies and we have used this pattern successfully - provide a baseline default IAM role/policy that makes provisioning the module easy and seamless, but also provide the ability to override the role/policy that the module uses with one provided by the user of the module
we use this pattern in this module already as well:
https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/local.tf#L3-L7
also, we are looking for ways to reduce the complexity of this module and abstract away some of these niche configurations. if we can do this, it allows for more focus on larger initiatives within the project and more agility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryantbiggs the current design needs the user data to be templated to support some very common use cases and this PR extends this capability to some additional very common use cases. If you wish to change this you'd need to release a breaking change and significantly reduce capability for the 80% use case.
@daroga0002 variables need to be persisted or else they don't work between init steps, this is why I've designed the user data in this way for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bryantbiggs @daroga0002 the pattern in this PR simplifies both the implementation and maintenance of customising /etc/eks/bootstrap.sh. It is now easy to add new managed values, such as I've done here for container_runtime
and use_max_pods
, by updating the initial creation of /etc/profile.d/eks-bootstrap.sh. It's also easy for end users to set additional environment variables by appending to this file.
As there is nothing happening besides linking to /etc/profile.d/eks-bootstrap.sh from /etc/eks/bootstrap.sh before the pre_userdata
is templated there is no reason to add a second variable. A power user could always clear /etc/profile.d/eks-bootstrap.sh from pre_userdata
so there is no reason to even need to provide a userdata override variable.
An example of what this PR offers end-users can be seen by the following snippet setting an environment variable in /etc/eks/bootstrap.sh.
node_group = {
pre_userdata = <<-EOT
printf 'PAUSE_CONTAINER_VERSION="%s"
' "3.1-eksbuild.1" >> /etc/profile.d/eks-bootstrap.sh
EOT
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like sourcing source /etc/profile.d/eks-bootstrap.sh
into bootstrap, but still I dont think that we dont need a seperate variables to customize bootstrap script.
Currently we have https://github.com/awslabs/amazon-eks-ami/blob/548b1650e63f3ab582d79418c7b3336943642fd1/files/bootstrap.sh#L108-L118
so I am still pushing to make this more generic and allow to push any variable into bootstrap than create variables in TF code for each of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daroga0002 we already have a strongly typed variable for KUBELET_EXTRA_ARGS
so for consistency the approach in this PR makes sense as the alternative would be to document multiple ways of setting KUBELET_EXTRA_ARGS
; which after further thought I could do pretty easily. If we're happy with a breaking change then it's even simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@daroga0002 I've created an alternate PR #1577 to use a more generic pattern. The new PR still needs testing but I'm happy to close this PR in favour of the new one if that's the agreed direction.
@bryantbiggs @daroga0002 @antonbabenko could you take another look at this and the comments above? I really think that this PR makes the managed node group module more generic while supporting important parameters in a strongly typed way. |
Closed in favour of #1577. |
I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further. |
PR o'clock
Description
This PR could be replaced by #1577.
The managed node group bootstrap pattern has been modified to use the environment variables already defined in the
bootstrap.sh
script sourced from a new/etc/profile.d/bootstrap.sh
file. The script still needs to be modified to usebootstrap.sh
but this pattern can support additional bootstrap options not currently directly supported and other patterns (such as dynamic labels from instance metadata).I've enabled support for managed node group
--container-runtime
viacontainer_runtime
and--use-max-pods
viause_max_pods
(use max pods can already be set viaextra_bootstrap_args
for un-managed node groups).Checklist