This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Simple toggle to stop the automatic assignment of public IPs for all the nodes. #4

Merged

Conversation

pieterlange
Contributor

This toggle still requires the operator to bootstrap the VPC with a NAT gateway, but I think it's fair to expect the operator to do this manually or with external tooling.

I might send a subsequent PR to add functionality to create a NAT gateway from kube-aws. Gateway initialization takes several (sometimes 5+) minutes, though, so I'm slightly worried that cluster bootstrap might be affected.

Simple toggle to stop the automatic assignment of public IPs for all the nodes.

This toggle requires the operator to bootstrap the VPC with a NAT gateway.
@mumoshu
Contributor

mumoshu commented Oct 28, 2016

LGTM. mapPublicIps: false is something I'd generally like to encourage everyone to use for security.

Btw, one gotcha I'm aware of with NAT gateways is that each gateway is limited to 10 Gbps (http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-nat-gateway.html#nat-gateway-basics).
To scale beyond that, you need to create one gateway per subnet, for a total bandwidth of 10 Gbps times the number of subnets.
If anyone is worried about this, any suggestion or pull request for documentation or anything else is welcome.
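
For readers who want to see the toggle in context, here is a minimal cluster.yaml sketch assuming the mapPublicIps key named above; the subnet layout and CIDR values are illustrative placeholders rather than anything taken from this PR.

```yaml
# Minimal sketch of the relevant cluster.yaml settings (placeholder values).
# With mapPublicIps set to false, nodes are launched without public IPs,
# so outbound traffic must go through a NAT gateway that the operator
# provisions outside of kube-aws.
mapPublicIps: false

# Illustrative subnet layout; each subnet's route table must already
# send 0.0.0.0/0 to a NAT gateway.
subnets:
  - availabilityZone: us-west-1a
    instanceCIDR: "10.0.1.0/24"
  - availabilityZone: us-west-1b
    instanceCIDR: "10.0.2.0/24"
```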

@mumoshu
Contributor

mumoshu commented Oct 28, 2016

Going to merge. Thanks for your contribution, @pieterlange!

@mumoshu mumoshu merged commit b2f425d into kubernetes-retired:master Oct 28, 2016
@pieterlange
Contributor Author

There's more than one gotcha here, which is why I left the actual NAT gateway creation up to the operator. :)

A NAT gateway also lives in only one availability zone, so the operator has to deal with that as well. In general, I think that if you run into these kinds of problems (including the 10 Gbps/gateway limit), your organization should have the resources to help fix them. ;-)

When node pool functionality arrives, I think it will be worth adding NAT gateway creation to kube-aws. Thanks for the merge!
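
As a rough illustration of the manual bootstrap discussed in this thread, the sketch below provisions one NAT gateway per availability zone with CloudFormation, which addresses both the single-AZ failure domain and the 10 Gbps/gateway limit; the resource names, subnet ID, and route table ID are placeholders, and none of this is generated by kube-aws itself.

```yaml
# Sketch: one NAT gateway per AZ so that each zone's private subnets have
# their own egress path. All IDs below are placeholders.
Resources:
  NatEipZoneA:
    Type: AWS::EC2::EIP
    Properties:
      Domain: vpc
  NatGatewayZoneA:
    Type: AWS::EC2::NatGateway
    Properties:
      AllocationId: !GetAtt NatEipZoneA.AllocationId
      SubnetId: subnet-aaaaaaaa          # a public subnet in zone A
  PrivateRouteZoneA:
    Type: AWS::EC2::Route
    Properties:
      RouteTableId: rtb-aaaaaaaa         # route table of zone A's private subnets
      DestinationCidrBlock: 0.0.0.0/0
      NatGatewayId: !Ref NatGatewayZoneA
  # Repeat the three resources above for each additional zone (B, C, ...).
```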

@pieterlange pieterlange deleted the feature/dont-map-public-ips branch November 7, 2016 13:21
tyrannasaurusbanks pushed a commit to tyrannasaurusbanks/kube-aws that referenced this pull request Apr 19, 2017
…update-to-latest-kube-aws-master to hcom-flavour

* commit '175217133f75b3c251536bc0d51ccafd2b1a5de4':
  Fix the deadlock while bootstrapping the etcd cluster when the wait signal is enabled. Resolves kubernetes-retired#525
  Fix elasticFileSystemId to be propagated to node pools Resolves kubernetes-retired#487
  'Cluster-dump' feature to export Kubernetes Resources to S3
  Follow-up for the multi API endpoints support. This fixes the issue that prevented a k8s cluster from being properly configured when multiple API endpoints are defined in cluster.yaml.
  Fix incorrect validations on apiEndpoints Ref kubernetes-retired#520 (comment)
  Wait until kube-system becomes ready Resolves kubernetes-retired#467
mumoshu pushed a commit that referenced this pull request Jun 20, 2017
The motivation is to avoid a serious memory leak on etcd nodes, as in
* coreos/bugs#1927
mumoshu added a commit that referenced this pull request Jun 20, 2017
…ocker

Use docker instead of rkt for regular etcdadm tasks (#4)
camilb added a commit to camilb/kube-aws that referenced this pull request Jun 22, 2017
* kubernetes-incubator/master:
  Removed unused sysctl override
  Fix node drain error when trying to evict pods from jobs
  Use docker instead of rkt for regular etcdadm tasks (kubernetes-retired#4)
kylehodgetts referenced this pull request in HotelsDotCom/kube-aws Mar 27, 2018
The motivation is to avoid a serious memory leak on etcd nodes, as in
* coreos/bugs#1927
kylehodgetts referenced this pull request in HotelsDotCom/kube-aws Mar 27, 2018
…m/etcdadm-rkt-to-docker

Use docker instead of rkt for regular etcdadm tasks (#4)
dominicgunn added a commit that referenced this pull request Sep 16, 2020
* KIAM updates to support assumeRoleArn functionality

* Add compute.internal to the etcd SAN when using private zones, because the AWS controller does not support private zones

* Fix issue with node names in the clusters

* Fix tests

* Whitespace.

* Forced rebuild.

* Update cloud-config-controller

* Update cloud-config-controller

* Update test

* Remove verbose json output.

* Allow dnsmasq to be backed by a local copy of CoreDNS

This commit allows the user to specify that dnsmasq should be
backed by a pod-local copy of CoreDNS rather than relying on
the global CoreDNS service. If enabled, the dnsmasq-node
DaemonSet will be configured to use a local copy of CoreDNS
for its resolution while setting the global CoreDNS service as
a fallback. This is handy in situations where the number of DNS
requests within a cluster grows large and causes resolution issues
as dnsmasq reaches out to the global CoreDNS service.

Additionally, several values passed to dnsmasq are now configurable
including its `--cache-size` and `--dns-forward-max`.

See [this postmortem](https://github.com/zalando-incubator/kubernetes-on-aws/blob/dev/docs/postmortems/jan-2019-dns-outage.md)
for an investigation into this situation which was instrumental in
understanding issues we were facing. Many thanks to dominicgunn
for providing the manifests which I codified into this commit.

---

These features can be enabled and tuned by setting the following
values within cluster.yaml:

```yaml
kubeDns:
  dnsmasq:
    coreDNSLocal:
      # When enabled, this will run a copy of CoreDNS within each DNS-masq pod and
      # configure the utility to use it for resolution.
      enabled: true

      # Defines the resource requests/limits for the coredns-local container.
      # cpu and/or memory constraints can be removed by setting the appropriate value(s)
      # to an empty string.
      resources:
        requests:
          cpu: 50m
          memory: 100Mi
        limits:
          cpu: 50m
          memory: 100Mi

    # The size of dnsmasq's cache.
    cacheSize: 50000

    # The maximum number of concurrent DNS queries.
    dnsForwardMax: 500

    # This option gives a default value for time-to-live (in seconds) which dnsmasq
    # uses to cache negative replies even in the absence of an SOA record.
    negTTL: 60
```
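
To make the mapping from these settings to the dnsmasq invocation concrete, here is a hedged sketch of what the resulting dnsmasq-node container spec might look like. Only `--cache-size` and `--dns-forward-max` are named in this commit message; the container images, the local CoreDNS port (10053), the `--neg-ttl` and `--strict-order` flags, and the global service IP are assumptions for illustration only.

```yaml
# Hypothetical excerpt of the dnsmasq-node DaemonSet pod spec. dnsmasq is
# pointed at the pod-local CoreDNS copy first, with the global CoreDNS
# service as a fallback; images, port, and service IP are placeholders.
containers:
  - name: dnsmasq
    image: example.com/dnsmasq:latest   # placeholder image
    args:
      - --cache-size=50000              # from kubeDns.dnsmasq.cacheSize
      - --dns-forward-max=500           # from kubeDns.dnsmasq.dnsForwardMax
      - --neg-ttl=60                    # assumed mapping from negTTL
      - --strict-order                  # assumption: query upstreams in listed order
      - --server=127.0.0.1#10053        # pod-local CoreDNS copy
      - --server=10.3.0.10              # global CoreDNS service ClusterIP (fallback)
  - name: coredns-local
    image: coredns/coredns:1.6.9        # placeholder version
    args: ["-conf", "/etc/coredns/Corefile"]
    resources:
      requests: { cpu: 50m, memory: 100Mi }
      limits: { cpu: 50m, memory: 100Mi }
```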

* Always create required dnsmasq resources

The dnsmasq-node ServiceAccount must exist whether or not CoreDNS-local
has been enabled. Therefore, it is created alongside the DaemonSet rather
than as part of the coredns-local manifest.

Additionally, always create dnsmasq-node-coredns-local.yaml. If this file
does not exist (as would be the case if the CoreDNS-local feature has
not been enabled), controller nodes will fail to come up with the error:
> error: the path "/srv/kubernetes/manifests/dnsmasq-node-coredns-local.yaml" does not exist
This error occurs when `kubectl delete` is called against the file because
of the line `remove "${mfdir}/dnsmasq-node-coredns-local.yaml"`.

This manifest must always be generated because, otherwise, the CoreDNS-local
feature could not be enabled and then later disabled without manual
operator intervention.

Co-authored-by: Dominic Gunn <dominic@fable.sh>
Co-authored-by: Dominic Gunn <4493719+dominicgunn@users.noreply.github.com>
Co-authored-by: Kevin Richardson <kevin@kevinrichardson.co>