Add environment option to disable on-node SNAT #81

Merged: 1 commit into aws:master on Jun 28, 2018

Conversation

@robbrockbank (Contributor)

Issue #53

Description of changes:
This is a partial, possible solution for issue #53. It adds an environment option, AWS_VPC_K8S_CNI_EXTERNALSNAT, to indicate that an explicitly configured NAT gateway will be used to provide SNAT functionality for the secondary ENIs. Setting this to true removes the iptables SNAT rule and the ip rule for off-VPC routing. I've tested this live, setting/unsetting/setting this option, and the relevant rules are added and deleted as necessary. I haven't yet updated the tests, but can do that if the approach is sensible.
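(For reference, a minimal sketch of how such a flag might be read. The helper name useExternalSNAT matches the review hunk later in this thread, but the body below is an assumption rather than this PR's exact diff:)

```go
// Hedged sketch: reading an AWS_VPC_K8S_CNI_EXTERNALSNAT-style flag.
// When the variable is unset it falls back to today's on-node SNAT behavior.
package snatconfig

import (
	"os"
	"strconv"

	"github.com/pkg/errors"
)

const envExternalSNAT = "AWS_VPC_K8S_CNI_EXTERNALSNAT"

// useExternalSNAT returns true when SNAT is expected to be handled by an
// explicitly configured NAT gateway instead of an iptables rule on the node.
func useExternalSNAT() (bool, error) {
	v := os.Getenv(envExternalSNAT)
	if v == "" {
		return false, nil // unset: keep the existing on-node SNAT behavior
	}
	enabled, err := strconv.ParseBool(v)
	if err != nil {
		return false, errors.Wrapf(err, "failed to parse %s", envExternalSNAT)
	}
	return enabled, nil
}
```

With a flag like this, setting the variable to "true" on the aws-node daemonset would skip the on-node SNAT rule described above.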

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@liwenwu-amazon (Contributor)

@robbrockbank The change looks good to me. I have a few comments and questions:
Can you add more detail on how you tested it? For example:

  • what OS type, e.g. Ubuntu or Amazon Linux
  • what about upgrades? In other words, using kubectl apply -f <aws-k8s-cni.yaml> to upgrade/downgrade
  • some detail on how you verified it is OK, e.g. are you pinging from Pod to Pod?

@robbrockbank (Contributor, Author)

Hi @liwenwu-amazon,

My EKS cluster worker nodes are running Amazon Linux:

cat /etc/*-release
NAME="Amazon Linux"
VERSION="2 (2017.12)"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2 (2017.12) LTS Release Candidate"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
Amazon Linux release 2 (2017.12) LTS Release Candidate

I've upgraded and downgraded using kubectl to apply the original and the modified versions of the AWS node, and I see the appropriate iptables rule and ip rule appear (for the original image) and get removed (for the modified image).

I've verified that pod-to-pod intra-cluster pings work, and that (with the patch applied and the environment variable set) I can observe traffic between the EKS cluster and a cluster in a peered (CGW/VGW/VPN) network. I've also verified that I can access the internet from the EKS pods through an explicitly configured NAT gateway.

@liwenwu-amazon (Contributor)

@robbrockbank, we would like to include your PR in the next CNI release. Can you take a look at my comments?
thanks

@robbrockbank (Contributor, Author)

Hi @liwenwu-amazon, I thought I'd answered the questions you asked above. Am I missing something? If so, let me know and I'll address it. Thanks, Rob.

@liwenwu-amazon (Contributor)

@robbrockbank, can you look at my code review comments on network.go? thanks

@robbrockbank (Contributor, Author)

Hi @liwenwu-amazon, I can't see any comments in the review, nor in the history on this page. Have the review comments been submitted?

// TODO : implement ip rule not to 10.0.0.0/16(vpc'subnet) table main priority 1024
func (os *linuxNetwork) SetupHostNetwork(vpcCIDR *net.IPNet, primaryAddr *net.IP) error {

	externalSNAT, err := useExternalSNAT()
	if err != nil {

Contributor:

shall we default back to today's behavior on error?

		return errors.Wrapf(err, "host network setup: failed to add host rule")
	}
	// Only include the rule if SNAT is not being handled by an external NAT gateway and needs to be
	// handled on-node.
	if !externalSNAT {

Contributor:

I don't think you need to do this. Can you remove this check and see if your tests still pass? thanks

Contributor Author (@robbrockbank, Jun 28, 2018):

@liwenwu-amazon: You are correct, strictly speaking this isn't about using external SNAT.

The reason I added this is because it's required if you are trying to route from another VPC or across a VPN gateway. Without it you have the situation:

  • Route to pod from outside VPC (i.e. other VPC or VPN) may come in on eth1 depending on which secondary ENI interface is assigned to the pod
  • Route back again is forced over eth0 with this rule
  • This results in packet drop due to reverse path filtering.

Details are also in #53.

So I feel that it's all related, but perhaps the environment variable is incorrectly named. I think rather than making this option about SNAT it should maybe focus on the "Allow traffic between VPCs or VPNs" aspect, where there is an additional requirement of a NAT gateway if you want to enable this feature.

I couldn't come up with a nice succinct name for that though. Perhaps AWS_VPC_K8S_CNI_ALLOW_INTER_VPC?

WDYT?
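(For readers following the routing discussion above, a hedged netlink sketch of the off-VPC rule in question, i.e. "not to <vpc subnet> table main priority 1024" from the TODO in the hunk above. The function name and the use of rule inversion are assumptions, not this PR's diff:)

```go
// Hedged illustration of "ip rule add not to <vpcCIDR> table main pref 1024".
// Any destination outside the VPC CIDR is looked up in the main routing table,
// so replies always leave via eth0 even when the request arrived on eth1,
// which is what triggers the reverse-path-filtering drop described above.
package hostrules

import (
	"net"

	"github.com/vishvananda/netlink"
)

func addOffVPCRule(vpcCIDR *net.IPNet) error {
	rule := netlink.NewRule()
	rule.Dst = vpcCIDR
	rule.Invert = true // "not to" the VPC CIDR
	rule.Table = 254   // main routing table
	rule.Priority = 1024
	return netlink.RuleAdd(rule)
}
```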

Contributor Author:

Thinking about it a little more though - is the real issue that traffic from another VPC or from a VPN can arrive on eth1?

Contributor Author (@robbrockbank, Jun 28, 2018):

@liwenwu-amazon I've just been retesting on a new EKS cluster and I have a question that is related to all of this.

How do I specify which VPC subnet is selected for the pods? Does the subnet determine whether intra-VPC traffic is handled by eth0 or eth1?

When I previously tested, some of my pods were using eth1, but in my retests they seem to be using eth0. Do you know if anything has changed in this area since this PR was created, or is there a way to tell a deployment or pod to use eth1?

Contributor:

@robbrockbank right now all ENIs on a node use the same VPC subnet and security group as the node's primary ENI. In other words, eth0, eth1, etc. all use the same VPC subnet and security group. ipamd randomly chooses a secondary IP address from one of these ENIs (e.g. eth0, eth1, etc.) and assigns it to the Pod.

Contributor:

if a Pod is assigned a secondary IP address from eth1, and a packet with a destination IP of the Pod's address is received on the eth1 interface, it will get forwarded to the Pod using the main routing table (thanks to the toPodRulePriority 512 ip rule).
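(A hedged sketch of the toPodRulePriority 512 rule described in that comment; the constant and function names are illustrative, not the CNI's exact source:)

```go
// Hedged illustration of the per-pod rule "ip rule add to <podIP>/32 table main
// pref 512": with priority 512 it matches before the 1024 off-VPC rule, so a
// packet arriving on eth1 for a pod IP is still delivered via the main table.
package podrules

import (
	"net"

	"github.com/vishvananda/netlink"
)

const toPodRulePriority = 512

func addToPodRule(podIP net.IP) error {
	rule := netlink.NewRule()
	rule.Dst = &net.IPNet{IP: podIP, Mask: net.CIDRMask(32, 32)} // IPv4 /32 for the pod
	rule.Table = 254 // main routing table
	rule.Priority = toPodRulePriority
	return netlink.RuleAdd(rule)
}
```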

Contributor:

what about AWS_VPC_K8S_CNI_DISABLE_NAT ?

@liwenwu-amazon (Contributor) left a comment:

@robbrockbank can you squash the 2 commits into 1 commit for this PR?
Also, I think it might be cleaner to put the check in nodeInit(), which is similar to what @lbernail has done in his branch (https://github.com/lbernail/amazon-vpc-cni-k8s/tree/lbernail/disable-nat-rule). thanks

@robbrockbank (Contributor, Author)

@liwenwu-amazon: The reason I put the code in SetupHostNetwork is to ensure we remove the rule if you disable the SNAT rule after it was previously installed. The approach in the lbernail branch that you suggest would simply prevent SetupHostNetwork from being called.

I could add the check in nodeInit(), where we call through to a separate RemoteHostNetwork() call if this new option is enabled, but there would be a bit of duplicated code.
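(A hedged sketch of the pattern being described here; the function name and error handling are illustrative, not this PR's code. SetupHostNetwork computes the rule either way and then installs or removes it according to the flag, so toggling AWS_VPC_K8S_CNI_EXTERNALSNAT on a running node converges to the right state:)

```go
// Hedged sketch of keeping the off-VPC rule in sync with the external-SNAT flag,
// so disabling the option later also removes a rule that was installed earlier.
package hostrules

import (
	"errors"
	"net"
	"syscall"

	"github.com/vishvananda/netlink"
)

func ensureOffVPCRule(vpcCIDR *net.IPNet, externalSNAT bool) error {
	rule := netlink.NewRule()
	rule.Dst = vpcCIDR
	rule.Invert = true // "not to" the VPC CIDR
	rule.Table = 254   // main routing table
	rule.Priority = 1024

	if externalSNAT {
		// An external NAT gateway handles SNAT: remove the rule, ignoring "not found".
		if err := netlink.RuleDel(rule); err != nil && !errors.Is(err, syscall.ENOENT) {
			return err
		}
		return nil
	}
	// A real implementation would also tolerate "already exists" here.
	return netlink.RuleAdd(rule)
}
```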

@liwenwu-amazon (Contributor) left a comment:

Looks good to me. Thank you!

@liwenwu-amazon merged commit 48e0c82 into aws:master on Jun 28, 2018
@robbrockbank (Contributor, Author)

@liwenwu-amazon - is there any way I could chat with you about the eth0/eth1 selection? I'm not sure I really grok it at the moment, and since I was hitting issues related to that I'd like to get a handle on it.

@liwenwu-amazon (Contributor) commented Jun 28, 2018

@robbrockbank I am on the EKS Slack channel and the sig-network channel. Or contact me directly on Slack (liwwu88).

@bilby91 commented Jul 12, 2018

I think I'm running into a similar issue that could be related to the ticket I opened today, #133.

Basically, I have an OpenVPN connection established on the node and the pods can't ping other IPs (outside AWS). At the same time, I can successfully run a container on the node (no k8s involved) that is able to ping the outside IP. Could this be related to the SNAT rules discussed in this PR?

I'm really new to k8s and EKS, so this could be totally unrelated.
