Merge branch 'master' into strict_na
jayanthvn authored Feb 14, 2024
2 parents 1db2d2d + 0129baf commit a7f0bd6
Showing 5 changed files with 186 additions and 116 deletions.
9 changes: 9 additions & 0 deletions README.md
@@ -267,6 +267,15 @@ Default: empty
Specify a comma-separated list of IPv4 CIDRs to exclude from SNAT. For every item in the list, an `iptables` rule and an off-VPC
IP rule will be applied. If an item is not a valid IPv4 range, it will be skipped. This should be used when `AWS_VPC_K8S_CNI_EXTERNALSNAT=false`.

#### `POD_MTU` (v1.x.x+)

Type: Integer as a String

*Note*: The default value is set to AWS_VPC_ENI_MTU, which defaults to 9001 if unset.
Default: 9001

Used to configure the MTU size for pod virtual interfaces. The valid range is from `576` to `9001`.
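
The fallback order described above (`POD_MTU`, then `AWS_VPC_ENI_MTU`, then `9001`) can be sketched as follows. This is a minimal illustration, not the plugin's code: `lookupFunc`, `getEnv`, and `resolvePodMTU` are hypothetical names, and the range check is an assumption based only on the documented `576` to `9001` range; the commit itself does not show any validation.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

const (
	defaultMTU = "9001" // documented default when AWS_VPC_ENI_MTU is unset
	minPodMTU  = 576    // lower bound taken from the README text
	maxPodMTU  = 9001   // upper bound taken from the README text
)

// lookupFunc abstracts os.LookupEnv so the logic can be exercised without
// touching the real process environment (hypothetical type).
type lookupFunc func(key string) (string, bool)

// getEnv returns the value for key, or fallback when the key is not set.
// It is a stand-in for a "value or default" helper, not the project's own.
func getEnv(lookup lookupFunc, key, fallback string) string {
	if v, ok := lookup(key); ok {
		return v
	}
	return fallback
}

// resolvePodMTU applies the documented fallback chain and then an assumed
// range check on the result.
func resolvePodMTU(lookup lookupFunc) (int, error) {
	eniMTU := getEnv(lookup, "AWS_VPC_ENI_MTU", defaultMTU)
	podMTU := getEnv(lookup, "POD_MTU", eniMTU)

	n, err := strconv.Atoi(podMTU)
	if err != nil {
		return 0, fmt.Errorf("pod MTU %q is not an integer: %w", podMTU, err)
	}
	if n < minPodMTU || n > maxPodMTU {
		return 0, fmt.Errorf("pod MTU %d is outside [%d, %d]", n, minPodMTU, maxPodMTU)
	}
	return n, nil
}

func main() {
	// Resolve against the real environment of the current process.
	mtu, err := resolvePodMTU(os.LookupEnv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("pod MTU:", mtu)
}
```

Passing the lookup function in, rather than calling `os.LookupEnv` directly, is only a convenience for testing the chain with a map.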

#### `WARM_ENI_TARGET`

Type: Integer as a String
8 changes: 6 additions & 2 deletions cmd/aws-vpc-cni/main.go
@@ -88,6 +88,7 @@ const (
envHostCniConfDirPath = "HOST_CNI_CONFDIR_PATH"
envVethPrefix = "AWS_VPC_K8S_CNI_VETHPREFIX"
envEniMTU = "AWS_VPC_ENI_MTU"
envPodMTU = "POD_MTU"
envEnablePodEni = "ENABLE_POD_ENI"
envPodSGEnforcingMode = "POD_SECURITY_GROUP_ENFORCING_MODE"
envPluginLogFile = "AWS_VPC_K8S_PLUGIN_LOG_FILE"
@@ -278,15 +279,18 @@ func generateJSON(jsonFile string, outFile string, getPrimaryIP func(ipv4 bool)
}
}
vethPrefix := utils.GetEnv(envVethPrefix, defaultVethPrefix)
mtu := utils.GetEnv(envEniMTU, defaultMTU)
// Derive pod MTU from ENI MTU by default
eniMTU := utils.GetEnv(envEniMTU, defaultMTU)
// If pod MTU environment variable is set, overwrite ENI MTU.
podMTU := utils.GetEnv(envPodMTU, eniMTU)
podSGEnforcingMode := utils.GetEnv(envPodSGEnforcingMode, defaultPodSGEnforcingMode)
pluginLogFile := utils.GetEnv(envPluginLogFile, defaultPluginLogFile)
pluginLogLevel := utils.GetEnv(envPluginLogLevel, defaultPluginLogLevel)
randomizeSNAT := utils.GetEnv(envRandomizeSNAT, defaultRandomizeSNAT)

netconf := string(byteValue)
netconf = strings.Replace(netconf, "__VETHPREFIX__", vethPrefix, -1)
netconf = strings.Replace(netconf, "__MTU__", mtu, -1)
netconf = strings.Replace(netconf, "__MTU__", podMTU, -1)
netconf = strings.Replace(netconf, "__PODSGENFORCINGMODE__", podSGEnforcingMode, -1)
netconf = strings.Replace(netconf, "__PLUGINLOGFILE__", pluginLogFile, -1)
netconf = strings.Replace(netconf, "__PLUGINLOGLEVEL__", pluginLogLevel, -1)
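
The effect of this hunk is that the `__MTU__` placeholder now receives the pod MTU rather than the ENI MTU. A minimal sketch of that substitution step, assuming a made-up two-field template (the real conflist template is not shown in this commit) and a hypothetical `renderConflist` helper that uses `strings.NewReplacer` in place of the chained `strings.Replace` calls:

```go
package main

import (
	"fmt"
	"strings"
)

// renderConflist fills the two placeholders relevant to this change.
// The function name and the template passed to it are illustrative only.
func renderConflist(template, vethPrefix, podMTU string) string {
	// NewReplacer takes old/new pairs and replaces every occurrence,
	// matching the -1 ("replace all") count used in the diff.
	r := strings.NewReplacer(
		"__VETHPREFIX__", vethPrefix,
		"__MTU__", podMTU,
	)
	return r.Replace(template)
}

func main() {
	// Hypothetical template fragment, not the project's actual conflist.
	tmpl := `{"vethPrefix": "__VETHPREFIX__", "mtu": "__MTU__"}`
	fmt.Println(renderConflist(tmpl, "eni", "1280"))
}
```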
45 changes: 27 additions & 18 deletions cmd/cni-metrics-helper/README.md
@@ -15,24 +15,33 @@ The following diagram shows how `cni-metrics-helper` works in a cluster:
As you can see in the diagram, the `cni-metrics-helper` connects to the API Server over https (`tcp/443`), and another connection is created from the API Server to the worker node over http (`tcp/61678`). If you deploy Amazon EKS with the recommended security groups from [Restricting cluster traffic](https://docs.aws.amazon.com/eks/latest/userguide/sec-group-reqs.html#security-group-restricting-cluster-traffic), then make sure that a security group is in place that allows the inbound connection from the API Server to the worker nodes over `tcp/61678`.

Adding the CNI metrics helper will publish the following metrics to CloudWatch:
```
"addReqCount",
"assignIPAddresses",
"awsAPIErr",
"awsAPILatency",
"awsUtilErr",
"delReqCount",
"eniAllocated",
"eniMaxAvailable",
"ipamdActionInProgress",
"ipamdErr",
"maxIPAddresses",
"podENIErr",
"reconcileCount",
"totalIPAddresses",
"totalIPv4Prefixes",
"totalAssignedIPv4sPerCidr"
```

| Metric | Description | Statistic[^1] |
| ------ | ----------- | ------------- |
| addReqCount | The number of CNI ADD requests that require an IP address | Sum |
| assignIPAddresses | The number of IP addresses assigned to pods | Sum |
| awsAPIErr | The number of times AWS API returns an error | Sum |
| awsAPILatency | AWS API call latency in ms | Max |
| awsUtilErr | The number of errors not handled in awsutils library | Sum |
| delReqCount | The number of CNI DEL requests | Sum |
| eniAllocated | The number of ENIs allocated | Sum |
| eniMaxAvailable | The maximum number of ENIs that can be attached to this instance, accounting for unmanaged ENIs | Sum |
| ipamdActionInProgress | The number of ipamd actions in progress | Sum |
| ipamdErr | The number of errors encountered in ipamd | Sum |
| maxIPAddresses | The maximum number of IP addresses that can be allocated to the instance | Sum |
| podENIErr | The number of errors encountered while managing ENIs for pods | Sum |
| reconcileCount | The number of times ipamd reconciles on ENIs and IP/Prefix addresses | Sum |
| totalIPAddresses | The number of IPs allocated for pods | Sum |
| totalIPv4Prefixes | The total number of IPv4 prefixes | Sum |
| totalAssignedIPv4sPerCidr | The total number of IP addresses assigned per cidr | Sum |
| forceRemoveENI | The number of ENIs force removed while they had assigned pods | Sum |
| forceRemoveIPs | The number of IPs force removed while they had assigned pods | Sum |
| ec2ApiReqCount | The number of requests made to EC2 APIs by CNI | Sum |
| ec2ApiErrCount | The number of failed EC2 API requests | Sum |

[^1]: This column indicates how the metric has been aggregated across all nodes
Sum: For datapoints from all nodes, this is the summation of those datapoints
Max: For datapoints from all nodes, this is the maximum value of those datapoints
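
To make the two statistics in the footnote concrete, here is a small sketch of how per-node datapoints would collapse into one value under each rule. The function names and sample values are hypothetical and are not taken from `cni-metrics-helper`.

```go
package main

import "fmt"

// aggregateSum adds the datapoints reported by all nodes.
func aggregateSum(datapoints []float64) float64 {
	total := 0.0
	for _, v := range datapoints {
		total += v
	}
	return total
}

// aggregateMax returns the largest datapoint reported by any node.
// For an empty slice it returns 0, which is a choice made for this sketch.
func aggregateMax(datapoints []float64) float64 {
	if len(datapoints) == 0 {
		return 0
	}
	largest := datapoints[0]
	for _, v := range datapoints[1:] {
		if v > largest {
			largest = v
		}
	}
	return largest
}

func main() {
	// Made-up values: ENIs allocated on three nodes, and AWS API latency
	// in ms observed on the same three nodes.
	eniAllocated := []float64{3, 5, 2}
	awsAPILatency := []float64{12.5, 40, 7}

	fmt.Println("eniAllocated (Sum):", aggregateSum(eniAllocated))
	fmt.Println("awsAPILatency (Max):", aggregateMax(awsAPILatency))
}
```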

## Using IRSA
As per [AWS EKS Security Best Practice](https://docs.aws.amazon.com/eks/latest/userguide/best-practices-security.html), if you are using IRSA for pods, then the following requirements must be satisfied to successfully publish metrics to CloudWatch:
121 changes: 72 additions & 49 deletions test/integration/cni/host_networking_test.go
@@ -17,12 +17,11 @@ import (
"strconv"
"time"

v1 "k8s.io/api/core/v1"

"github.com/aws/amazon-vpc-cni-k8s/test/framework/resources/k8s/manifest"
k8sUtils "github.com/aws/amazon-vpc-cni-k8s/test/framework/resources/k8s/utils"
"github.com/aws/amazon-vpc-cni-k8s/test/framework/utils"
"github.com/aws/amazon-vpc-cni-k8s/test/integration/common"
v1 "k8s.io/api/core/v1"

. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
@@ -31,13 +30,15 @@ import (
// TODO: Instead of passing the list of pods to the test helper, have the test helper get the pod on node
const (
NEW_MTU_VAL = 1300
NEW_POD_MTU = 1280
NEW_VETH_PREFIX = "veth"
podLabelKey = "app"
podLabelVal = "host-networking-test"
)

var err error

var _ = Describe("test host networking", func() {
var err error
var podLabelKey = "app"
var podLabelVal = "host-networking-test"

// For host networking tests, increase WARM_IP_TARGET to prevent long IPAMD warmup.
BeforeEach(func() {
@@ -57,6 +58,10 @@ var _ = Describe("test host networking", func() {
"AWS_VPC_ENI_MTU": DEFAULT_MTU_VAL,
"AWS_VPC_K8S_CNI_VETHPREFIX": DEFAULT_VETH_PREFIX,
})
k8sUtils.RemoveVarFromDaemonSetAndWaitTillUpdated(f, utils.AwsNodeName,
utils.AwsNodeNamespace, utils.AwsNodeName, map[string]struct{}{
"POD_MTU": {},
})
// After updating daemonset pod, we must wait until conflist is updated so that container-runtime calls CNI ADD with the latest VETH prefix and MTU.
// Otherwise, the stale value can cause failures in future test cases.
time.Sleep(utils.PollIntervalMedium)
@@ -104,51 +109,13 @@ var _ = Describe("test host networking", func() {
common.ValidateHostNetworking(common.NetworkingTearDownSucceeds, input, primaryNode.Name, f)
})

It("Validate Host Networking setup after changing MTU and Veth Prefix", func() {
deployment := manifest.NewBusyBoxDeploymentBuilder(f.Options.TestImageRegistry).
Replicas(maxIPPerInterface*2).
PodLabel(podLabelKey, podLabelVal).
NodeName(primaryNode.Name).
Build()

By("Configuring Veth Prefix and MTU value on aws-node daemonset")
k8sUtils.AddEnvVarToDaemonSetAndWaitTillUpdated(f, utils.AwsNodeName, utils.AwsNodeNamespace, utils.AwsNodeName, map[string]string{
"AWS_VPC_ENI_MTU": strconv.Itoa(NEW_MTU_VAL),
"AWS_VPC_K8S_CNI_VETHPREFIX": NEW_VETH_PREFIX,
Context("Validate Host Networking setup after changing Veth Prefix and", func() {
It("ENI MTU", func() {
mtuValidationTest(false, NEW_MTU_VAL)
})
It("POD MTU", func() {
mtuValidationTest(true, NEW_POD_MTU)
})
// After updating daemonset pod, we must wait until conflist is updated so that container-runtime calls CNI ADD with the new VETH prefix and MTU.
time.Sleep(utils.PollIntervalMedium)

By("creating a deployment to launch pods")
deployment, err = f.K8sResourceManagers.DeploymentManager().
CreateAndWaitTillDeploymentIsReady(deployment, utils.DefaultDeploymentReadyTimeout)
Expect(err).ToNot(HaveOccurred())

By("getting the list of pods using IP from primary and secondary ENI")
interfaceTypeToPodList :=
common.GetPodsOnPrimaryAndSecondaryInterface(primaryNode, podLabelKey, podLabelVal, f)

By("generating the pod networking validation input to be passed to tester")
podNetworkingValidationInput := common.GetPodNetworkingValidationInput(interfaceTypeToPodList, vpcCIDRs)
podNetworkingValidationInput.VethPrefix = NEW_VETH_PREFIX
podNetworkingValidationInput.ValidateMTU = true
podNetworkingValidationInput.MTU = NEW_MTU_VAL
input, err := podNetworkingValidationInput.Serialize()
Expect(err).NotTo(HaveOccurred())

By("validating host networking setup is setup correctly with MTU check as well")
common.ValidateHostNetworking(common.NetworkingSetupSucceeds, input, primaryNode.Name, f)

By("deleting the deployment to test teardown")
err = f.K8sResourceManagers.DeploymentManager().
DeleteAndWaitTillDeploymentIsDeleted(deployment)
Expect(err).ToNot(HaveOccurred())

By("waiting to allow CNI to tear down networking for terminated pods")
time.Sleep(time.Second * 60)

By("validating host networking is teared down correctly")
common.ValidateHostNetworking(common.NetworkingTearDownSucceeds, input, primaryNode.Name, f)
})
})

@@ -205,3 +172,59 @@ var _ = Describe("test host networking", func() {
})
})
})

func mtuValidationTest(usePodMTU bool, mtuVal int) {
deployment := manifest.NewBusyBoxDeploymentBuilder(f.Options.TestImageRegistry).
Replicas(maxIPPerInterface*2).
PodLabel(podLabelKey, podLabelVal).
NodeName(primaryNode.Name).
Build()

if usePodMTU {
By("Configuring Veth Prefix and Pod MTU value on aws-node daemonset")
k8sUtils.AddEnvVarToDaemonSetAndWaitTillUpdated(f, utils.AwsNodeName, utils.AwsNodeNamespace, utils.AwsNodeName, map[string]string{
"AWS_VPC_ENI_MTU": strconv.Itoa(NEW_MTU_VAL),
"POD_MTU": strconv.Itoa(NEW_POD_MTU),
"AWS_VPC_K8S_CNI_VETHPREFIX": NEW_VETH_PREFIX,
})
} else {
By("Configuring Veth Prefix and ENI MTU value on aws-node daemonset")
k8sUtils.AddEnvVarToDaemonSetAndWaitTillUpdated(f, utils.AwsNodeName, utils.AwsNodeNamespace, utils.AwsNodeName, map[string]string{
"AWS_VPC_ENI_MTU": strconv.Itoa(NEW_MTU_VAL),
"AWS_VPC_K8S_CNI_VETHPREFIX": NEW_VETH_PREFIX,
})
}
// After updating daemonset pod, we must wait until conflist is updated so that container-runtime calls CNI ADD with the new VETH prefix and MTU.
time.Sleep(utils.PollIntervalMedium)

By("creating a deployment to launch pods")
deployment, err = f.K8sResourceManagers.DeploymentManager().
CreateAndWaitTillDeploymentIsReady(deployment, utils.DefaultDeploymentReadyTimeout)
Expect(err).ToNot(HaveOccurred())

By("getting the list of pods using IP from primary and secondary ENI")
interfaceTypeToPodList :=
common.GetPodsOnPrimaryAndSecondaryInterface(primaryNode, podLabelKey, podLabelVal, f)

By("generating the pod networking validation input to be passed to tester")
podNetworkingValidationInput := common.GetPodNetworkingValidationInput(interfaceTypeToPodList, vpcCIDRs)
podNetworkingValidationInput.VethPrefix = NEW_VETH_PREFIX
podNetworkingValidationInput.ValidateMTU = true
podNetworkingValidationInput.MTU = mtuVal
input, err := podNetworkingValidationInput.Serialize()
Expect(err).NotTo(HaveOccurred())

By("validating host networking setup is setup correctly with MTU check as well")
common.ValidateHostNetworking(common.NetworkingSetupSucceeds, input, primaryNode.Name, f)

By("deleting the deployment to test teardown")
err = f.K8sResourceManagers.DeploymentManager().
DeleteAndWaitTillDeploymentIsDeleted(deployment)
Expect(err).ToNot(HaveOccurred())

By("waiting to allow CNI to tear down networking for terminated pods")
time.Sleep(time.Second * 60)

By("validating host networking is teared down correctly")
common.ValidateHostNetworking(common.NetworkingTearDownSucceeds, input, primaryNode.Name, f)
}
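
The two branches at the top of `mtuValidationTest` differ only in whether `POD_MTU` is included. As a possible simplification, and not something this commit does, the environment map could be built once by a hypothetical helper such as `mtuEnvVars` below; the constants mirror `NEW_MTU_VAL`, `NEW_POD_MTU`, and `NEW_VETH_PREFIX` from the test file.

```go
package main

import (
	"fmt"
	"strconv"
)

const (
	newMTUVal     = 1300   // mirrors NEW_MTU_VAL
	newPodMTU     = 1280   // mirrors NEW_POD_MTU
	newVethPrefix = "veth" // mirrors NEW_VETH_PREFIX
)

// mtuEnvVars is a hypothetical helper that returns the environment
// variables to apply to the aws-node daemonset for either test case.
func mtuEnvVars(usePodMTU bool) map[string]string {
	env := map[string]string{
		"AWS_VPC_ENI_MTU":            strconv.Itoa(newMTUVal),
		"AWS_VPC_K8S_CNI_VETHPREFIX": newVethPrefix,
	}
	if usePodMTU {
		// Only the pod MTU case adds the override.
		env["POD_MTU"] = strconv.Itoa(newPodMTU)
	}
	return env
}

func main() {
	fmt.Println(mtuEnvVars(false))
	fmt.Println(mtuEnvVars(true))
}
```

With such a helper, a single `AddEnvVarToDaemonSetAndWaitTillUpdated` call could take the returned map, leaving only the `By(...)` message to vary between the two cases.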