Squashed commit of the following:
commit 5aac358
Merge: 0bcf24b 30f98bd
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Fri Jun 26 11:57:31 2020 -0400

    Merge branch 'upstream-master' into scale-test-single-node-old

commit 0bcf24b
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Fri Jun 26 11:55:48 2020 -0400

    Revert rolling update change.

commit 53866a0
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Thu Jun 25 16:22:33 2020 -0400

    Increase rollingupdate limit.

commit 966466a
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Thu Jun 25 11:01:07 2020 -0400

    Fix unset environment variables.

commit f429283
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Wed Jun 24 13:26:51 2020 -0400

    Remove sleeps, deleted load balancers in test account.

commit 166a168
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Wed Jun 24 09:21:17 2020 -0400

    Attempt all scale tests.

commit 81dd0aa
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Tue Jun 23 12:31:48 2020 -0400

    Try adding all node groups back.

commit 828f7aa
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Tue Jun 23 11:37:35 2020 -0400

    Attempt only large performance test and no conformance.

commit 82a80e7
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Mon Jun 22 18:02:59 2020 -0400

    Try deleting other node groups.

commit 284fcd1
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Mon Jun 22 16:13:47 2020 -0400

    Trying again.

commit e5ef16b
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Mon Jun 22 16:10:20 2020 -0400

    Alter size again.

commit d1e0062
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Mon Jun 22 12:53:06 2020 -0400

    Attempt instance size change.

commit 686e7f2
Author: Ben Napolitan <bnapolitan@outlook.com>
Date:   Fri Jun 19 16:47:51 2020 -0400

    Fix duplicate name.

commit e17358c
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Fri Jun 19 14:04:58 2020 -0400

    Attempt 5000 pod scale test.

commit e9ea95d
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 18 17:53:28 2020 -0400

    Attempt 730 pods on one node performance test.

commit cad25af
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 18 13:26:51 2020 -0400

    Fix file output syntax.

commit 974ac0e
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 18 11:42:30 2020 -0400

    Verify scale test uploading works.

commit b7efa10
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Wed Jun 17 17:56:32 2020 -0400

    Create data file after scale test.

commit 3a9eaec
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Mon Jun 15 14:27:37 2020 -0400

    Fix if syntax.

commit 00d74bc
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Mon Jun 15 11:36:03 2020 -0400

    Run scale tests moved and hidden behind env var.

commit ef6841e
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Sat Jun 13 21:35:21 2020 -0400

    Fix grep causing failure.

commit 4fbce7e
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Sat Jun 13 18:37:11 2020 -0400

    Reduce sleep for scale test.

commit d766018
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Sat Jun 13 13:32:50 2020 -0400

    Try to diagnose polling problem.

commit 1ac7d35
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Fri Jun 12 17:46:54 2020 -0400

    Run scale test for 130 pods.

commit 9933a09
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Fri Jun 12 13:29:32 2020 -0400

    Add new nodegroup and move directory copy to proper place.

commit 470116c
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Fri Jun 12 12:04:48 2020 -0400

    Move to after kubeconfig.

commit 1f1f0fb
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Fri Jun 12 01:19:04 2020 -0400

    Switch to use KUBECTL_PATH.

commit 1b43268
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 11 23:46:58 2020 -0400

    Retry with one nodegroup.

commit b0d3228
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 11 23:00:48 2020 -0400

    Try to create new nodegroup and apply deployment to it.

commit abd9015
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 11 21:25:40 2020 -0400

    Correct cluster name and change region in CircleCI.

commit 46fe54f
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Thu Jun 11 19:03:03 2020 -0400

    Get info for eksctl.

commit bbb3557
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Wed Jun 10 16:08:26 2020 -0400

    Attempt to ssh into test run.

commit 353130b
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Wed Jun 10 14:22:18 2020 -0400

    Delete eks nodegroup create.

commit 0ff7589
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Wed Jun 10 13:14:51 2020 -0400

    Try to use eksctl.

commit 3ec6da4
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Wed Jun 10 12:28:23 2020 -0400

    Syntax fix.

commit e79b32f
Author: Ben Napolitan <bennapol@amazon.com>
Date:   Tue Jun 9 19:55:25 2020 -0400

    Trying to create nodegroup and deploy pods.
bnapolitan committed Jun 26, 2020
1 parent 30f98bd commit 72a8608
Showing 7 changed files with 336 additions and 2 deletions.
26 changes: 26 additions & 0 deletions deploy-130-pods.yaml
@@ -0,0 +1,26 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-130-pods
spec:
  replicas: 130
  selector:
    matchLabels:
      app: deploy-130-pods
  template:
    metadata:
      name: test-pod-130
      labels:
        app: deploy-130-pods
        tier: backend
        track: stable
    spec:
      containers:
        - name: hello
          image: "gcr.io/google-samples/hello-go-gke:1.0"
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
      nodeSelector:
        eks.amazonaws.com/nodegroup: three-nodes
26 changes: 26 additions & 0 deletions deploy-5000-pods.yaml
@@ -0,0 +1,26 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-5000-pods
spec:
  replicas: 5000
  selector:
    matchLabels:
      app: deploy-5000-pods
  template:
    metadata:
      name: test-pod-5000
      labels:
        app: deploy-5000-pods
        tier: backend
        track: stable
    spec:
      containers:
        - name: hello
          image: "gcr.io/google-samples/hello-go-gke:1.0"
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
      nodeSelector:
        eks.amazonaws.com/nodegroup: multi-node
26 changes: 26 additions & 0 deletions deploy-730-pods.yaml
@@ -0,0 +1,26 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-730-pods
spec:
  replicas: 730
  selector:
    matchLabels:
      app: deploy-730-pods
  template:
    metadata:
      name: test-pod-730
      labels:
        app: deploy-730-pods
        tier: backend
        track: stable
    spec:
      containers:
        - name: hello
          image: "gcr.io/google-samples/hello-go-gke:1.0"
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
      nodeSelector:
        eks.amazonaws.com/nodegroup: single-node
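The three manifests above differ only in the replica count, the names derived from it, and the target node group. A minimal sketch of producing them from one template follows; `render_deployment` is a hypothetical helper and is not part of this commit.

```shell
# Hypothetical helper, not part of this commit: prints a Deployment manifest
# for a given replica count and managed node group name.
render_deployment() {
    local replicas=$1 nodegroup=$2
    cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-${replicas}-pods
spec:
  replicas: ${replicas}
  selector:
    matchLabels:
      app: deploy-${replicas}-pods
  template:
    metadata:
      name: test-pod-${replicas}
      labels:
        app: deploy-${replicas}-pods
        tier: backend
        track: stable
    spec:
      containers:
        - name: hello
          image: "gcr.io/google-samples/hello-go-gke:1.0"
          ports:
            - name: http
              containerPort: 80
          imagePullPolicy: IfNotPresent
      nodeSelector:
        eks.amazonaws.com/nodegroup: ${nodegroup}
EOF
}
```

With such a helper, a call like `render_deployment 730 single-node | $KUBECTL_PATH apply -f -` could stand in for a checked-in file, at the cost of the manifests no longer being visible in the repository.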
10 changes: 9 additions & 1 deletion scripts/lib/cluster.sh
@@ -12,6 +12,14 @@ function down-test-cluster() {
}

function up-test-cluster() {
    MNGS=""
    if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then
        MNGS='{"three-nodes":{"name":"three-nodes","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["m5.xlarge"],"volume-size":40}, "single-node":{"name":"single-node","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":1,"asg-max-size":1,"asg-desired-capacity":1,"instance-types":["m5.16xlarge"],"volume-size":40}, "multi-node":{"name":"multi-node","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":98,"asg-max-size":100,"asg-desired-capacity":98,"instance-types":["m5.xlarge"],"volume-size":40}}'
        RUN_CONFORMANCE=false
    else
        MNGS='{"GetRef.Name-mng-for-cni":{"name":"GetRef.Name-mng-for-cni","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["c5.xlarge"],"volume-size":40}}'
    fi

    echo -n "Configuring cluster $CLUSTER_NAME"
    AWS_K8S_TESTER_EKS_NAME=$CLUSTER_NAME \
        AWS_K8S_TESTER_EKS_LOG_COLOR=true \
@@ -26,7 +34,7 @@ function up-test-cluster() {
        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ENABLE=true \
        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ROLE_CREATE=$ROLE_CREATE \
        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ROLE_ARN=$ROLE_ARN \
-        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_MNGS='{"GetRef.Name-mng-for-cni":{"name":"GetRef.Name-mng-for-cni","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["c5.xlarge"],"volume-size":40}}' \
        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_MNGS=$MNGS \
        AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_FETCH_LOGS=true \
        AWS_K8S_TESTER_EKS_ADD_ON_NLB_HELLO_WORLD_ENABLE=true \
        AWS_K8S_TESTER_EKS_ADD_ON_ALB_2048_ENABLE=true \
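The node group definitions in `up-test-cluster` are single-line JSON strings that repeat the same fields for every group. As an illustration only, the sketch below assembles the performance-test value of `MNGS` from per-group parameters; `mng_entry` is a hypothetical helper, not something this commit or aws-k8s-tester provides.

```shell
# Hypothetical helper, not part of this commit: prints one managed node group
# entry in the same JSON shape as the MNGS strings in up-test-cluster.
mng_entry() {
    local name=$1 min=$2 max=$3 desired=$4 instance_type=$5
    printf '"%s":{"name":"%s","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":%s,"asg-max-size":%s,"asg-desired-capacity":%s,"instance-types":["%s"],"volume-size":40}' \
        "$name" "$name" "$min" "$max" "$desired" "$instance_type"
}

# Join the three performance-test node groups into one JSON object.
MNGS="{$(mng_entry three-nodes 3 3 3 m5.xlarge), $(mng_entry single-node 1 1 1 m5.16xlarge), $(mng_entry multi-node 98 100 98 m5.xlarge)}"
```

Keeping the sizes and instance types as arguments makes it easier to see at a glance that `multi-node` asks for 98 instances while `single-node` asks for one `m5.16xlarge`.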
195 changes: 194 additions & 1 deletion scripts/lib/common.sh
@@ -25,6 +25,199 @@ function display_timelines() {
    echo "TIMELINE: Default CNI integration tests took $DEFAULT_INTEGRATION_DURATION seconds."
    echo "TIMELINE: Updating CNI image took $CNI_IMAGE_UPDATE_DURATION seconds."
    echo "TIMELINE: Current image integration tests took $CURRENT_IMAGE_INTEGRATION_DURATION seconds."
-    echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds."
    if [[ "$RUN_CONFORMANCE" == true ]]; then
        echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds."
    fi
    if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then
        echo "TIMELINE: Performance tests took $PERFORMANCE_DURATION seconds."
    fi
    echo "TIMELINE: Down processes took $DOWN_DURATION seconds."
}

function run_performance_test_130_pods() {
    echo "Running scale tests against cluster"
    DEPLOY_START=$SECONDS

    SCALE_UP_DURATION_ARRAY=()
    SCALE_DOWN_DURATION_ARRAY=()
    while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ]
    do
        ITERATION_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-130-pods.yaml --replicas=130
        sleep 20
        while [[ ! $($KUBECTL_PATH get deploy | grep 130/130) ]]
        do
            sleep 1
            echo "Scaling UP"
            echo $($KUBECTL_PATH get deploy)
        done

        SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) )
        MIDPOINT_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-130-pods.yaml --replicas=0
        while [[ $($KUBECTL_PATH get pods) ]]
        do
            sleep 1
            echo "Scaling DOWN"
            echo $($KUBECTL_PATH get deploy)
        done
        SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START)))
    done

    echo "Times to scale up:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ]
    do
        echo ${SCALE_UP_DURATION_ARRAY[$INDEX]}
        INDEX=$((INDEX + 1))
    done
    echo ""
    echo "Times to scale down:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ]
    do
        echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds"
        INDEX=$((INDEX + 1))
    done
    echo ""
    DEPLOY_DURATION=$((SECONDS - DEPLOY_START))

    now="pod-130-scale-test-data-$(date +"%m-%d-%Y-%T").csv"
    echo $now

    echo $(date +"%m-%d-%Y-%T") >> $now
    echo $((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now

    cat $now
    aws s3 cp $now s3://cni-scale-test-data

    echo "TIMELINE: 130 Pod performance test took $DEPLOY_DURATION seconds."
}
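The scale-up loop above polls `kubectl get deploy` until `130/130` appears and has no upper bound, so a deployment that never becomes ready would hang the run. A minimal sketch of a bounded poll follows. `wait_for_ready_replicas` and its argument order are hypothetical; it assumes `KUBECTL_PATH` points at kubectl, as in the function above, and reads `.status.readyReplicas` through kubectl's `-o jsonpath` output.

```shell
# Hypothetical helper, not part of this commit: poll until the deployment
# reports the wanted number of ready replicas, or give up after a timeout.
# Arguments: deployment name, wanted count, timeout (s), poll interval (s).
wait_for_ready_replicas() {
    local deployment=$1 want=$2 timeout=${3:-600} interval=${4:-2}
    local waited=0 ready
    while :; do
        # .status.readyReplicas is omitted when it is zero, so default to 0.
        ready=$("$KUBECTL_PATH" get deploy "$deployment" -o jsonpath='{.status.readyReplicas}')
        [ "${ready:-0}" -eq "$want" ] && return 0
        [ "$waited" -ge "$timeout" ] && return 1
        sleep "$interval"
        waited=$((waited + interval))
    done
}
```

A call such as `wait_for_ready_replicas deploy-130-pods 130 600 1` returns non-zero on timeout, which lets the caller fail the test instead of waiting indefinitely. `kubectl rollout status` with `--timeout` may be a simpler alternative for the scale-up half.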

function run_performance_test_730_pods() {
    echo "Running scale tests against cluster"
    DEPLOY_START=$SECONDS

    SCALE_UP_DURATION_ARRAY=()
    SCALE_DOWN_DURATION_ARRAY=()
    while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ]
    do
        ITERATION_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-730-pods.yaml --replicas=730
        sleep 100
        while [[ ! $($KUBECTL_PATH get deploy | grep 730/730) ]]
        do
            sleep 2
            echo "Scaling UP"
            echo $($KUBECTL_PATH get deploy)
        done

        SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) )
        MIDPOINT_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-730-pods.yaml --replicas=0
        sleep 100
        while [[ $($KUBECTL_PATH get pods) ]]
        do
            sleep 2
            echo "Scaling DOWN"
            echo $($KUBECTL_PATH get deploy)
        done
        SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START)))
    done

    echo "Times to scale up:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ]
    do
        echo ${SCALE_UP_DURATION_ARRAY[$INDEX]}
        INDEX=$((INDEX + 1))
    done
    echo ""
    echo "Times to scale down:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ]
    do
        echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds"
        INDEX=$((INDEX + 1))
    done
    echo ""
    DEPLOY_DURATION=$((SECONDS - DEPLOY_START))

    now="pod-730-scale-test-data-$(date +"%m-%d-%Y-%T").csv"
    echo $now

    echo $(date +"%m-%d-%Y-%T") >> $now
    echo $((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now

    cat $now
    aws s3 cp $now s3://cni-scale-test-data

    echo "TIMELINE: 730 Pod performance test took $DEPLOY_DURATION seconds."
}

function run_performance_test_5000_pods() {
    echo "Running scale tests against cluster"
    DEPLOY_START=$SECONDS

    SCALE_UP_DURATION_ARRAY=()
    SCALE_DOWN_DURATION_ARRAY=()
    while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ]
    do
        ITERATION_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-5000-pods.yaml --replicas=5000
        sleep 100
        while [[ ! $($KUBECTL_PATH get deploy | grep 5000/5000) ]]
        do
            sleep 2
            echo "Scaling UP"
            echo $($KUBECTL_PATH get deploy)
        done

        SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) )
        MIDPOINT_START=$SECONDS
        $KUBECTL_PATH scale -f deploy-5000-pods.yaml --replicas=0
        sleep 100
        while [[ $($KUBECTL_PATH get pods) ]]
        do
            sleep 2
            echo "Scaling DOWN"
            echo $($KUBECTL_PATH get deploy)
        done
        SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START)))
    done

    echo "Times to scale up:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ]
    do
        echo ${SCALE_UP_DURATION_ARRAY[$INDEX]}
        INDEX=$((INDEX + 1))
    done
    echo ""
    echo "Times to scale down:"
    INDEX=0
    while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ]
    do
        echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds"
        INDEX=$((INDEX + 1))
    done
    echo ""
    DEPLOY_DURATION=$((SECONDS - DEPLOY_START))

    now="pod-5000-scale-test-data-$(date +"%m-%d-%Y-%T").csv"
    echo $now

    echo $(date +"%m-%d-%Y-%T") >> $now
    echo $((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now
    echo $((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now

    cat $now
    aws s3 cp $now s3://cni-scale-test-data

    echo "TIMELINE: 5000 Pod performance test took $DEPLOY_DURATION seconds."
}
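Each of the three functions writes its results file the same way: a name built from `date +"%m-%d-%Y-%T"`, which contains `:` characters, then a timestamp line followed by rows written as `up, down` with a space after the comma. A small sketch of one possible cleanup follows; both helper names are hypothetical, and the header-free, space-free row format is an assumption rather than what the commit produces.

```shell
# Hypothetical helpers, not part of this commit.

# Build a file name such as pod-130-scale-test-data-2020-06-26T155731Z.csv.
# The UTC timestamp sorts chronologically and avoids ":" in the name.
scale_data_file() {
    echo "pod-$1-scale-test-data-$(date -u +%Y-%m-%dT%H%M%SZ).csv"
}

# Append one "scale_up,scale_down" row (whole seconds) to the given file.
append_timing_row() {
    printf '%s,%s\n' "$2" "$3" >> "$1"
}
```

In the functions above this could be used as `now=$(scale_data_file 130)` followed by one `append_timing_row "$now" ...` call per iteration, before the existing `aws s3 cp` upload.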
21 changes: 21 additions & 0 deletions scripts/run-integration-tests.sh
@@ -19,6 +19,7 @@ ARCH=$(go env GOARCH)
: "${DEPROVISION:=true}"
: "${BUILD:=true}"
: "${RUN_CONFORMANCE:=false}"
: "${RUN_PERFORMANCE_TESTS:=false}"

__cluster_created=0
__cluster_deprovisioned=0
@@ -158,6 +159,10 @@ sed -i'.bak' "s,:$MANIFEST_IMAGE_VERSION,:$TEST_IMAGE_VERSION," "$TEST_CONFIG_PA
export KUBECONFIG=$KUBECONFIG_PATH
ADDONS_CNI_IMAGE=$($KUBECTL_PATH describe daemonset aws-node -n kube-system | grep Image | cut -d ":" -f 2-3 | tr -d '[:space:]')

echo "TESTING downloading eksctl"
curl --silent --location "https://github.com/weaveworks/eksctl/releases/latest/download/eksctl_$(uname -s)_amd64.tar.gz" | tar xz -C /tmp
sudo mv -v /tmp/eksctl /usr/local/bin

echo "*******************************************************************************"
echo "Running integration tests on default CNI version, $ADDONS_CNI_IMAGE"
echo ""
@@ -211,6 +216,22 @@ if [[ $TEST_PASS -eq 0 && "$RUN_CONFORMANCE" == true ]]; then
    echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds."
fi

if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then
    START=$SECONDS
    $KUBECTL_PATH apply -f deploy-130-pods.yaml
    run_performance_test_130_pods
    $KUBECTL_PATH delete -f deploy-130-pods.yaml

    $KUBECTL_PATH apply -f deploy-730-pods.yaml
    run_performance_test_730_pods
    $KUBECTL_PATH delete -f deploy-730-pods.yaml

    $KUBECTL_PATH apply -f deploy-5000-pods.yaml
    run_performance_test_5000_pods
    $KUBECTL_PATH delete -f deploy-5000-pods.yaml
    PERFORMANCE_DURATION=$((SECONDS - START))
fi
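
The performance block above repeats the same apply, run, delete sequence for each pod count. As a sketch only, the same sequence can be expressed as a loop; `run_all_performance_tests` is a hypothetical wrapper that assumes `KUBECTL_PATH` is set and that the `run_performance_test_<N>_pods` functions from `scripts/lib/common.sh` are loaded.

```shell
# Hypothetical wrapper, not part of this commit: runs the three scale tests
# in the same order as the if-block above.
run_all_performance_tests() {
    local pods
    for pods in 130 730 5000; do
        "$KUBECTL_PATH" apply -f "deploy-${pods}-pods.yaml"
        "run_performance_test_${pods}_pods"
        "$KUBECTL_PATH" delete -f "deploy-${pods}-pods.yaml"
    done
}
```

Adding another pod count would then mean adding a manifest, a test function, and one more value in the `for` list.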

if [[ "$DEPROVISION" == true ]]; then
    START=$SECONDS

34 changes: 34 additions & 0 deletions test/integration/README.md
@@ -0,0 +1,34 @@
## Conformance test duration log

Design document link - https://quip-amazon.com/BoJLAC3IpIfW

* May 20, 2020: Initial integration step took roughly 3h 41min
* May 27: 3h 1min
    * Skip tests labeled as “Slow” for Ginkgo framework
    * Timelines:
        * Default CNI: 73s
        * Updating CNI image: 110s
        * Current image integration: 47s
        * Conformance tests: 119.167 min (2 hrs)
        * Down cluster: 30 min
* May 29: 2h 59min 30s
    * Cache dependencies when testing default CNI
    * Timelines:
        * Docker build: 3.583 min
        * Up test cluster: 31.4 min
        * Default CNI: 50s
        * Updating CNI image: 92s
        * Current image integration: 17s
        * Conformance tests: 113.8 min (1.9 hrs)
        * Down cluster: 30.417 min
* June 5: 1h 24min 9s
    * Parallel execution of conformance tests
    * Timelines:
        * Docker build: 3.617 min
        * Up test cluster: 31.3 min
        * Default CNI: 52s
        * Updating CNI image: 92s
        * Current image integration: 18s
        * Conformance tests: 16.317 min
        * Down cluster: 29.95 min
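
The entries in this log mix plain seconds, decimal minutes, and the `1h 24min 9s` form, while the test scripts report `TIMELINE` values in whole seconds. A small sketch of a formatter that turns a seconds value into the `h/min/s` form used for the totals is shown below; `format_duration` is a hypothetical name and is not part of the scripts.

```shell
# Hypothetical helper: format a duration given in whole seconds
# as "Xh Ymin Zs", omitting leading units that are zero.
format_duration() {
    local total=$1
    local h=$((total / 3600))
    local m=$(( (total % 3600) / 60 ))
    local s=$((total % 60))
    if [ "$h" -gt 0 ]; then
        echo "${h}h ${m}min ${s}s"
    elif [ "$m" -gt 0 ]; then
        echo "${m}min ${s}s"
    else
        echo "${s}s"
    fi
}
```

For example, `format_duration 5049` prints `1h 24min 9s`, which matches the June 5 total above.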
