From a8323cc578e5baa65ec9cfead32864e58bdee963 Mon Sep 17 00:00:00 2001 From: Ben Napolitan Date: Tue, 7 Jul 2020 12:25:31 -0400 Subject: [PATCH] Squashed commit of the following: commit 67b636352d2b95cdf60e5e1ea90c316f44474fd5 Merge: fd80aff5 afdb125d Author: Ben Napolitan Date: Tue Jul 7 12:22:14 2020 -0400 Merge branch 'upstream-master' into scale-test-single-node commit fd80aff5627b43cc7eba83c77760cd89ac77c035 Author: Ben Napolitan Date: Tue Jul 7 12:20:56 2020 -0400 Forgotten readme commit. commit dae08fd00c721c2bd31d8b982b9c6d966441edcf Author: Ben Napolitan Date: Tue Jul 7 12:20:43 2020 -0400 Fix duration calculation for timeout, remove eksctl, revise readme. commit 80a50fd03586ced7388396cd05202b4d76133fa5 Author: Ben Napolitan Date: Tue Jul 7 01:04:09 2020 -0400 Change image to kubernetes pause. commit 610402262e5bb3471f11e07a4fe1078f79d739e0 Author: Ben Napolitan Date: Mon Jul 6 16:59:36 2020 -0400 Revert back to 98 node startup. commit c7d9a5fef55b0ace2ad10fc70a5793ef3f0fa114 Author: Ben Napolitan Date: Mon Jul 6 14:49:30 2020 -0400 Reduce initial replicas to 1 commit ddf7cd81896beffc12c5aea455c34db012c1e02e Author: Ben Napolitan Date: Mon Jul 6 13:11:50 2020 -0400 Add timeout to performance tests, add content to readme. commit 44092a6801233fac1a664c65dde7beda277e949c Author: Ben Napolitan Date: Mon Jul 6 11:56:52 2020 -0400 Revert image to google. commit 2c8291e3f892d7182dc930648fbe8c6c0fd615ae Author: Ben Napolitan Date: Thu Jul 2 15:36:18 2020 -0400 Don't exit if s3 bucket upload fails. commit 318101aa967a9cfb1f3459c2cfd512c12af864d7 Author: Ben Napolitan Date: Thu Jul 2 13:37:36 2020 -0400 Fix file path issue. commit 16254ad256c3a1150fb53250f49200a7e3f3388e Author: Ben Napolitan Date: Wed Jul 1 17:07:12 2020 -0400 Fix CircleCI yml syntax error. commit 43dd11deb255af6a0d5fc38968e2990df2fd98ae Author: Ben Napolitan Date: Wed Jul 1 17:05:34 2020 -0400 Configure weekly performance. 
commit d9b58bba297b4a80dc73032964fe6ccf233068eb Author: Ben Napolitan Date: Wed Jul 1 16:57:17 2020 -0400 Start mng with 1 node, put metadata into data file names, suppress copy errors. commit 5bab04d97fd9507a67bfd274dc3746ed3fc3d5f9 Author: Ben Napolitan Date: Wed Jul 1 02:43:28 2020 -0400 Changes from PR. commit 72a8608c5947b5583dd6584772a82fa2434c6e8d Author: Ben Napolitan Date: Fri Jun 26 11:58:25 2020 -0400 Squashed commit of the following: commit 5aac3580040f2f6622814915b5e3ec5c3a1b04d2 Merge: 0bcf24b4 30f98bd1 Author: Ben Napolitan Date: Fri Jun 26 11:57:31 2020 -0400 Merge branch 'upstream-master' into scale-test-single-node-old commit 0bcf24b4cbb7cbaffe54e6b2cfdd012b45180df5 Author: Ben Napolitan Date: Fri Jun 26 11:55:48 2020 -0400 Revert rolling update change. commit 53866a01bba92d33e77cf86aa2d4947269f40328 Author: Ben Napolitan Date: Thu Jun 25 16:22:33 2020 -0400 Increase rollingupdate limit. commit 966466acbfd502207ebd976f97dcad4ef9caeeab Author: Ben Napolitan Date: Thu Jun 25 11:01:07 2020 -0400 Fix environment unset environment variables. commit f42928333ba3eb7fa5ef0200a9c1991d7774d03c Author: Ben Napolitan Date: Wed Jun 24 13:26:51 2020 -0400 Remove sleeps, deleted load balancers in test account. commit 166a168b1864f712381da1f413d3401c4733e39c Author: Ben Napolitan Date: Wed Jun 24 09:21:17 2020 -0400 Attempt all scale tests. commit 81dd0aaf39af12eeb85af1bcf33f5edd29c587a3 Author: Ben Napolitan Date: Tue Jun 23 12:31:48 2020 -0400 Try adding all node groups back. commit 828f7aa9a9132ffa856fa830b3501011bfe4a4df Author: Ben Napolitan Date: Tue Jun 23 11:37:35 2020 -0400 Attempt only large performance test and no conformance. commit 82a80e74155510b2f81abbc22e62ef7f0ee3b32c Author: Ben Napolitan Date: Mon Jun 22 18:02:59 2020 -0400 Try deleting other node groups. commit 284fcd1845c4fb98c78d8bc999ae5ef72ef8d520 Author: Ben Napolitan Date: Mon Jun 22 16:13:47 2020 -0400 Trying again. 
commit e5ef16b22b3c2f6a55f8e512e67e4f338f6a1215 Author: Ben Napolitan Date: Mon Jun 22 16:10:20 2020 -0400 Alter size again. commit d1e0062335ff09fd918ad7ccfd67fe7271a0ea55 Author: Ben Napolitan Date: Mon Jun 22 12:53:06 2020 -0400 Attempt instance size change. commit 686e7f21a0b918857bd66534b7f8ad08a5e9f97e Author: Ben Napolitan Date: Fri Jun 19 16:47:51 2020 -0400 Fix duplicate name. commit e17358c7e942ed8a84e1b26431d3ebce329ad51f Author: Ben Napolitan Date: Fri Jun 19 14:04:58 2020 -0400 Attempt 5000 pod scale test. commit e9ea95dc3c1f7b782dc42395865bcd78321a4302 Author: Ben Napolitan Date: Thu Jun 18 17:53:28 2020 -0400 Attempt 730 pods on one node performance test. commit cad25aff1ac21fa37dfe81f6a72153bbe0c3db8a Author: Ben Napolitan Date: Thu Jun 18 13:26:51 2020 -0400 Fix file output syntax. commit 974ac0e6832eff6274b40033b4eee249cf483810 Author: Ben Napolitan Date: Thu Jun 18 11:42:30 2020 -0400 Verify scale test uploading works. commit b7efa1001600ce45788dc657e907486c0aa66022 Author: Ben Napolitan Date: Wed Jun 17 17:56:32 2020 -0400 Create data file after scale test. commit 3a9eaec1abc5be083776631decef9d4d8ff98511 Author: Ben Napolitan Date: Mon Jun 15 14:27:37 2020 -0400 Fix if syntax. commit 00d74bc7425d213619ce020be29eec83be8dd163 Author: Ben Napolitan Date: Mon Jun 15 11:36:03 2020 -0400 Run scale tests moved and hidden behind env var. commit ef6841ea1600ac0d0f33f5e675882d8f8c0ecb3b Author: Ben Napolitan Date: Sat Jun 13 21:35:21 2020 -0400 Fix grep causing failure. commit 4fbce7eb0556d034814e7cb4d973f9fa3de04c7c Author: Ben Napolitan Date: Sat Jun 13 18:37:11 2020 -0400 Reduce sleep for scale test. commit d766018ef13e7ae24eff4be038ccf36761ec542f Author: Ben Napolitan Date: Sat Jun 13 13:32:50 2020 -0400 Try to diagnose polling problem. commit 1ac7d354ac172a070c8d1c21158a75625537af6d Author: Ben Napolitan Date: Fri Jun 12 17:46:54 2020 -0400 Run scale test for 130 pods.
commit 9933a0981865e44202b0a32201efdaf97cda9eec Author: Ben Napolitan Date: Fri Jun 12 13:29:32 2020 -0400 Add new nodegroup and move directory copy to proper place. commit 470116cddf7ffa318bf27c589ce1c406b964977b Author: Ben Napolitan Date: Fri Jun 12 12:04:48 2020 -0400 Move to after kubeconfig. commit 1f1f0fbfcfe9714e39e31332f4409f166ea2fb4d Author: Ben Napolitan Date: Fri Jun 12 01:19:04 2020 -0400 Switch to use KUBECTL_PATH. commit 1b43268c92b520dac1f672e1f9c47dd92c41db92 Author: Ben Napolitan Date: Thu Jun 11 23:46:58 2020 -0400 Retry with one nodegroup. commit b0d3228959b856c43712d2d60412cb33dfc94178 Author: Ben Napolitan Date: Thu Jun 11 23:00:48 2020 -0400 Try to create new nodegroup and apply deployment to it. commit abd90157d456d00ad1125e185075fb26a4d00584 Author: Ben Napolitan Date: Thu Jun 11 21:25:40 2020 -0400 Correct cluster name and change region in CircleCI. commit 46fe54fd62ded4968c2a89b89dcf6210de888ee4 Author: Ben Napolitan Date: Thu Jun 11 19:03:03 2020 -0400 Get info for eksctl. commit bbb3557abd659c83610473eb87ad3551e3f362d4 Author: Ben Napolitan Date: Wed Jun 10 16:08:26 2020 -0400 Attempt to ssh into test run. commit 353130b93f8f12286131971a2428244501b91181 Author: Ben Napolitan Date: Wed Jun 10 14:22:18 2020 -0400 Delete eks nodegroup create. commit 0ff75894a9fcb253d24755ce20dc7400e36ed742 Author: Ben Napolitan Date: Wed Jun 10 13:14:51 2020 -0400 Try to use eksctl. commit 3ec6da45752d341533522e14b514d14d122f7864 Author: Ben Napolitan Date: Wed Jun 10 12:28:23 2020 -0400 Syntax fix. commit e79b32f85debfc7a11159b393b949afa9601a6f5 Author: Ben Napolitan Date: Tue Jun 9 19:55:25 2020 -0400 Trying to create nodegroup and deploy pods. 
--- .circleci/config.yml | 44 +++++++ scripts/lib/cluster.sh | 10 +- scripts/lib/common.sh | 8 +- scripts/lib/performance_tests.sh | 209 +++++++++++++++++++++++++++++++ scripts/run-integration-tests.sh | 33 ++++- test/integration/README.md | 45 +++++++ testdata/deploy-130-pods.yaml | 26 ++++ testdata/deploy-5000-pods.yaml | 26 ++++ testdata/deploy-730-pods.yaml | 26 ++++ 9 files changed, 419 insertions(+), 8 deletions(-) create mode 100644 scripts/lib/performance_tests.sh create mode 100644 test/integration/README.md create mode 100644 testdata/deploy-130-pods.yaml create mode 100644 testdata/deploy-5000-pods.yaml create mode 100644 testdata/deploy-730-pods.yaml diff --git a/.circleci/config.yml b/.circleci/config.yml index 313f810f6b..47d1583564 100644 --- a/.circleci/config.yml +++ b/.circleci/config.yml @@ -82,6 +82,38 @@ jobs: - store_artifacts: path: /tmp/cni-test + performance_test: + docker: + - image: circleci/golang:1.13-stretch + working_directory: /go/src/github.com/{{ORG_NAME}}/{{REPO_NAME}} + environment: + <<: *env + RUN_CONFORMANCE: "false" + RUN_PERFORMANCE_TESTS: "true" + steps: + - checkout + - setup_remote_docker + - aws-cli/setup: + profile-name: awstester + - restore_cache: + keys: + - dependency-packages-store-{{ checksum "test/integration/go.mod" }} + - dependency-packages-store- + - k8s/install-kubectl: + # requires 1.14.9 for k8s testing, since it uses log api. 
+ kubectl-version: v1.14.9 + - run: + name: Run the integration tests + command: ./scripts/run-integration-tests.sh + no_output_timeout: 15m + - save_cache: + key: dependency-packages-store-{{ checksum "test/integration/go.mod" }} + paths: + - /go/pkg + when: always + - store_artifacts: + path: /tmp/cni-test + workflows: version: 2 check: @@ -118,3 +150,15 @@ workflows: - master jobs: - integration_test + + # triggers weekly tests on master + weekly-test-run: + triggers: + - schedule: + cron: "0 0 * * 6" + filters: + branches: + only: + - master + jobs: + - performance_test diff --git a/scripts/lib/cluster.sh b/scripts/lib/cluster.sh index d3d780a96d..1e5c646c77 100644 --- a/scripts/lib/cluster.sh +++ b/scripts/lib/cluster.sh @@ -12,6 +12,14 @@ function down-test-cluster() { } function up-test-cluster() { + MNGS="" + if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then + MNGS='{"three-nodes":{"name":"three-nodes","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["m5.xlarge"],"volume-size":40}, "single-node":{"name":"single-node","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":1,"asg-max-size":1,"asg-desired-capacity":1,"instance-types":["m5.16xlarge"],"volume-size":40}, "multi-node":{"name":"multi-node","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":1,"asg-max-size":100,"asg-desired-capacity":98,"instance-types":["m5.xlarge"],"volume-size":40}}' + RUN_CONFORMANCE=false + else + 
MNGS='{"GetRef.Name-mng-for-cni":{"name":"GetRef.Name-mng-for-cni","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["c5.xlarge"],"volume-size":40}}' + fi + echo -n "Configuring cluster $CLUSTER_NAME" AWS_K8S_TESTER_EKS_NAME=$CLUSTER_NAME \ AWS_K8S_TESTER_EKS_LOG_COLOR=true \ @@ -26,7 +34,7 @@ function up-test-cluster() { AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ENABLE=true \ AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ROLE_CREATE=$ROLE_CREATE \ AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_ROLE_ARN=$ROLE_ARN \ - AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_MNGS='{"GetRef.Name-mng-for-cni":{"name":"GetRef.Name-mng-for-cni","remote-access-user-name":"ec2-user","tags":{"group":"amazon-vpc-cni-k8s"},"release-version":"","ami-type":"AL2_x86_64","asg-min-size":3,"asg-max-size":3,"asg-desired-capacity":3,"instance-types":["c5.xlarge"],"volume-size":40}}' \ + AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_MNGS=$MNGS \ AWS_K8S_TESTER_EKS_ADD_ON_MANAGED_NODE_GROUPS_FETCH_LOGS=true \ AWS_K8S_TESTER_EKS_ADD_ON_NLB_HELLO_WORLD_ENABLE=true \ AWS_K8S_TESTER_EKS_ADD_ON_ALB_2048_ENABLE=true \ diff --git a/scripts/lib/common.sh b/scripts/lib/common.sh index c01637245e..38788454da 100644 --- a/scripts/lib/common.sh +++ b/scripts/lib/common.sh @@ -25,6 +25,12 @@ function display_timelines() { echo "TIMELINE: Default CNI integration tests took $DEFAULT_INTEGRATION_DURATION seconds." echo "TIMELINE: Updating CNI image took $CNI_IMAGE_UPDATE_DURATION seconds." echo "TIMELINE: Current image integration tests took $CURRENT_IMAGE_INTEGRATION_DURATION seconds." - echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds." + if [[ "$RUN_CONFORMANCE" == true ]]; then + echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds." 
+ fi + if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then + echo "TIMELINE: Performance tests took $PERFORMANCE_DURATION seconds." + fi echo "TIMELINE: Down processes took $DOWN_DURATION seconds." } + diff --git a/scripts/lib/performance_tests.sh b/scripts/lib/performance_tests.sh new file mode 100644 index 0000000000..848c81155f --- /dev/null +++ b/scripts/lib/performance_tests.sh @@ -0,0 +1,209 @@ +function check_for_timeout() { + if [[ $((SECONDS - $1)) -gt 10000 ]]; then + RUNNING_PERFORMANCE=false + on_error + fi +} + +function run_performance_test_130_pods() { + echo "Running performance tests against cluster" + RUNNING_PERFORMANCE=true + + DEPLOY_START=$SECONDS + + SCALE_UP_DURATION_ARRAY=() + SCALE_DOWN_DURATION_ARRAY=() + while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ] + do + ITERATION_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-130-pods.yaml --replicas=130 + sleep 20 + while [[ ! $($KUBECTL_PATH get deploy | grep 130/130) ]] + do + sleep 1 + echo "Scaling UP" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + + SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) ) + MIDPOINT_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-130-pods.yaml --replicas=0 + while [[ $($KUBECTL_PATH get pods) ]] + do + sleep 1 + echo "Scaling DOWN" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START))) + done + + echo "Times to scale up:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ] + do + echo ${SCALE_UP_DURATION_ARRAY[$INDEX]} + INDEX=$((INDEX + 1)) + done + echo "" + echo "Times to scale down:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ] + do + echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds" + INDEX=$((INDEX + 1)) + done + echo "" + DEPLOY_DURATION=$((SECONDS - DEPLOY_START)) + + now="pod-130-Test#${TEST_ID}-$(date +"%m-%d-%Y-%T").csv" + echo $now + + echo $(date +"%m-%d-%Y-%T") >> $now + echo 
$((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now + echo $((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now + echo $((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now + + cat $now + aws s3 cp $now s3://cni-performance-test-data + + echo "TIMELINE: 130 Pod performance test took $DEPLOY_DURATION seconds." + RUNNING_PERFORMANCE=false +} + +function run_performance_test_730_pods() { + echo "Running performance tests against cluster" + RUNNING_PERFORMANCE=true + + DEPLOY_START=$SECONDS + + SCALE_UP_DURATION_ARRAY=() + SCALE_DOWN_DURATION_ARRAY=() + while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ] + do + ITERATION_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-730-pods.yaml --replicas=730 + sleep 100 + while [[ ! $($KUBECTL_PATH get deploy | grep 730/730) ]] + do + sleep 2 + echo "Scaling UP" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + + SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) ) + MIDPOINT_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-730-pods.yaml --replicas=0 + sleep 100 + while [[ $($KUBECTL_PATH get pods) ]] + do + sleep 2 + echo "Scaling DOWN" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START))) + done + + echo "Times to scale up:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ] + do + echo ${SCALE_UP_DURATION_ARRAY[$INDEX]} + INDEX=$((INDEX + 1)) + done + echo "" + echo "Times to scale down:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ] + do + echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds" + INDEX=$((INDEX + 1)) + done + echo "" + DEPLOY_DURATION=$((SECONDS - DEPLOY_START)) + + now="pod-730-Test#${TEST_ID}-$(date +"%m-%d-%Y-%T").csv" + echo $now + + echo $(date +"%m-%d-%Y-%T") >> $now + echo $((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now + echo 
$((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now + echo $((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now + + cat $now + aws s3 cp $now s3://cni-performance-test-data + + echo "TIMELINE: 730 Pod performance test took $DEPLOY_DURATION seconds." + RUNNING_PERFORMANCE=false +} + +function run_performance_test_5000_pods() { + echo "Running performance tests against cluster" + RUNNING_PERFORMANCE=true + + DEPLOY_START=$SECONDS + + SCALE_UP_DURATION_ARRAY=() + SCALE_DOWN_DURATION_ARRAY=() + while [ ${#SCALE_UP_DURATION_ARRAY[@]} -lt 3 ] + do + ITERATION_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-5000-pods.yaml --replicas=5000 + sleep 100 + while [[ ! $($KUBECTL_PATH get deploy | grep 5000/5000) ]] + do + sleep 2 + echo "Scaling UP" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + + SCALE_UP_DURATION_ARRAY+=( $((SECONDS - ITERATION_START)) ) + MIDPOINT_START=$SECONDS + $KUBECTL_PATH scale -f ./testdata/deploy-5000-pods.yaml --replicas=0 + sleep 100 + while [[ $($KUBECTL_PATH get pods) ]] + do + sleep 2 + echo "Scaling DOWN" + echo $($KUBECTL_PATH get deploy) + check_for_timeout $DEPLOY_START + done + SCALE_DOWN_DURATION_ARRAY+=($((SECONDS - MIDPOINT_START))) + done + + echo "Times to scale up:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_UP_DURATION_ARRAY[@]} ] + do + echo ${SCALE_UP_DURATION_ARRAY[$INDEX]} + INDEX=$((INDEX + 1)) + done + echo "" + echo "Times to scale down:" + INDEX=0 + while [ $INDEX -lt ${#SCALE_DOWN_DURATION_ARRAY[@]} ] + do + echo "${SCALE_DOWN_DURATION_ARRAY[$INDEX]} seconds" + INDEX=$((INDEX + 1)) + done + echo "" + DEPLOY_DURATION=$((SECONDS - DEPLOY_START)) + + now="pod-5000-Test#${TEST_ID}-$(date +"%m-%d-%Y-%T").csv" + echo $now + + echo $(date +"%m-%d-%Y-%T") >> $now + echo $((SCALE_UP_DURATION_ARRAY[0])), $((SCALE_DOWN_DURATION_ARRAY[0])) >> $now + echo $((SCALE_UP_DURATION_ARRAY[1])), $((SCALE_DOWN_DURATION_ARRAY[1])) >> $now + echo 
$((SCALE_UP_DURATION_ARRAY[2])), $((SCALE_DOWN_DURATION_ARRAY[2])) >> $now + + cat $now + aws s3 cp $now s3://cni-performance-test-data + + echo "TIMELINE: 5000 Pod performance test took $DEPLOY_DURATION seconds." + RUNNING_PERFORMANCE=false +} diff --git a/scripts/run-integration-tests.sh b/scripts/run-integration-tests.sh index 0abac20081..54e322b821 100755 --- a/scripts/run-integration-tests.sh +++ b/scripts/run-integration-tests.sh @@ -8,6 +8,7 @@ DIR=$(cd "$(dirname "$0")"; pwd) source "$DIR"/lib/common.sh source "$DIR"/lib/aws.sh source "$DIR"/lib/cluster.sh +source "$DIR"/lib/performance_tests.sh # Variables used in /lib/aws.sh OS=$(go env GOOS) @@ -19,6 +20,8 @@ ARCH=$(go env GOARCH) : "${DEPROVISION:=true}" : "${BUILD:=true}" : "${RUN_CONFORMANCE:=false}" +: "${RUN_PERFORMANCE_TESTS:=false}" +: "${RUNNING_PERFORMANCE:=false}" __cluster_created=0 __cluster_deprovisioned=0 @@ -26,13 +29,15 @@ __cluster_deprovisioned=0 on_error() { # Make sure we destroy any cluster that was created if we hit run into an # error when attempting to run tests against the cluster - if [[ $__cluster_created -eq 1 && $__cluster_deprovisioned -eq 0 && "$DEPROVISION" == true ]]; then - # prevent double-deprovisioning with ctrl-c during deprovisioning... - __cluster_deprovisioned=1 - echo "Cluster was provisioned already. Deprovisioning it..." - down-test-cluster + if [[ $RUNNING_PERFORMANCE == false ]]; then + if [[ $__cluster_created -eq 1 && $__cluster_deprovisioned -eq 0 && "$DEPROVISION" == true ]]; then + # prevent double-deprovisioning with ctrl-c during deprovisioning... + __cluster_deprovisioned=1 + echo "Cluster was provisioned already. Deprovisioning it..." + down-test-cluster + fi + exit 1 fi - exit 1 } # test specific config, results location @@ -213,6 +218,22 @@ if [[ $TEST_PASS -eq 0 && "$RUN_CONFORMANCE" == true ]]; then echo "TIMELINE: Conformance tests took $CONFORMANCE_DURATION seconds." 
fi +if [[ "$RUN_PERFORMANCE_TESTS" == true ]]; then + START=$SECONDS + $KUBECTL_PATH apply -f ./testdata/deploy-130-pods.yaml + run_performance_test_130_pods + $KUBECTL_PATH delete -f ./testdata/deploy-130-pods.yaml + + $KUBECTL_PATH apply -f ./testdata/deploy-730-pods.yaml + run_performance_test_730_pods + $KUBECTL_PATH delete -f ./testdata/deploy-730-pods.yaml + + $KUBECTL_PATH apply -f ./testdata/deploy-5000-pods.yaml + run_performance_test_5000_pods + $KUBECTL_PATH delete -f ./testdata/deploy-5000-pods.yaml + PERFORMANCE_DURATION=$((SECONDS - START)) +fi + if [[ "$DEPROVISION" == true ]]; then START=$SECONDS diff --git a/test/integration/README.md b/test/integration/README.md new file mode 100644 index 0000000000..0d575ea22b --- /dev/null +++ b/test/integration/README.md @@ -0,0 +1,45 @@ +## How to run tests +### All tests + * set AWS_ACCESS_KEY_ID + * set AWS_SECRET_ACCESS_KEY + * set AWS_DEFAULT_REGION (optional, defaults to us-west-2 if not set) + * approve test after build completes +### Performance + * run from cni test account to upload test results + * set RUN_PERFORMANCE_TESTS=true +### KOPS + * set RUN_KOPS_TEST=true + * tests will occasionally flake; re-run the test a couple of times to confirm the failure is real + +## Conformance test duration log + +* May 20, 2020: Initial integration step took roughly 3h 41min +* May 27: 3h 1min + * Skip tests labeled as “Slow” for Ginkgo framework + * Timelines: + * Default CNI: 73s + * Updating CNI image: 110s + * Current image integration: 47s + * Conformance tests: 119.167 min (2 hrs) + * Down cluster: 30 min +* May 29: 2h 59min 30s + * Cache dependencies when testing default CNI + * Timelines: + * Docker build: 4 min + * Up test cluster: 31 min + * Default CNI: 50s + * Updating CNI image: 92s + * Current image integration: 17s + * Conformance tests: 114 min (1.9 hrs) + * Down cluster: 30 min +* June 5: 1h 24min 9s + * Parallel execution of conformance tests + * Timelines: + * Docker build: 3 min + * Up test
cluster: 31 min + * Default CNI: 52s + * Updating CNI image: 92s + * Current image integration: 18s + * Conformance tests: 16 min + * Down cluster: 30 min + diff --git a/testdata/deploy-130-pods.yaml b/testdata/deploy-130-pods.yaml new file mode 100644 index 0000000000..1f4552d3b3 --- /dev/null +++ b/testdata/deploy-130-pods.yaml @@ -0,0 +1,26 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: deploy-130-pods +spec: + replicas: 1 + selector: + matchLabels: + app: deploy-130-pods + template: + metadata: + name: test-pod-130 + labels: + app: deploy-130-pods + tier: backend + track: stable + spec: + containers: + - name: hello + image: "kubernetes/pause:latest" + ports: + - name: http + containerPort: 80 + imagePullPolicy: IfNotPresent + nodeSelector: + eks.amazonaws.com/nodegroup: three-nodes diff --git a/testdata/deploy-5000-pods.yaml b/testdata/deploy-5000-pods.yaml new file mode 100644 index 0000000000..cb760f81fc --- /dev/null +++ b/testdata/deploy-5000-pods.yaml @@ -0,0 +1,26 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: deploy-5000-pods +spec: + replicas: 1 + selector: + matchLabels: + app: deploy-5000-pods + template: + metadata: + name: test-pod-5000 + labels: + app: deploy-5000-pods + tier: backend + track: stable + spec: + containers: + - name: hello + image: "kubernetes/pause:latest" + ports: + - name: http + containerPort: 80 + imagePullPolicy: IfNotPresent + nodeSelector: + eks.amazonaws.com/nodegroup: multi-node diff --git a/testdata/deploy-730-pods.yaml b/testdata/deploy-730-pods.yaml new file mode 100644 index 0000000000..48db130811 --- /dev/null +++ b/testdata/deploy-730-pods.yaml @@ -0,0 +1,26 @@ +apiVersion: apps/v1 +kind: Deployment +metadata: + name: deploy-730-pods +spec: + replicas: 1 + selector: + matchLabels: + app: deploy-730-pods + template: + metadata: + name: test-pod-730 + labels: + app: deploy-730-pods + tier: backend + track: stable + spec: + containers: + - name: hello + image: "kubernetes/pause:latest" + 
ports: + - name: http + containerPort: 80 + imagePullPolicy: IfNotPresent + nodeSelector: + eks.amazonaws.com/nodegroup: single-node
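
Review note, not part of the patch: the three `run_performance_test_*` functions in `scripts/lib/performance_tests.sh` repeat the same poll-until-ready loop, and `grep 130/130` matches any line containing that substring (a READY value of `1130/1130` would also pass). A minimal sketch of one way to factor this out follows. The names `wait_until` and `replicas_ready` are hypothetical and do not appear in the patch, and the sketch assumes the READY value is the second column of `kubectl get deploy` output.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration only; neither function exists in the patch.

# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0. Returns 1 once TIMEOUT_SECONDS have elapsed
# instead of calling on_error, so the caller decides how to fail.
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  local start=$SECONDS
  until "$@"; do
    if (( SECONDS - start >= timeout )); then
      return 1
    fi
    sleep "$interval"
  done
}

# replicas_ready WANT
# Reads `kubectl get deploy`-style output on stdin and succeeds only when some
# row's second column (assumed to be READY) is exactly WANT/WANT.
replicas_ready() {
  local want=$1
  awk -v want="$want/$want" '$2 == want { found = 1 } END { exit !found }'
}

# Possible use inside a test function (assumes KUBECTL_PATH is set as in the patch):
#   deploy_at_130() { "$KUBECTL_PATH" get deploy | replicas_ready 130; }
#   wait_until 10000 1 deploy_at_130 || echo "scale-up timed out"
```

Matching the whole field rather than a substring avoids false positives, and returning a status from `wait_until` keeps the timeout decision out of the polling loop.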