Port new integration tests #1928

Merged
merged 9 commits into from
May 9, 2022
26 changes: 18 additions & 8 deletions scripts/run-integration-tests.sh
@@ -5,6 +5,8 @@ set -Euo pipefail
trap 'on_error $? $LINENO' ERR

DIR=$(cd "$(dirname "$0")"; pwd)
INTEGRATION_TEST_DIR="$DIR"/../test/integration

source "$DIR"/lib/common.sh
source "$DIR"/lib/aws.sh
source "$DIR"/lib/cluster.sh
@@ -110,6 +112,7 @@ if [[ ! -f "$BASE_CONFIG_PATH" ]]; then
fi

# double-check all our preconditions and requirements have been met
check_is_installed ginkgo
check_is_installed docker
check_is_installed aws
check_aws_credentials
@@ -188,6 +191,11 @@ __cluster_created=1
UP_CLUSTER_DURATION=$((SECONDS - START))
echo "TIMELINE: Upping test cluster took $UP_CLUSTER_DURATION seconds."

# Fetch VPC_ID from created cluster
DESCRIBE_CLUSTER_OP=$(aws eks describe-cluster --name "$CLUSTER_NAME" --region "$AWS_DEFAULT_REGION")
VPC_ID=$(echo "$DESCRIBE_CLUSTER_OP" | jq -r '.cluster.resourcesVpcConfig.vpcId')
echo "Using VPC_ID: $VPC_ID"

echo "Using $BASE_CONFIG_PATH as a template"
cp "$BASE_CONFIG_PATH" "$TEST_CONFIG_PATH"
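
`jq -r` prints the literal string `null` when the requested path is absent, so the `VPC_ID` fetched in the hunk above could be non-empty yet unusable if `describe-cluster` returns an unexpected payload. A minimal guard, assuming a hypothetical helper named `require_vpc_id` that is not part of this PR:

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the PR): reject an empty or "null" VPC ID.
require_vpc_id() {
    local vpc_id="${1:-}"
    if [[ -z "$vpc_id" || "$vpc_id" == "null" ]]; then
        echo "Could not determine VPC_ID for the cluster" >&2
        return 1
    fi
    echo "Using VPC_ID: $vpc_id"
}
```

In the script this could replace the plain `echo`, for example `require_vpc_id "$VPC_ID" || exit 1`, so a bad lookup fails before any ginkgo suite starts.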

@@ -216,11 +224,12 @@ echo "**************************************************************************
echo "Running integration tests on default CNI version, $ADDONS_CNI_IMAGE"
echo ""
START=$SECONDS
pushd ./test/integration
GO111MODULE=on go test -v -timeout 0 ./... --kubeconfig=$KUBECONFIG --ginkgo.focus="\[cni-integration\]" --ginkgo.skip="\[Disruptive\]" \
--assets=./assets

focus="CANARY"
Contributor

So for now you are just enabling canary tests?

Contributor Author

Yes, since there is some flakiness when running all suites at once. We will expand this incrementally. This also aligns with what we have running on the Prow infra.

Contributor

But this will run on each PR merge and in nightly runs. I agree we can do it incrementally, but can we take enabling all tests as a fast follow-up? Canary tests don't cover all scenarios.

echo "Running ginkgo tests with focus: $focus"
(cd "$INTEGRATION_TEST_DIR/cni" && CGO_ENABLED=0 ginkgo --focus="$focus" -v --timeout 20m --failOnPending -- --cluster-kubeconfig="$KUBECONFIG" --cluster-name="$CLUSTER_NAME" --aws-region="$AWS_DEFAULT_REGION" --aws-vpc-id="$VPC_ID" --ng-name-label-key="kubernetes.io/os" --ng-name-label-val="linux")
(cd "$INTEGRATION_TEST_DIR/ipamd" && CGO_ENABLED=0 ginkgo --focus="$focus" -v --timeout 10m --failOnPending -- --cluster-kubeconfig="$KUBECONFIG" --cluster-name="$CLUSTER_NAME" --aws-region="$AWS_DEFAULT_REGION" --aws-vpc-id="$VPC_ID" --ng-name-label-key="kubernetes.io/os" --ng-name-label-val="linux")
TEST_PASS=$?
popd
DEFAULT_INTEGRATION_DURATION=$((SECONDS - START))
echo "TIMELINE: Default CNI integration tests took $DEFAULT_INTEGRATION_DURATION seconds."
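
In the hunk above, `TEST_PASS=$?` records the exit status of the most recent command only, which is the `ipamd` subshell. A failing `cni` run would not be reflected in it unless the `ERR` trap aborts the script first. A minimal sketch of one way to aggregate both results, assuming a hypothetical `run_suite` wrapper that is not part of this PR:

```shell
#!/usr/bin/env bash
TEST_PASS=0

# Hypothetical wrapper: run a command and remember any non-zero status.
# Running the command as an `if` condition also keeps the ERR trap from firing.
run_suite() {
    local name="$1"
    shift
    if "$@"; then
        echo "suite $name passed"
    else
        local rc=$?
        echo "suite $name failed with exit code $rc" >&2
        TEST_PASS=1
    fi
}
```

Each ginkgo invocation would then be wrapped in its own function and passed to `run_suite`, leaving `TEST_PASS` non-zero if either suite fails.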

@@ -250,11 +259,12 @@ echo "**************************************************************************
echo "Running integration tests on current image:"
echo ""
START=$SECONDS
pushd ./test/integration
GO111MODULE=on go test -v -timeout 0 ./... --kubeconfig=$KUBECONFIG --ginkgo.focus="\[cni-integration\]" --ginkgo.skip="\[Disruptive\]" \
--assets=./assets

focus="CANARY"
echo "Running ginkgo tests with focus: $focus"
(cd "$INTEGRATION_TEST_DIR/cni" && CGO_ENABLED=0 ginkgo --focus="$focus" -v --timeout 20m --failOnPending -- --cluster-kubeconfig="$KUBECONFIG" --cluster-name="$CLUSTER_NAME" --aws-region="$AWS_DEFAULT_REGION" --aws-vpc-id="$VPC_ID" --ng-name-label-key="kubernetes.io/os" --ng-name-label-val="linux")
(cd "$INTEGRATION_TEST_DIR/ipamd" && CGO_ENABLED=0 ginkgo --focus="$focus" -v --timeout 10m --failOnPending -- --cluster-kubeconfig="$KUBECONFIG" --cluster-name="$CLUSTER_NAME" --aws-region="$AWS_DEFAULT_REGION" --aws-vpc-id="$VPC_ID" --ng-name-label-key="kubernetes.io/os" --ng-name-label-val="linux")
TEST_PASS=$?
popd
CURRENT_IMAGE_INTEGRATION_DURATION=$((SECONDS - START))
echo "TIMELINE: Current image integration tests took $CURRENT_IMAGE_INTEGRATION_DURATION seconds."
if [[ $TEST_PASS -eq 0 ]]; then
2 changes: 1 addition & 1 deletion scripts/test/run-integration-tests.sh
@@ -2,7 +2,7 @@

set -eo pipefail

pushd ./test/integration-new
pushd ./test/integration

echo "Running integration test with the following configuration:
KUBECONFIG: $KUBECONFIG
68 changes: 68 additions & 0 deletions test/README.md
@@ -19,6 +19,74 @@ Smoke test provide fail early mechanism by failing the test if basic functionali

Ginkgo Focus: [SMOKE]

# Performance
* run from cni test account to upload test results
* set PERFORMANCE_TEST_S3_BUCKET_NAME to the name of the bucket (likely `cni-performance-tests`)
* set RUN_PERFORMANCE_TESTS=true
* to view data graph:
  * Go to Isengard and open aws-wesley+vpc-cni-ci-test@amazon.com as admin
  * Go to QuickSight and sign up with your email (it does not need an additional password)
  * Open dashboards:
    * 130-pods test - https://us-west-2.quicksight.aws.amazon.com/sn/dashboards/af137b24-a4c1-4ecd-addb-2056486e2022/views/4facfa4f-4b29-42d7-bdf5-5335d9114533
    * 5000-pods test - https://us-west-2.quicksight.aws.amazon.com/sn/dashboards/55b56360-dbc3-4fc4-917a-167249a0eb8c/views/1cb4c112-cea3-4ea2-84f6-32d0436b0711
    * 730-pods test - https://us-west-2.quicksight.aws.amazon.com/sn/dashboards/8e10011a-a29f-4218-a62d-691fd41c71f3/views/f78feb6c-f45b-4788-82c4-0fc348e793d0

* NOTE: if running on previous versions, change the date inside of the file to the date of release so as to not confuse graphing order

# KOPS
* set RUN_KOPS_TEST=true
* WARNING: will occasionally fail/flake tests, try re-running test a couple times to ensure there is a

# Bottlerocket
* set RUN_BOTTLEROCKET_TEST=true

# Calico
* set RUN_CALICO_TEST=true
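
The `RUN_*` switches listed above are plain environment variables. A minimal sketch of how a script might read such a toggle, assuming it defaults to `false` when unset; the `flag_enabled` helper is hypothetical and not taken from the repository:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only when the named variable is exactly "true".
flag_enabled() {
    local name="$1"
    # Indirect expansion reads the variable whose name is stored in $name;
    # an unset variable falls back to "false".
    [[ "${!name:-false}" == "true" ]]
}

export RUN_KOPS_TEST=true
if flag_enabled RUN_KOPS_TEST; then
    echo "KOPS tests enabled"
fi
```

Comparing against the exact string `true` keeps values such as `1` or `yes` from silently enabling a long-running suite.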

## Conformance test duration log

* May 20, 2020: Initial integration step took roughly 3h 41min
* May 27: 3h 1min
  * Skip tests labeled as “Slow” for Ginkgo framework
  * Timelines:
    * Default CNI: 73s
    * Updating CNI image: 110s
    * Current image integration: 47s
    * Conformance tests: 119.167 min (2 hrs)
    * Down cluster: 30 min
* May 29: 2h 59min 30s
  * Cache dependencies when testing default CNI
  * Timelines:
    * Docker build: 4 min
    * Up test cluster: 31 min
    * Default CNI: 50s
    * Updating CNI image: 92s
    * Current image integration: 17s
    * Conformance tests: 114 min (1.9 hrs)
    * Down cluster: 30 min
* June 5: 1h 24min 9s
  * Parallel execution of conformance tests
  * Timelines:
    * Docker build: 3 min
    * Up test cluster: 31 min
    * Default CNI: 52s
    * Updating CNI image: 92s
    * Current image integration: 18s
    * Conformance tests: 16 min
    * Down cluster: 30 min



## How to Manually delete k8s tester Resources (order of deletion)

1. CloudFormation - (all except cluster, vpc)
2. EC2 - load balancers, key pair
3. VPC - NAT gateways, Elastic IPs (after a minute), internet gateway
4. CloudFormation - cluster
5. EC2 - network interfaces, security groups
6. VPC - subnet, route tables
7. CloudFormation - cluster, vpc (after cluster deletes)
8. S3 - delete bucket

#### Work In Progress
- Run Upstream Conformance tests as part of Nightly Integration tests.