Re-enable all node flaky tests, move node performance tests to new job #23147

Merged 1 commit on Aug 11, 2021

manugupt1
Contributor

From: #19352

The test filter in perf-image-config.yaml means that only tests matching
"Node Performance Testing" run in the flaky job.

Move the flaky test job back to the general image-config.yaml.
Create a new job config specific to "Node Performance Testing".
Increase the CI interval of the new "Node Performance Testing" job to 12h
instead of 2h, so that it runs less often.

Note: I was not able to check whether the dashboard was created.

@k8s-ci-robot added the needs-ok-to-test label on Aug 5, 2021.
@k8s-ci-robot
Contributor

Hi @manugupt1. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

@k8s-ci-robot added the area/config, area/jobs, sig/node, and sig/testing labels on Aug 5, 2021.
@k8s-ci-robot added the cncf-cla: yes and size/M labels on Aug 5, 2021.
@manugupt1
Contributor Author

manugupt1 commented Aug 5, 2021

/assign @SergeyKanzhelev
/assign @ehashman

    preset-k8s-ssh: "true"
  spec:
    containers:
    - image: gcr.io/k8s-testimages/kubekins-e2e:v20210721-2b77449-master
Contributor Author

Updated the image compared to the previous PR.

@SergeyKanzhelev
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test label and removed the needs-ok-to-test label on Aug 5, 2021.
@@ -126,7 +126,7 @@ periodics:
       - --deployment=node
       - --gcp-project-type=node-e2e-project
       - --gcp-zone=us-west1-b
-      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/perf-image-config.yaml
+      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/image-config.yaml
Member

btw, this is why perf tests were executing:

tests:
- 'Node Performance Testing'

Contributor Author

Oh, got it! Thanks a lot.

Member

this is very illogical.

Contributor Author

So the internal config pointed to Node Performance Testing. I should have dug one level deeper.
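
To make the cause concrete: an image config can carry a `tests` list that acts as a focus filter, so any job pointing at that file runs only the listed tests. The fragment below is a minimal sketch of what such an entry might look like. Only the `tests` value is quoted in this thread; the image key, the `image` and `project` values, and the assumption that `tests` is nested under an image entry are illustrative and are not taken from perf-image-config.yaml.

```yaml
# Sketch only: everything except the `tests` entry is an assumption.
images:
  example-image:               # hypothetical image key
    image: example-image-name  # hypothetical image name
    project: example-project   # hypothetical GCP project
    tests:
      # Quoted in the review comment above: restricts the run to matching tests.
      - 'Node Performance Testing'
```

A job that should run the full flaky suite therefore has to reference a config without such a `tests` entry, which is what the switch back to image-config.yaml is meant to achieve.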

@k8s-ci-robot added the lgtm label on Aug 5, 2021.
@manugupt1
Contributor Author

/assign @mrunalp

Member

@ehashman left a comment

I am okay with merging this and finding out what happens with the perf tests afterwards, but let's fix the names.

@@ -531,3 +531,33 @@ periodics:
testgrid-tab-name: kubelet-gce-e2e-swap-fedora
testgrid-alert-email: ehashman@redhat.com, ikema@google.com
description: Executes E2E suite with swap enabled on Fedora

- name: ci-kubernetes-node-kubelet-node-performance-test
Member

Suggested change
-- name: ci-kubernetes-node-kubelet-node-performance-test
+- name: ci-kubernetes-node-kubelet-performance-test

value: /go
annotations:
testgrid-dashboards: sig-node-kubelet
testgrid-tab-name: node-performance-testing
Member

Suggested change
-    testgrid-tab-name: node-performance-testing
+    testgrid-tab-name: node-performance-test

- image: gcr.io/k8s-testimages/kubekins-e2e:v20210721-2b77449-master
args:
- --repo=k8s.io/kubernetes=master
- --timeout=90
Member

Do these timeouts make sense? These tests are failing consistently but I don't know if that has anything to do with the runtime.

Contributor Author

Shouldn't we keep timeouts for when the tests start passing, to make sure they do not run forever? I think it is better to let them fail after a fixed time regardless, though I do not know what the correct value is.

- --deployment=node
- --gcp-project-type=node-e2e-project
- --gcp-zone=us-west1-b
- --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/perf-image-config.yaml
Member

If we use the perf-image-config, do we need to specify the focus below?

Contributor Author

No, we do not. Thanks for catching that.

@k8s-ci-robot added the approved label and removed the lgtm label on Aug 11, 2021.
- --node-test-args= --kubelet-flags="--cgroups-per-qos=true --cgroup-root=/" --server-start-timeout=420s
- --node-tests=true
- --provider=gce
- --test_args=--nodes=1"
Member

Looks like there's a stray "

Suggested change
-      - --test_args=--nodes=1"
+      - --test_args=--nodes=1
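
Putting the review feedback together, the new periodic might look roughly like the sketch below. It is assembled only from the fragments quoted in this thread, with the suggested renames and the stray-quote fix applied. Whether those suggestions were taken verbatim is an assumption, `interval: 12h` comes from the PR description, and fields not quoted here (for example the container command and environment) are left out, so this is not the merged file.

```yaml
# Sketch assembled from the fragments quoted in this review; not the merged file.
periodics:
- name: ci-kubernetes-node-kubelet-performance-test  # name as suggested in review
  interval: 12h                                      # from the PR description
  spec:
    containers:
    - image: gcr.io/k8s-testimages/kubekins-e2e:v20210721-2b77449-master
      args:
      - --repo=k8s.io/kubernetes=master
      - --timeout=90
      - --deployment=node
      - --gcp-project-type=node-e2e-project
      - --gcp-zone=us-west1-b
      # The perf image config already restricts the run to "Node Performance Testing",
      # so no separate focus is passed in --test_args.
      - --node-args=--image-config-file=/workspace/test-infra/jobs/e2e_node/perf-image-config.yaml
      - --node-test-args= --kubelet-flags="--cgroups-per-qos=true --cgroup-root=/" --server-start-timeout=420s
      - --node-tests=true
      - --provider=gce
      - --test_args=--nodes=1
  annotations:
    testgrid-dashboards: sig-node-kubelet
    testgrid-tab-name: node-performance-test         # tab name as suggested in review
```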

Member

@ehashman left a comment

/lgtm

@k8s-ci-robot added the lgtm label on Aug 11, 2021.
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ehashman, manugupt1, SergeyKanzhelev

@k8s-ci-robot merged commit 2e2328e into kubernetes:master on Aug 11, 2021.
@k8s-ci-robot added this to the v1.23 milestone on Aug 11, 2021.
@k8s-ci-robot
Contributor

@manugupt1: Updated the job-config configmap in namespace default at cluster test-infra-trusted using the following files:

  • key node-kubelet.yaml using file config/jobs/kubernetes/sig-node/node-kubelet.yaml
