Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CNF-8553: add a test for applying the firewall by the commatrix that created by the endpointslice and make sure that will not affect the behavior of the cluster #28935

Closed
wants to merge 8 commits into from

Conversation

aabughosh
Copy link

@aabughosh aabughosh commented Jul 11, 2024

add a test for applying the firewall by the commatrix that created by the endpointslice and make sure that will not affect the behavior of the cluster
jira link https://issues.redhat.com/browse/CNF-8553

@openshift-ci openshift-ci bot requested review from deads2k and trozet July 11, 2024 09:31
@openshift-ci openshift-ci bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. vendor-update Touching vendor dir or related files labels Jul 11, 2024
Copy link
Contributor

openshift-ci bot commented Jul 11, 2024

Hi @aabughosh. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-merge-robot openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jul 16, 2024
@sosiouxme
Copy link
Member

/ok-to-test

@openshift-ci openshift-ci bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 23, 2024
@sosiouxme
Copy link
Member

administrative:

  • looks like you at least need to run hack/update-generated.sh to update generated tests.
  • could you link a jira or other justification?
  • could changes be consolidated into one or a few commits?

content concerns:

  • are all the dependency updates really necessary?
  • is there not already a function like isSNOCluster?
  • this sets up a firewall, should it not also tear it down at some point?
  • should this make an exclusion for microshift which lacks the infrastructure api?

@aabughosh
Copy link
Author

  • could changes be consolidat
    are all the dependency updates really necessary? i need the commatrix repo i ran go mod vendor
    is there not already a function like isSNOCluster? i didnt check the code but i created mine
    this sets up a firewall, should it not also tear it down at some point? thats test adding all the used ports to the nft table and drop the other, first we are seeing all the used ones by the endpointslice resource
    should this make an exclusion for microshift which lacks the infrastructure api? mm im not sure

@aabughosh
Copy link
Author

/retest-required

@aabughosh
Copy link
Author

/retest

@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 7b9c174

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade Medium
[sig-network] pods should successfully create sandboxes by adding pod to network
This test has passed 95.33% of 5096 runs on release 4.17 [Overall] in the last week.

Open Bugs
s390x: [sig-network] pods should successfully create sandboxes by adding pod to network fails with error adding pod to CNI network
High rate of pod sandbox errors detected on metal

@sosiouxme sosiouxme changed the title add a test for applying the firewall by the commatrix that created by the endpointslice and make sure that will not affect the behavior of the cluster CNF-8553: add a test for applying the firewall by the commatrix that created by the endpointslice and make sure that will not affect the behavior of the cluster Jul 30, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 30, 2024
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jul 30, 2024

@aabughosh: This pull request references CNF-8553 which is a valid jira issue.

In response to this:

add a test for applying the firewall by the commatrix that created by the endpointslice and make sure that will not affect the behavior of the cluster
jira link https://issues.redhat.com/browse/CNF-8553

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@aabughosh
Copy link
Author

/retest

@aabughosh
Copy link
Author

/retest-required

@aabughosh
Copy link
Author

/retest

@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: bc5f243

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-openstack-ovn IncompleteTests
Tests for this run (17) are below the historical average (1804): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-ipsec-serial Medium
[bz-openshift-apiserver] clusteroperator/openshift-apiserver should not change condition/Available
This test has passed 92.60% of 1135 runs on release 4.18 [Overall] in the last week.

@aabughosh
Copy link
Author

/retest

@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 378a78c

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-openstack-ovn IncompleteTests
Tests for this run (18) are below the historical average (1754): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade IncompleteTests
Tests for this run (14) are below the historical average (850): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-gcp-ovn IncompleteTests
Tests for this run (14) are below the historical average (2064): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd IncompleteTests
Tests for this run (14) are below the historical average (885): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

… the endpointslice and make sure that will not affect the behavior of the cluster

Update apply_commatrix_firewall.go

watchevents: add cert rotation events

library-go emits events when cert rotation happens, these should be displayed as interesting events

OCPBUGS-36672: Shorten stabilization time for etcd profiles test

Revert "Revert "Fail on APIs removed in the next release""

This reverts commit 387f417.

Move all ClusterVersion related invocations to common helpers

Drop unnecessary check in mustgather test

Fail on removed APIs based on cluster version

Fix vSphereDriverConfiguration tests

We no longer store vSphere configuration in a ConfigMap.

IR-471: Add test for ChunkSizeMiB configuration for Registry

egressfirewall: skip ping tests in case of hypershift kubevirt on Azure infra

In case the cluster under test is an HCP/HyperShift cluster of the kubevirt provider, and the management cluster
is running on Azure, ICMP responses from outside of the cluster to the guest cluster are getting blocked.
Thus, skipping the ping tests in that case, in addition to the existing exception.

Signed-off-by: Oren Cohen <ocohen@redhat.com>

upkeep: add better logging for crio failures

update the logging for workload partitioning to include namespace/pod/container when failures occur for cpu affinity

Signed-off-by: ehila <ehila@redhat.com>

rebase

use the nft creater code and apply the nft file on nodes

Aligned the code with the commatrix repo

Delete raw-ss-udp

update the generated file

removing writing to file

add rules to be available also after reboot nodes

Update commatrix.go

Update commatrix.go

Update commatrix.go

Update commatrix.go

add list nft rules then save them in the node

resolve comments

Update zz_generated.annotations.go

save to artifact

update the commatrix code

update test name

run go mod vendor

few updates to the get infra and resolve comments

Update commatrix.go
update comatrix code
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2024
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 8, 2024
@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 094cbce

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-kube-apiserver-rollout IncompleteTests
Tests for this run (100) are below the historical average (195): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-kube-apiserver-rollout IncompleteTests
Tests for this run (102) are below the historical average (143): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-edge-zones IncompleteTests
Tests for this run (101) are below the historical average (1637): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-kubevirt Medium
[sig-sippy] infrastructure should work
This test has passed 87.19% of 1788 runs on release 4.18 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-baremetalds-kubevirt Low
[sig-sippy] infrastructure should work
This test has passed 0.00% of 1 runs on release 4.18 [Architecture:amd64 FeatureSet:default Installer:ipi Network:sdn NetworkStack:ipv4 Platform:azure SecurityMode:default Topology:ha Upgrade:none] in the last week.

@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 12af59f

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-kube-apiserver-rollout IncompleteTests
Tests for this run (100) are below the historical average (184): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-aws-ovn-edge-zones IncompleteTests
Tests for this run (100) are below the historical average (1587): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-baremetalds-kubevirt Medium
[sig-sippy] infrastructure should work
This test has passed 87.65% of 1984 runs on release 4.18 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-aws-ovn-kubevirt Medium
[sig-sippy] infrastructure should work
This test has passed 87.65% of 1984 runs on release 4.18 [Overall] in the last week.

Copy link
Contributor

openshift-ci bot commented Aug 8, 2024

@aabughosh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-ovn-kubevirt 813a4e5 link false /test e2e-aws-ovn-kubevirt
ci/prow/e2e-aws-ovn-kube-apiserver-rollout 019928f link false /test e2e-aws-ovn-kube-apiserver-rollout
ci/prow/e2e-metal-ipi-ovn-kube-apiserver-rollout 019928f link false /test e2e-metal-ipi-ovn-kube-apiserver-rollout
ci/prow/e2e-baremetalds-kubevirt 813a4e5 link false /test e2e-baremetalds-kubevirt
ci/prow/e2e-aws-ovn-single-node-upgrade 019928f link false /test e2e-aws-ovn-single-node-upgrade
ci/prow/e2e-aws-ovn-ipsec-serial 019928f link false /test e2e-aws-ovn-ipsec-serial

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-trt-bot
Copy link

Job Failure Risk Analysis for sha: 019928f

Job Name Failure Risk
pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-kube-apiserver-rollout IncompleteTests
Tests for this run (100) are below the historical average (169): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)
pull-ci-openshift-origin-master-e2e-baremetalds-kubevirt Medium
[sig-sippy] infrastructure should work
This test has passed 87.93% of 2162 runs on release 4.18 [Overall] in the last week.
pull-ci-openshift-origin-master-e2e-aws-ovn-kubevirt Medium
[sig-sippy] infrastructure should work
This test has passed 87.93% of 2162 runs on release 4.18 [Overall] in the last week.

)

var _ = g.Describe("[sig-network][Feature:commatrix][Serial]", func() {
g.It("should apply firewall by blocking all ports except the ones OCP is listening on", func() {
Copy link
Contributor

@danwinship danwinship Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This leaves the cluster in a modified state after it runs; this will break tests that run after it that assume they can connect to random hostports, etc.

But obviously, just applying and then removing the rules wouldn't test much of anything either.

What we need is not a firewalling test case but rather a separate firewalling test job. As I suggested in the enhancement PR, I think the right procedure is:

  1. install the cluster
  2. apply the firewall
  3. upgrade

We probably want one job that does a "same-version" upgrade (ie, from the current release version to a fresh build that is basically identical but has a different version number), and eventually also have y-stream upgrade tests (where it would apply a firewall that allows everything needed by either release before doing the upgrade).

These should be periodics rather than being presubmits associated with any particular repo, because any component could potentially break things by adding a new port dependency.

And we shouldn't try to run openshift-tests as part of this job, because too much of it will randomly fail.

@aabughosh aabughosh closed this Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. vendor-update Touching vendor dir or related files
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants