Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle ExternalIPPool range changes in Egress controller #6685

Merged

Conversation

antoninbas
Copy link
Contributor

@antoninbas antoninbas commented Sep 23, 2024

The validation webhook for ExternalIPPools prevents modifications to IP address ranges. However, it is theoretically possible for an ExternalIPPool to be deleted, then recreated immediately with different IP address ranges, and for the Egress controller to only process the Egress resources referencing the pool once, after the CREATE event has been handled by the ExternalIPPool controller. This is because the ExternalIPPool controller is in charge of notifying the Egress controller through a callback (event handler) mechanism, and that multiple events for the same ExternalIPPool name can be merged in the workqueue.

With this change, we try to ensure the same "observable" behavior for the Egress controller, regardless of whether the DELETE and CREATE events have been merged or not. The stale Egress IP should be removed, and a new Egress IP should be allocated from the new IP ranges (regardless of whether or not the initial Egress IP was specified by the user).

Prior to this change, the change of IP address ranges would be silently ignored by the Egress controller.

@antoninbas antoninbas added kind/bug Categorizes issue or PR as related to a bug. area/transit/egress Issues or PRs related to Egress (SNAT for traffic egressing the cluster). labels Sep 23, 2024
@antoninbas antoninbas requested review from tnqn and xliuxu September 23, 2024 20:45
@antoninbas antoninbas force-pushed the handle-external-ip-pool-range-changes branch from 7aca6af to b93c832 Compare September 23, 2024 20:54
xliuxu
xliuxu previously approved these changes Sep 24, 2024
Copy link
Contributor

@xliuxu xliuxu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one nit

require.True(t, cache.WaitForCacheSync(stopCh, controller.externalIPAllocator.HasSynced))
controller.restoreIPAllocations([]*v1beta1.Egress{egress})

getEgressIP, _, err := controller.syncEgressIP(egress)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could it receive the returned egress and avoid updating the stored egress at L613 which may cause race condition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the change. By stored egress, do you mean the object stored by the fake clientset? They make a copy before adding it to the store:
https://github.com/kubernetes/client-go/blob/c5196ebcc18e1bf29561b7a689fdb121913d221b/testing/fixture.go#L534-L537

but based on the comment, the caller shouldn't modify the object anyway

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the info.

The validation webhook for ExternalIPPools prevents modifications to IP
address ranges. However, it is theoretically possible for an
ExternalIPPool to be deleted, then recreated immediately with different
IP address ranges, and for the Egress controller to only process the
Egress resources referencing the pool once, *after* the CREATE event has
been handled by the ExternalIPPool controller. This is because the
ExternalIPPool controller is in charge of notifying the Egress
controller through a callback (event handler) mechanism, and that
multiple events for the same ExternalIPPool name can be merged in the
workqueue.

With this change, we try to ensure the same "observable" behavior for
the Egress controller, regardless of whether the DELETE and CREATE
events have been merged or not. The stale Egress IP should be removed,
and a new Egress IP should be allocated from the new IP ranges
(regardless of whether or not the initial Egress IP was specified by the
user).

Prior to this change, the change of IP address ranges would be silently
ignored by the Egress controller.

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas antoninbas force-pushed the handle-external-ip-pool-range-changes branch from b93c832 to fc16420 Compare September 24, 2024 20:19
@antoninbas antoninbas requested a review from tnqn September 24, 2024 20:21
@antoninbas antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Sep 24, 2024
@antoninbas
Copy link
Contributor Author

I think we don't need to backport this given how unlikely it is to happen in practice

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas
Copy link
Contributor Author

@tnqn I found a bug in my implementation, so I addressed it in the last commit. I also found a better / simpler way of writing the test.

@antoninbas
Copy link
Contributor Author

/test-all

xliuxu
xliuxu previously approved these changes Sep 25, 2024
tnqn
tnqn previously approved these changes Sep 25, 2024
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, a typo

pkg/controller/egress/controller_test.go Outdated Show resolved Hide resolved
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas antoninbas dismissed stale reviews from tnqn and xliuxu via d325b51 September 25, 2024 16:58
@antoninbas antoninbas requested a review from tnqn September 25, 2024 16:59
@antoninbas
Copy link
Contributor Author

/test-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit 903fd99 into antrea-io:main Sep 26, 2024
55 of 58 checks passed
@antoninbas antoninbas deleted the handle-external-ip-pool-range-changes branch September 26, 2024 16:59
hangyan pushed a commit to hangyan/antrea that referenced this pull request Oct 29, 2024
)

The validation webhook for ExternalIPPools prevents modifications to IP
address ranges. However, it is theoretically possible for an
ExternalIPPool to be deleted, then recreated immediately with different
IP address ranges, and for the Egress controller to only process the
Egress resources referencing the pool once, *after* the CREATE event has
been handled by the ExternalIPPool controller. This is because the
ExternalIPPool controller is in charge of notifying the Egress
controller through a callback (event handler) mechanism, and that
multiple events for the same ExternalIPPool name can be merged in the
workqueue.

With this change, we try to ensure the same "observable" behavior for
the Egress controller, regardless of whether the DELETE and CREATE
events have been merged or not. The stale Egress IP should be removed,
and a new Egress IP should be allocated from the new IP ranges
(regardless of whether or not the initial Egress IP was specified by the
user).

Prior to this change, the change of IP address ranges would be silently
ignored by the Egress controller.

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes. area/transit/egress Issues or PRs related to Egress (SNAT for traffic egressing the cluster). kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants