Avoid cableEngine cleanup when the gateway pod is restarted #2499

Merged on May 26, 2023 (1 commit)

Conversation

sridhargaddam
Member

@sridhargaddam sridhargaddam commented May 26, 2023

In the current code, when the submariner-gateway pod is restarted, there is a brief datapath disruption until the new pod comes online. The cleanup code in the submariner-gateway pod is not required, as it's handled by the route-agent code whenever the cluster's active Gateway changes.

This PR removes the cleanup code from the submariner-gateway pod, which fixes the issue with the VxLAN cable driver. The Libreswan cable driver seems to have some other issue(s) that need to be addressed separately.
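To illustrate the reasoning above, here is a minimal toy model in Go. It is only a sketch, not Submariner's actual code: the `Node` type and the `stopGatewayPod` and `onActiveGatewayChanged` functions are hypothetical names invented for this example. It assumes that cable state lives on the node and outlives the pod, so cleaning it up at pod termination leaves a gap until the replacement pod comes online, whereas cleaning up only on an active-gateway transition does not.

```go
package main

import "fmt"

// Node is a hypothetical model of the host-level state that outlives a
// gateway pod. It is not a Submariner type.
type Node struct {
	Name   string
	Cables map[string]bool // remote cluster ID -> cable present on this node
}

// stopGatewayPod models gateway pod termination (hypothetical helper).
// cleanupOnStop=true mimics the behavior before this PR, false the
// behavior after it.
func stopGatewayPod(n *Node, cleanupOnStop bool) {
	if cleanupOnStop {
		n.Cables = map[string]bool{}
	}
}

// onActiveGatewayChanged models the route agent's role (hypothetical
// helper): cables are removed only from a node that is no longer the
// active gateway.
func onActiveGatewayChanged(n *Node, activeGateway string) {
	if n.Name != activeGateway {
		n.Cables = map[string]bool{}
	}
}

func main() {
	for _, cleanup := range []bool{true, false} {
		n := &Node{Name: "gw-1", Cables: map[string]bool{"cluster-b": true}}
		stopGatewayPod(n, cleanup)
		fmt.Printf("cleanupOnStop=%v cables left during restart: %d\n",
			cleanup, len(n.Cables))
	}

	// A real gateway migration is still cleaned up, but by the route agent.
	old := &Node{Name: "gw-1", Cables: map[string]bool{"cluster-b": true}}
	onActiveGatewayChanged(old, "gw-2")
	fmt.Printf("cables left on old gateway after migration: %d\n", len(old.Cables))
}
```

In this model, a plain pod restart on the same node keeps the cables in place once the termination-time cleanup is skipped, while a migration to another node is still cleaned up. Whether a given cable driver's state really survives a pod restart depends on the driver, which fits the observation that Libreswan needs separate attention.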

Screenshot from 2023-05-26 17-01-59

Partially fixes: #2498
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>

In the current code, it was seen that when the submariner-gateway
pod gets restarted, there is a brief datapath disruption until the
new pod comes online. The cleanup code in submariner-gateway pod is
not required as its handled by the route-agent code when there is
any transition in the Gateway nodes.

Partially fixes: submariner-io#2498
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
@submariner-bot
Contributor

🤖 Created branch: z_pr2499/sridhargaddam/issue-2498
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

@sridhargaddam
Member Author

@tpantelis @skitt @aswinsuryan @yboaron I verified this PR a couple of times in a KIND deployment and captured my observations in the commit message. I haven't had a chance to verify it on an OCP deployment. Do you think we should validate this once (although the fix looks valid and is reasonable) even with OCP before merging? Any thoughts?

@skitt
Member

skitt commented May 26, 2023

> Do you think we should validate this once (although the fix looks valid and is reasonable) even with OCP before merging.

Given that the change was introduced for OVNK, I think we should verify it on OCP with OVNK before merging.

@skitt
Member

skitt commented May 26, 2023

Ah, I was basing my comment above only on the notification email, which didn’t include the nice table in the description. I see you’ve already tested this with OVNK on kind, so I don’t think testing on OCP is absolutely necessary.

@submariner-bot submariner-bot added the ready-to-test When a PR is ready for full E2E testing label May 26, 2023
@skitt skitt merged commit 8f50f29 into submariner-io:devel May 26, 2023
@submariner-bot
Contributor

🤖 Closed branches: [z_pr2499/sridhargaddam/issue-2498]

@sridhargaddam
Member Author

Backport PRs:
release-0.15: #2504
release-0.14: #2518
release-0.13: #2519

sridhargaddam added a commit to sridhargaddam/submariner that referenced this pull request Jun 2, 2023
skitt added a commit to sridhargaddam/submariner that referenced this pull request Jun 2, 2023
tpantelis added a commit to sridhargaddam/submariner that referenced this pull request Jun 2, 2023
@dfarrell07 dfarrell07 added the release-note-needed Should be mentioned in the release notes label Jun 5, 2023
sridhargaddam added a commit to sridhargaddam/submariner-website that referenced this pull request Jul 5, 2023
Includes the release notes for the following fixes.

* Submariner now uses case-insensitive comparison while parsing
  CNI names.
* subctl gather now collects Metrics proxy pod logs in a Globalnet
  deployment.
* Submariner Gateway pod now skips invoking cableEngine cleanup during
  termination, as this is handled by the Route agent during gateway migration.
* Fixed issue which caused the IPsec pluto process to crash when the remote
  endpoint was unstable.
* Submariner now handles out-of-order remote endpoint notifications properly
  in Globalnet component.

Related to: submariner-io/submariner#2486
Related to: submariner-io/subctl#770
Related to: submariner-io/submariner#2499
Related to: submariner-io/submariner#2517
Related to: submariner-io/submariner#2532
Signed-off-by: Sridhar Gaddam <sgaddam@redhat.com>
sridhargaddam added a commit to sridhargaddam/submariner-website that referenced this pull request Jul 6, 2023
skitt pushed a commit to skitt/submariner-website that referenced this pull request Jul 10, 2023
skitt pushed a commit to skitt/submariner-website that referenced this pull request Jul 10, 2023
skitt pushed a commit to submariner-io/submariner-website that referenced this pull request Jul 10, 2023
dfarrell07 pushed a commit to dfarrell07/submariner-website that referenced this pull request Oct 18, 2023
dfarrell07 pushed a commit to dfarrell07/submariner-website that referenced this pull request Oct 18, 2023
dfarrell07 pushed a commit to dfarrell07/submariner-website that referenced this pull request Oct 19, 2023
tpantelis pushed a commit to submariner-io/submariner-website that referenced this pull request Oct 22, 2023
tpantelis pushed a commit to tpantelis/submariner-website that referenced this pull request Nov 7, 2023
Labels
backport: This change requires a backport to eligible release branches
backport-handled
ready-to-test: When a PR is ready for full E2E testing
release-note-handled
release-note-needed: Should be mentioned in the release notes
Development

Successfully merging this pull request may close these issues.

Gateway pod restart is causing datapath disruption
6 participants