upgrade FRR to version 10.0.1, upgrade libyang2 to 2.1.148. #20269

Merged 2 commits into sonic-net:master from frr_10.0.1_upgrade on Dec 4, 2024

@sudhanshukumar22 sudhanshukumar22 commented Sep 16, 2024

[component/folder touched]: Description intent of your changes
Upgrades FRR in the community build from 8.5.4 to 10.0.1 and libyang to 2.1.148.
Tested with the BGP docker: BGP neighborship formation and route learning worked fine.

[List of changes]
Applied patches:
0001-Reduce-severity-of-Vty-connected-from-message.patch
0002-Allow-BGP-attr-NEXT_HOP-to-be-0.0.0.0-due-to-allevia.patch
0003-nexthops-compare-vrf-only-if-ip-type.patch
0004-frr-remove-frr-log-outchannel-to-var-log-frr.log.patch
0005-Add-support-of-bgp-l3vni-evpn.patch
0006-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch
0007-ignore-route-from-default-table.patch
0021-Disable-ipv6-src-address-test-in-pceplib.patch
0022-cross-compile-changes.patch
0028-zebra-fix-parse-attr-problems-for-encap.patch
0036-zebra-backpressure-Fix-Null-ptr-access-Coverity-Issu.patch
0038-zebra-Actually-display-I-O-buffer-sizes.patch
0039-zebra-Actually-display-I-O-buffer-sizes-part-2.patch
0041-bgpd-backpressure-Fix-to-avoid-CPU-hog.patch
0043-zebra-Use-the-ctx-queue-counters.patch
0044-zebra-Modify-dplane-loop-to-allow-backpressure-to-fi.patch
0045-zebra-Limit-queue-depth-in-dplane_fpm_nl.patch
0047-bgpd-backpressure-fix-evpn-route-sync-to-zebra.patch
0049-bgpd-backpressure-Improve-debuggability.patch
0051-bgpd-backpressure-fix-ret-value-evpn_route_select_in.patch
0052-bgpd-backpressure-log-error-for-evpn-when-route-inst.patch

Modified patches:
0008-Use-vrf_id-for-vrf-not-tabled_id.patch
0010-bgpd-Change-log-level-for-graceful-restart-events.patch
0025-bgp-community-memory-leak-fix.patch
0030-zebra-backpressure-Zebra-push-back-on-Buffer-Stream-.patch
0031-bgpd-backpressure-Add-a-typesafe-list-for-Zebra-Anno.patch
0033-bgpd-backpressure-cleanup-bgp_zebra_XX-func-args.patch
0034-gpd-backpressure-Handle-BGP-Zebra-Install-evt-Creat.patch
0035-bgpd-backpressure-Handle-BGP-Zebra-EPVN-Install-evt-.patch
0037-bgpd-Increase-install-uninstall-speed-of-evpn-vpn-vn.patch
0040-bgpd-backpressure-Fix-to-withdraw-evpn-type-5-routes.patch
0042-zebra-Use-built-in-data-structure-counter.patch
0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch
0048-bgpd-backpressure-fix-to-properly-remove-dest-for-bg.patch
0050-bgpd-backpressure-Avoid-use-after-free.patch

Deleted patches:
0009-bgpd-Ensure-suppress-fib-pending-works-with-network-.patch
0011-zebra-Static-routes-async-notification-do-not-need-t.patch
0012-zebra-Rename-vrf_lookup_by_tableid-to-zebra_vrf_look.patch
0013-zebra-Move-protodown_r_bit-to-a-better-spot.patch
0014-zebra-Remove-unused-dplane_intf_delete.patch
0015-zebra-Remove-unused-add-variable.patch
0016-zebra-Remove-duplicate-function-for-netlink-interfac.patch
0017-zebra-Add-code-to-get-set-interface-to-pass-up-from-.patch
0018-zebra-Use-zebra-dplane-for-RTM-link-and-addr.patch
0019-zebra-remove-duplicated-nexthops-when-sending-fpm-msg.patch
0020-zebra-Fix-non-notification-of-better-admin-won.patch
0023-zebra-The-dplane_fpm_nl-return-path-leaks-memory.patch
0024-lib-use-snmp-s-large-fd-sets-for-agentx.patch
0026-bgp-fib-suppress-announce-fix.patch
0027-lib-Do-not-convert-EVPN-prefixes-into-IPv4-IPv6-if-n.patch
0029-zebra-nhg-fix-on-intf-up.patch
0032-bgpd-fix-flushing-ipv6-flowspec-entries-when-peering.patch
0053-bgpd-Set-md5-TCP-socket-option-for-outgoing-connections-on-listener.patch

Tests performed: Tested BGP neighborship formation and route learning using BGP docker

Signed-off-by: sudhanshu.kumar@broadcom.com

linux-foundation-easycla bot commented Sep 16, 2024

CLA Signed

The committers listed above are authorized under a signed CLA.

@sudhanshukumar22 sudhanshukumar22 force-pushed the frr_10.0.1_upgrade branch 3 times, most recently from 8b65364 to 41c2e20 Compare September 18, 2024 07:08

sudhanshukumar22 commented Sep 18, 2024

All builds except the VS build succeed, which means the FRR patches applied cleanly there. The VS build, however, reports whitespace errors. Can we avoid whitespace errors in the VS build? Is the git version in the VS build perhaps an older one?

2024-09-18T13:04:08.3140753Z mv frr-pythontools_10.0.1-sonic-0_all.deb frr-dbgsym_10.0.1-sonic-0_amd64.deb frr-snmp_10.0.1-sonic-0_amd64.deb frr-snmp-dbgsym_10.0.1-sonic-0_amd64.deb frr_10.0.1-sonic-0_amd64.deb /sonic/target/debs/buster/
2024-09-18T13:04:08.3141278Z /sonic/src/sonic-frr/frr /sonic/src/sonic-frr
2024-09-18T13:04:08.3141639Z fatal: A branch named 'frr/10.0.1' already exists.
2024-09-18T13:04:08.3141882Z Switched to branch 'frr/10.0.1'
2024-09-18T13:04:08.3142088Z Your branch is up to date with 'origin/frr/10.0.1'.
2024-09-18T13:04:08.3142257Z :80: trailing whitespace.
2024-09-18T13:04:08.3142429Z void bgp_zebra_announce(struct bgp_dest *dest, struct bgp_path_info *info,
2024-09-18T13:04:08.3142606Z :106: space before tab in indent.
2024-09-18T13:04:08.3142737Z }
2024-09-18T13:04:08.3142863Z :211: trailing whitespace.
2024-09-18T13:04:08.3142987Z
2024-09-18T13:04:08.3143117Z warning: 3 lines add whitespace errors.
2024-09-18T13:04:08.3143262Z :499: trailing whitespace.
2024-09-18T13:04:08.3143387Z
2024-09-18T13:04:08.3143511Z warning: 1 line adds whitespace errors.
2024-09-18T13:04:08.3143659Z :61: space before tab in indent.
2024-09-18T13:04:08.3143946Z * for this VNI.
2024-09-18T13:04:08.3144084Z :62: space before tab in indent.
2024-09-18T13:04:08.3144210Z */
2024-09-18T13:04:08.3144341Z :93: space before tab in indent.
2024-09-18T13:04:08.3144464Z */
2024-09-18T13:04:08.3144614Z warning: 3 lines add whitespace errors.
2024-09-18T13:04:08.3144915Z gbp:info: Changelog last touched at '3fbd709d888ab94db178e44a5b9d67c3653e0b17'
2024-09-18T13:04:08.3145158Z gbp:info: Changelog committed for version 10.0.1-sonic-0
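
The two conditions the log reports (trailing whitespace, and a space before a tab in the indent) can be checked on a patch file before the build runs. The following is only a minimal sketch, assuming a POSIX shell with `grep` and `sort`; the helper name `find_ws_errors` is hypothetical and is not part of sonic-buildimage.

```shell
#!/bin/sh
# find_ws_errors PATCH  (hypothetical helper, for illustration only)
# Prints "LINENO:TEXT" for every added line in PATCH that has trailing
# whitespace or a space immediately before a tab in its indentation.
find_ws_errors() (
    patch="$1"
    tab=$(printf '\t')
    {
        # Added lines that end in a blank; skip the "+++ b/file" header.
        grep -nE '^\+.*[[:blank:]]$' "$patch" | grep -vE '^[0-9]+:\+\+\+ '
        # Added lines whose indentation has a space right before a tab.
        grep -nE "^\\+[[:blank:]]* ${tab}" "$patch"
    } | sort -t: -k1,1n -u
)
```

Running `find_ws_errors some.patch` prints nothing when no added line has either problem. Separately, `git apply` accepts `--whitespace=nowarn` or `--whitespace=fix`; whether changing how these patches are applied is acceptable for this build is a question for the maintainers.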

@sudhanshukumar22 sudhanshukumar22 force-pushed the frr_10.0.1_upgrade branch 4 times, most recently from f5d881c to 6471cba Compare September 19, 2024 21:11
@sudhanshukumar22
Contributor Author

/azp run Azure.sonic-buildimage

Commenter does not have sufficient privileges for PR 20269 in repo sonic-net/sonic-buildimage

@sudhanshukumar22 sudhanshukumar22 force-pushed the frr_10.0.1_upgrade branch 4 times, most recently from 941cc35 to 7103396 Compare October 3, 2024 17:32
@dgsudharsan dgsudharsan self-requested a review October 4, 2024 01:33
@sudhanshukumar22
Contributor Author

@cscarpitta, @liushilongbuaa, @kperumalbfn: can you check why the elastic tests are failing even though the builds succeed?

@sudhanshukumar22
Contributor Author

/azpw ms_conflict

@sudhanshukumar22
Contributor Author

/azpw ms_checker

@sudhanshukumar22
Contributor Author

@StormLiangMS Can you please help fix the ms_conflict?

Collaborator

@dgsudharsan dgsudharsan left a comment

@sudhanshukumar22 There is no FRR submodule update in this PR. Can you please update submodule?

@@ -67,7 +67,9 @@ RUN apt-get install -y net-tools \
 # For libkrb5-dev
 comerr-dev \
 libgssrpc4 \
-libkdb5-10
+libkdb5-10 \
+libprotobuf-c-dev \
Collaborator

Why are these changes required?

Contributor Author

After the upgrade, FRR 10.0.1 needs these new packages in order to compile.

@@ -2,10 +2,10 @@
 SHELL = /bin/bash
 .SHELLFLAGS += -e

-LIBYANG_URL = https://sonicstorage.blob.core.windows.net/debian/pool/main/liby/libyang
+LIBYANG_URL = https://deb.debian.org/debian/pool/main/liby/libyang2
Collaborator

Why is this URL change required? Should we rather update sonicstorage to cache libyang2?

Contributor Author

libyang2 is not currently present in the sonicstorage repository, so we had to change the path. I don't know how to add these libraries to sonicstorage.

I agree with Sudharsan: SONiC should not fetch libyang2 from an outside repository, because then you have no way to pin or control the version.

Contributor Author

Done


 DSC_FILE = libyang2_$(LIBYANG2_FULLVERSION).dsc
-ORIG_FILE = libyang2_$(LIBYANG2_VERSION).orig.tar.gz
+ORIG_FILE = libyang2_$(LIBYANG2_VERSION).orig.tar.xz
Collaborator

Is the file format change expected?

Contributor Author

Yes, the file present in repository is libyang2_2.1.148.orig.tar.xz

#The package libyang2.1.148 is taken from debian trixie, which only has dpkg-dev version 1.21.22
#The bullseye package has dpkg-dev version 1.20.13
#The VS package has dpkg-dev version 1.19.8
sed -i 's/dpkg-dev (>= 1.22.5)/dpkg-dev (>= 1.19.8)/' debian/control
Collaborator

I don't think this is the right way to do it: it changes the dependency for both bullseye and bookworm. Can we be specific here so that the change applies only to bullseye?

Contributor Author

@sudhanshukumar22 sudhanshukumar22 Oct 5, 2024

The current master branch is built for bookworm. However, I found during compilation that FRR is also compiled for bullseye and for the VS platform, so we need to handle all three.
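
The per-release approach the reviewer asks for could be sketched as follows. This is only an illustration, not what the PR does: the helper names are hypothetical, the version numbers are the ones quoted in the Makefile comment above, and the mapping of the VS build to `buster` is an assumption based on the `/sonic/target/debs/buster/` path in the build log.

```shell
#!/bin/sh
# dpkg_dev_floor CODENAME  (hypothetical helper)
# Echoes the dpkg-dev version to require on CODENAME, or nothing when
# debian/control should be left as shipped.
dpkg_dev_floor() {
    case "$1" in
        buster)   echo "1.19.8" ;;   # assumed to be the VS build
        bullseye) echo "1.20.13" ;;
        *)        echo "" ;;         # assumption: no override needed
    esac
}

# relax_dpkg_dev CODENAME CONTROL_FILE  (hypothetical helper)
# Rewrites the dpkg-dev build dependency only for releases with a floor.
relax_dpkg_dev() (
    floor=$(dpkg_dev_floor "$1")
    [ -n "$floor" ] || exit 0
    sed -i "s/dpkg-dev (>= 1\\.22\\.5)/dpkg-dev (>= ${floor})/" "$2"
)
```

Any other release whose dpkg-dev is older than 1.22.5 would need its own entry in `dpkg_dev_floor`; the default branch leaves the control file untouched.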

@@ -6,30 +6,12 @@
0006-Link-local-scope-was-not-set-while-binding-socket-for-bgp-ipv6-link-local-neighbors.patch
Collaborator

Can we make sure we change the patch number to align with the removed patches?

Contributor Author

I wanted to keep the old patches (which are no longer applicable) so that upgrading from the earlier FRR would be easy.

Contributor Author

As we have discussed in the meeting, we have deleted all the patches which are not needed now. Also, the ordering stays as is.

@@ -27,12 +34,12 @@ index b1f8f19594..bb3cd62950 100644
- zebra_announce_del(&bm->zebra_announce_head, dest);
}
}

bgp_evpn_remote_ip_hash_destroy(vpn);
Collaborator

Why is this change introduced?

Contributor Author

The patch no longer applied in its existing form, so I applied it manually and regenerated the patch file from the result.
Note that bgp_evpn_remote_ip_hash_destroy(vpn); is existing code; I did not add it. The line appears as context that git diff generated when the patch was recreated.

bgp_evpn_remote_ip_hash_destroy(vpn);
bgp_evpn_vni_es_cleanup(vpn);
Collaborator

Can you please clarify why is this change introduced?

Contributor Author

The patch no longer applied in its existing form, so I applied it manually and regenerated the patch file from the result.
Note that bgp_evpn_vni_es_cleanup(vpn); is existing code; I did not add it. The line appears as context that git diff generated when the patch was recreated.

Contributor Author

Also, note that line 34 contains a space. The new patch-apply step enforces a strict whitespace check.

Please do not add code that was not in the original patch. You are asking for trouble.

Contributor Author

I haven't added this code. It is auto-generated context that git format-patch adds when it forms the hunk.

Collaborator

Makes sense. The git diff is a little confusing since it doesn't differentiate between the + in the patch and the + in git diff.
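
The confusion comes from the two marker columns in a diff of a `.patch` file: the first character belongs to the outer diff (this PR) and the second to the inner diff (the FRR patch itself). A small sketch of that reading, assuming unified-diff markers; the helper name is hypothetical and the classification ignores file and hunk headers:

```shell
#!/bin/sh
# classify_patch_diff_line LINE  (hypothetical helper)
# LINE is one line from a diff of a .patch file. Column 1 is the outer
# diff marker; column 2 is the marker of the patch being edited.
classify_patch_diff_line() {
    case "$1" in
        "++"*) echo "patch now adds this source line" ;;
        "+-"*) echo "patch now removes this source line" ;;
        "+ "*) echo "patch gained a context line; source unchanged" ;;
        "+"*)  echo "other line added to the patch file" ;;
        "-"*)  echo "line dropped from the patch file" ;;
        *)     echo "patch line unchanged" ;;
    esac
}
```

Under this reading, a line shown as `+ bgp_evpn_remote_ip_hash_destroy(vpn);` is new context in the patch file, not new FRR code, which matches the author's explanation.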

@@ -630,6 +630,7 @@ static inline struct nexthop_group *rib_get_fib_backup_nhg(
}

@@ -628,6 +628,7 @@ extern int rib_add_gr_run(afi_t afi, vrf_id_t vrf_id, uint8_t proto,
Collaborator

Do we need it as part of this patch?

Contributor Author

@sudhanshukumar22 sudhanshukumar22 Oct 8, 2024

This line is autogenerated; I did not modify this file in the patch. This is an existing patch that failed to apply cleanly because the underlying files changed, so I applied it manually and regenerated the patch. The surrounding functions have probably moved in the newer files, which is why the hunk now starts at line 628.

Old patch file (0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch)
diff --git a/zebra/rib.h b/zebra/rib.h
index 2e62148ea0..b78cd218f6 100644
--- a/zebra/rib.h
+++ b/zebra/rib.h
@@ -630,6 +630,7 @@ static inline struct nexthop_group *rib_get_fib_backup_nhg(
}

extern void zebra_vty_init(void);
+extern uint32_t zebra_rib_dplane_results_count(void);

extern pid_t pid;

New patch file(0046-zebra-Modify-show-zebra-dplane-providers-to-give-mor.patch)
diff --git a/zebra/rib.h b/zebra/rib.h
index a721f4bac..15f877b66 100644
--- a/zebra/rib.h
+++ b/zebra/rib.h
@@ -628,6 +628,7 @@ extern int rib_add_gr_run(afi_t afi, vrf_id_t vrf_id, uint8_t proto,
uint8_t instance);

extern void zebra_vty_init(void);
+extern uint32_t zebra_rib_dplane_results_count(void);

extern pid_t pid;
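
The old and new hunks above differ only in the `index` line, the hunk position and the context; the added line is identical. A minimal sketch of checking that mechanically, assuming unified-diff patches and a POSIX shell; both helper names are hypothetical:

```shell
#!/bin/sh
# patch_changes PATCH  (hypothetical helper)
# Prints only the added and removed source lines of PATCH, dropping the
# "---"/"+++" file headers, "index" lines, hunk headers and context.
patch_changes() (
    grep -E '^[+-]' "$1" | grep -vE '^(\+\+\+|---) '
)

# same_change OLD NEW  (hypothetical helper)
# Succeeds when both patches add and remove exactly the same lines.
same_change() {
    [ "$(patch_changes "$1")" = "$(patch_changes "$2")" ]
}
```

`same_change old.patch new.patch` succeeds when a regenerated patch makes the same source change as the original. As a caveat of this sketch, a removed source line that itself begins with `-- ` would be mistaken for a file header. `git patch-id` is an existing tool for a similar comparison that also ignores line numbers.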

From: Rajasekar Raja <rajasekarr@nvidia.com>
Date: Thu, 15 Feb 2024 11:23:51 -0800
Subject: [PATCH 07/11] bgpd : backpressure - Handle BGP-Zebra(EPVN) Install
From 2552ac0c492cdec01e36b48b63c057c6ad162701 Mon Sep 17 00:00:00 2001
Collaborator

@raja-rajasekar @donaldsharp can you please review this change?

LGTM

From: Donald Sharp <sharpd@nvidia.com>
Date: Thu, 25 Jan 2024 13:07:37 -0500
Subject: [PATCH 05/11] bgpd: backpressure - cleanup bgp_zebra_XX func args
From 879558ccd9b0f4f43c708f43a3e0fcf38bebeab7 Mon Sep 17 00:00:00 2001
Collaborator

@raja-rajasekar @donaldsharp can you please review this change?

LGTM


sudhanshukumar22 commented Oct 5, 2024

@sudhanshukumar22 There is no FRR submodule update in this PR. Can you please update submodule?

I have updated the submodule here.

@sudhanshukumar22
Contributor Author

/azpw ms_conflict

@liushilongbuaa
Contributor

/azpw ms_conflict

@sudhanshukumar22
Contributor Author

Hi all, all the changes discussed in the review meeting are done. Please approve the request: @donaldsharp, @dgsudharsan, @cscarpitta, @raja-rajasekar, @gord1306, @lguohan, @qiluo-msft, @xumia

@dgsudharsan
Collaborator

@sudhanshukumar22 As general feedback, please don't rebase and force-push changes during review: it makes the changes hard to review because the diff between commits is no longer visible.
All commits are squashed when the PR is merged, so force-pushing while the PR is in review gains nothing and only makes review harder.

@yuezhoujk

@sudhanshukumar22 Prepare to fork the 202411 branch. Will this PR be merged soon?
cc @zhangyanzhao

@sudhanshukumar22
Contributor Author

@sudhanshukumar22 Prepare to fork the 202411 branch. Will this PR be merged soon? cc @zhangyanzhao @yuezhoujk

This PR is already approved and waiting to be merged.

@sudhanshukumar22
Contributor Author

@sudhanshukumar22 A general feedback is not to rebase and force push changes as it makes it hard to review changes. The diff between commits is not visible. During merge of the PR, all commits will be squashed. Hence force pushing while the PR is in review makes no sense and it makes review harder.
@dgsudharsan We can see the changes in the Conversation tab itself, since every push creates a new Compare link there.

@donaldsharp

LGTM

@kperumalbfn
Contributor

@donaldsharp could you approve this PR and we could merge it.

@dgsudharsan
Collaborator

@donaldsharp could you approve this PR and we could merge it.

@kperumalbfn You can see in the comment above yours that Donald mentioned LGTM. From our side PR looks fine. Please go ahead and merge if other reviewers have no concerns.

@kperumalbfn
Contributor

@StormLiangMS could you check and approve this FRR change.

@kperumalbfn kperumalbfn requested review from StormLiangMS and removed request for gord1306, donaldsharp, cscarpitta and raja-rajasekar December 3, 2024 20:06
@lguohan lguohan merged commit f9e186c into sonic-net:master Dec 4, 2024
22 checks passed
saiarcot895 added a commit to saiarcot895/sonic-swss that referenced this pull request Dec 6, 2024
FRR 10.0.1 upgrade (sonic-net/sonic-buildimage#20269) brought in a mgmtd
daemon for FRR. This needs to be started up in docker-sonic-vs.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
saiarcot895 added a commit to saiarcot895/sonic-buildimage that referenced this pull request Dec 6, 2024
FRR 10.0.1 upgrade (sonic-net#20269) brought in a mgmtd
daemon for FRR. This needs to be started up in docker-sonic-vs as part
of the other daemons in this container.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
lguohan pushed a commit that referenced this pull request Dec 11, 2024
FRR 10.0.1 upgrade (#20269) brought in a mgmtd daemon for FRR. This needs to be started up in docker-sonic-vs as part of the other daemons in this container.

Additionally, Debian Bookworm provides version 2.5.0 of scapy, but the pip3 command later in the file downgraded it to 2.4.5, which does not work in Bookworm. Fix this by removing the pip3 installation for scapy, and updating the other packages installed via pip3.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this pull request Dec 18, 2024
FRR 10.0.1 upgrade (sonic-net#20269) brought in a mgmtd daemon for FRR. This needs to be started up in docker-sonic-vs as part of the other daemons in this container.

Additionally, Debian Bookworm provides version 2.5.0 of scapy, but the pip3 command later in the file downgraded it to 2.4.5, which does not work in Bookworm. Fix this by removing the pip3 installation for scapy, and updating the other packages installed via pip3.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>
mssonicbld pushed a commit that referenced this pull request Dec 19, 2024
FRR 10.0.1 upgrade (#20269) brought in a mgmtd daemon for FRR. This needs to be started up in docker-sonic-vs as part of the other daemons in this container.

Additionally, Debian Bookworm provides version 2.5.0 of scapy, but the pip3 command later in the file downgraded it to 2.4.5, which does not work in Bookworm. Fix this by removing the pip3 installation for scapy, and updating the other packages installed via pip3.

Signed-off-by: Saikrishna Arcot <sarcot@microsoft.com>