[multiasic][supervisor] sonic-db-cli crashes at boot up when execute sonic-db-cli PING command in database.sh on multiasic platform #12047
Comments
@qiluo-msft, can you please check the sonic-db-cli behavior change and see how to fix it? It looks like a scalability issue. Thanks.
@SuvarnaMeenakshi, would you please check whether the multi-asic VS tests would catch this? Thanks.
@abdosi, this is the same issue we are observing on the 202205-based image.
Because this error is seen during boot-up, the multi-asic VS test suite we have today in the PR checker cannot flag it. This specific issue is seen only on the supervisor; it is not seen on multi-asic VS or on a multi-asic LC.
I created the following PR to fix this issue. According to the database.sh code, the script waits until the database is ready by checking the sonic-db-cli return value; when the database is not ready, sonic-db-cli should return 1: https://github.com/sonic-net/sonic-buildimage/blob/master/files/build_templates/docker_image_ctl.j2
However, because of a code regression in sonic-db-cli, sonic-db-cli crashes instead of returning 1.
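For readers following along, here is a minimal sketch of the wait-until-ready pattern described above. It is an illustration in Python, not a copy of database.sh: the function name `wait_until_ready` and its parameters are hypothetical, and the real script may test the command differently. It also shows why a clean exit status of 1 and a crash are different events, even though both keep the loop going.

```python
import subprocess
import time


def wait_until_ready(command, interval=1.0, max_attempts=None):
    """Re-run `command` until it exits with status 0.

    Returns the number of attempts made. Raises TimeoutError if
    `max_attempts` is given and is reached without a successful run.
    (Hypothetical helper, for illustration only.)
    """
    attempts = 0
    while True:
        attempts += 1
        result = subprocess.run(
            command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
        if result.returncode == 0:
            return attempts
        if result.returncode < 0:
            # On POSIX, subprocess reports a child killed by a signal as a
            # negative return code. This is the "crash" case, which may
            # also leave a core file behind; a clean "not ready" is 1.
            print(f"attempt {attempts}: command crashed (signal {-result.returncode})")
        if max_attempts is not None and attempts >= max_attempts:
            raise TimeoutError(f"not ready after {attempts} attempts")
        time.sleep(interval)
```

On a device this would be called as `wait_until_ready(["sonic-db-cli", "PING"])`. The loop itself keeps retrying in both failure cases; the difference is that the crash path produces error logs and possibly core files on every retry.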
…eady issue. (#701)
#### Why I did it
Fix sonic-db-cli PING/SAVE/FLUSHALL command crash when database config file not ready issue: sonic-net/sonic-buildimage#12047
#### How I did it
When run PING/SAVE/FLUSHALL command, catch database initialize failed exception and return 1.
#### How to verify it
Pass all existing UT and E2E test. Add new UT to cover changed code.
Manually test, sonic-db-cli will return 1 when run PING command and can't find config file:
azureuser@a7f66d2b794c:/sonic/src/sonic-swss-common$ ./sonic-db-cli/sonic-db-cli PING
An exception of type Sonic database config file doesn't exist at /var/run/redis/sonic-db/database_config.json occurred. Arguments: /sonic/src/sonic-swss-common/sonic-db-cli/.libs/sonic-db-cli PING
azureuser@a7f66d2b794c:/sonic/src/sonic-swss-common$ echo $?
1
#### Which release branch to backport (provide reason below if selected)
- [ ] 201811
- [ ] 201911
- [ ] 202006
- [ ] 202012
- [ ] 202106
- [x] 202111
- [x] 202205
#### Description for the changelog
Fix sonic-db-cli PING/SAVE/FLUSHALL command crash when database config file not ready issue.
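The approach described in that PR (catch the initialization failure and return 1) can be sketched roughly as follows. This is a Python illustration of the pattern only, not the actual C++ change: `ConfigNotReadyError`, `load_database_config` and `run_command` are hypothetical names, and the sketch does not talk to Redis at all.

```python
import sys


class ConfigNotReadyError(RuntimeError):
    """Hypothetical stand-in for the database-initialization failure."""


def load_database_config(path):
    # Hypothetical loader: fails when the config file is not present yet.
    try:
        with open(path) as config_file:
            return config_file.read()
    except FileNotFoundError as exc:
        raise ConfigNotReadyError(
            f"Sonic database config file doesn't exist at {path}"
        ) from exc


def run_command(command, config_path):
    """Return an exit status instead of letting the exception escape."""
    if command in ("PING", "SAVE", "FLUSHALL"):
        try:
            load_database_config(config_path)
        except ConfigNotReadyError as exc:
            # Report the problem and signal "not ready" with status 1,
            # so that a caller polling the exit status can simply retry.
            print(f"An exception of type {exc} occurred.", file=sys.stderr)
            return 1
    # Simplification: the real tool would now send the command to Redis.
    return 0
```

The point of the design is that an uncaught exception would abort the process, while a plain return value of 1 lets the boot script treat "config not ready" as an ordinary, retryable condition.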
A fix is available; please confirm whether this can be closed, @mlok-nokia.
I checked the changes in the 202205 branch. They do not fix all of the issues. Although the change avoids the crash and allows the database to load the configuration file, core files are still generated.
admin@supervisor:~$ ls /var/core -al
…ic platform after the c++ implementation of sonic-db-cli (#13207)
Fixes #12047. After the C++ implementation of sonic-db-cli, the sonic-db-cli PING command tries to initialize the global database for all instances while the database is starting. If the database_config.json files of all instances are not ready yet, it crashes and generates a core file. PR sonic-net/sonic-swss-common#701 only fixes the crash and the process abortion.
Signed-off-by: mlok <marty.lok@nokia.com>
@mlok-nokia, since PR #13207 has been merged, could you please confirm that we can close this issue and #13740?
Description
On the supervisor card, sonic-db-cli crashes when database.sh executes the sonic-db-cli PING command. In the new implementation of sonic-db-cli, the PING command calls initializeGlobalConfig(), which checks the redis#/sonic-db/database_config.json files of all ASICs. These files are not ready yet, which causes the crash and the following error log. This call was used to wait for all databases to be ready. If sonic-db-cli tries to access the redis#/sonic-db/database_config.json files at this point, it will fail.
There are 16 ASICs on this supervisor card. This issue is similar to issue #10105. If the sonic-db-cli behavior has changed, we may need to change waitForAllInstanceDatabaseConfigJsonFilesReady.
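To make the readiness condition concrete, here is a minimal sketch of a check that every per-ASIC config file exists before anything tries to load them. It is written in Python for illustration and is not the code of waitForAllInstanceDatabaseConfigJsonFilesReady. The directory layout `<base_dir>/redis<N>/sonic-db/database_config.json` with ASIC indices starting at 0 is an assumption based on the `redis#` pattern above, and both function names are hypothetical.

```python
import os
import time


def missing_config_files(base_dir, num_asics):
    """Return the per-ASIC database_config.json paths that do not exist yet.

    Assumes one directory per ASIC named redis0, redis1, ... under base_dir.
    """
    paths = [
        os.path.join(base_dir, f"redis{asic}", "sonic-db", "database_config.json")
        for asic in range(num_asics)
    ]
    return [path for path in paths if not os.path.isfile(path)]


def wait_for_all_config_files(base_dir, num_asics, interval=1.0, timeout=60.0):
    """Poll until every config file exists; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        if not missing_config_files(base_dir, num_asics):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With 16 ASICs, a call such as `wait_for_all_config_files("/var/run", 16)` would return only once all 16 files are present (the `/var/run` base path is again an assumption). Checking for the files directly avoids starting a process that aborts when one of them is missing.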
Steps to reproduce the issue:
Describe the results you received:
There are core files and the following error logs:
Describe the results you expected:
There should be no core files and no error logs from sonic-db-cli.
Output of show version:
Output of show techsupport:
Additional information you deem important (e.g. issue happens only occasionally):