[202405] sai.profile format issue #20466

Closed
yanmo96 opened this issue Oct 10, 2024 · 12 comments

@yanmo96

yanmo96 commented Oct 10, 2024

Description

Found an issue where syncd and pmon are not working properly. Caused by #18487, where sai.profile is not correctly formatted.
This leads to problems when new lines are appended to sai.profile, as in the following output, where SAI_DUMP_STORE_PATH is not on a new line:

root@str3-msn4700-03:/# cat /tmp/sai.profile
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x200g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps
SAI_DUMP_STORE_AMOUNT=10
SAI_ASYNC_ROUTING_ENABLED=1
DEVICE_MAC_ADDRESS=9c:05:91:b9:ea:00
SAI_WARM_BOOT_WRITE_FILE=/var/warmboot/
root@str3-msn4700-03:/#
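
A minimal sketch of how the merged line comes about, using a hypothetical scratch file rather than the real /tmp/sai.profile: the file is written without a final line feed, then one more entry is appended the way a shell script typically would.

```shell
# Scratch file for illustration only (assumption: not the real profile path).
profile="$(mktemp)"

# The last line deliberately has no terminating line feed.
printf 'SAI_INDEPENDENT_MODULE_MODE=1\nSAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1' > "$profile"

# An append starts writing exactly where the file ended.
echo 'SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps' >> "$profile"

# Line 2 now holds two key-value pairs.
merged_line="$(sed -n '2p' "$profile")"
printf '%s\n' "$merged_line"
rm -f "$profile"
```

The append itself is not at fault; it simply continues from the last byte of the file, which here is "1" instead of a line feed.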

Steps to reproduce the issue:

  1. cat sonic-buildimage\device\mellanox\x86_64-mlnx_msn4700-r0\Mellanox-SN4700-O8V48\sai.profile

Describe the results you received:

The output is missing a line feed at the end, hence the next shell prompt does not start on a new line.

admin@str3-msn4700-03:/usr/share/sonic/device/x86_64-mlnx_msn4700-r0/Mellanox-SN4700-O8V48$ cat sai.profile 
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x200g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1admin@str3-msn4700-03:/usr/share/sonic/device/x86_64-mlnx_msn4700-r0/Mellanox-SN4700-O8V48$

Or we can inspect the file with cat sai.profile | xxd, which shows that the final 0a byte is missing:

00000000: 5341 495f 494e 4954 5f43 4f4e 4649 475f  SAI_INIT_CONFIG_
00000010: 4649 4c45 3d2f 7573 722f 7368 6172 652f  FILE=/usr/share/
00000020: 736f 6e69 632f 6877 736b 752f 7361 695f  sonic/hwsku/sai_
00000030: 3437 3030 5f38 7834 3030 675f 3438 7832  4700_8x400g_48x2
00000040: 3030 672e 786d 6c0a 5341 495f 4445 4641  00g.xml.SAI_DEFA
00000050: 554c 545f 5357 4954 4348 494e 475f 4d4f  ULT_SWITCHING_MO
00000060: 4445 5f53 544f 5245 5f46 4f52 5741 5244  DE_STORE_FORWARD
00000070: 3d31 0a53 4149 5f49 4e44 4550 454e 4445  =1.SAI_INDEPENDE
00000080: 4e54 5f4d 4f44 554c 455f 4d4f 4445 3d31  NT_MODULE_MODE=1
00000090: 0a53 4149 5f4e 4f54 5f44 524f 505f 5349  .SAI_NOT_DROP_SI
000000a0: 505f 4449 505f 4c49 4e4b 5f4c 4f43 414c  P_DIP_LINK_LOCAL
000000b0: 3d31                                     =1
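
The same check can be scripted instead of read off a hex dump. This is a sketch only; ends_with_newline and the scratch file names are hypothetical helpers, not part of SONiC.

```shell
# Succeeds (exit status 0) when the last byte of the file is a line feed.
# Command substitution strips trailing newlines, so the captured text is
# empty only if that last byte was 0x0a (an empty file also passes).
ends_with_newline() {
    [ -z "$(tail -c 1 "$1")" ]
}

# Demo on two scratch files (hypothetical names).
demo_dir="$(mktemp -d)"
printf 'SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1'   > "$demo_dir/bad.profile"
printf 'SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1\n' > "$demo_dir/good.profile"

for f in "$demo_dir/bad.profile" "$demo_dir/good.profile"; do
    if ends_with_newline "$f"; then
        echo "${f##*/}: ends with a line feed"
    else
        echo "${f##*/}: missing final line feed"
    fi
done
rm -rf "$demo_dir"
```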

Describe the results you expected:

From a different SKU (the content is different, but the same idea applies to the last line):

admin@str3-msn4700-03:/usr/share/sonic/device/x86_64-mlnx_msn4700-r0/Mellanox-SN4700-O8C48$ cat sai.profile 
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x100g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1
admin@str3-msn4700-03:/usr/share/sonic/device/x86_64-mlnx_msn4700-r0/Mellanox-SN4700-O8C48$ 
00000000: 5341 495f 494e 4954 5f43 4f4e 4649 475f  SAI_INIT_CONFIG_
00000010: 4649 4c45 3d2f 7573 722f 7368 6172 652f  FILE=/usr/share/
00000020: 736f 6e69 632f 6877 736b 752f 7361 695f  sonic/hwsku/sai_
00000030: 3437 3030 5f38 7834 3030 675f 3438 7831  4700_8x400g_48x1
00000040: 3030 672e 786d 6c0a 5341 495f 4445 4641  00g.xml.SAI_DEFA
00000050: 554c 545f 5357 4954 4348 494e 475f 4d4f  ULT_SWITCHING_MO
00000060: 4445 5f53 544f 5245 5f46 4f52 5741 5244  DE_STORE_FORWARD
00000070: 3d31 0a53 4149 5f49 4e44 4550 454e 4445  =1.SAI_INDEPENDE
00000080: 4e54 5f4d 4f44 554c 455f 4d4f 4445 3d31  NT_MODULE_MODE=1
00000090: 0a53 4149 5f4e 4f54 5f44 524f 505f 5349  .SAI_NOT_DROP_SI
000000a0: 505f 4449 505f 4c49 4e4b 5f4c 4f43 414c  P_DIP_LINK_LOCAL
000000b0: 3d31 0a                                  =1.

Output of show version:

SONiC Software Version: SONiC.20240531.04
SONiC OS Version: 12
Distribution: Debian 12.6
Kernel: 6.1.0-11-2-amd64
Build commit: 82de211cf9
Build date: Wed Sep 25 03:32:30 UTC 2024
Built by: azureuser@dbfa1aa6c000001

Platform: x86_64-mlnx_msn4700-r0
HwSKU: Mellanox-SN4700-O8V48
ASIC: mellanox
ASIC Count: 1
Serial Number: MT2340XZ127B
Model Number: MSN4700-WS2ROS
Hardware Revision: A2
Uptime: 23:51:32 up  1:39,  2 users,  load average: 1.45, 1.05, 0.72
Date: Thu 10 Oct 2024 23:51:32

@bingwang-ms
Contributor

@dgsudharsan Could you please take a look?

@dgsudharsan
Collaborator

@bingwang-ms @yanmo96 This issue should be fixed with sonic-net/sonic-sairedis#1412. Can we please cherry-pick it to 202405?

@yanmo96
Author

yanmo96 commented Oct 11, 2024

@dgsudharsan sonic-net/sonic-sairedis#1412 only removes redundant newlines, but in this case we are missing an end-of-line indicator.

Below is the result after manually testing on the faulty file; the line feed at the end is still missing:

admin@str3-msn4700-03:~$ cat sai.profile 
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x200g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1admin@str3-msn4700-03:~$ 

And out of all the sai.profile files, sonic-buildimage\device\mellanox\x86_64-mlnx_msn4700-r0\Mellanox-SN4700-O8V48\sai.profile is the only one with this issue.
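
A claim like this can be checked by scanning a source tree for every sai.profile whose last byte is not a line feed. The sketch below is an assumption about how one might do that; find_unterminated_profiles and the scratch directory layout are hypothetical.

```shell
# Print each sai.profile under directory $1 that lacks a final line feed.
find_unterminated_profiles() {
    find "$1" -type f -name sai.profile | while IFS= read -r profile_path; do
        # Non-empty capture means the last byte is something other than 0x0a.
        if [ -n "$(tail -c 1 "$profile_path")" ]; then
            printf '%s\n' "$profile_path"
        fi
    done
}

# Demo on a scratch tree with one good and one faulty HWSKU directory.
scan_root="$(mktemp -d)"
mkdir -p "$scan_root/Mellanox-SN4700-O8C48" "$scan_root/Mellanox-SN4700-O8V48"
printf 'SAI_INDEPENDENT_MODULE_MODE=1\n' > "$scan_root/Mellanox-SN4700-O8C48/sai.profile"
printf 'SAI_INDEPENDENT_MODULE_MODE=1'   > "$scan_root/Mellanox-SN4700-O8V48/sai.profile"

unterminated="$(find_unterminated_profiles "$scan_root")"
printf '%s\n' "$unterminated"
```

Against a real checkout, the argument would be the device directory of sonic-buildimage instead of the scratch tree.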

@dgsudharsan
Collaborator

@yanmo96 I don't think a missing end of line is an issue unless we append more data, which would mix two KV pairs on the same line.
With the sonic-sairedis fix, are you seeing any functional issue such as a pmon or syncd restart?

@yanmo96
Author

yanmo96 commented Oct 14, 2024

@dgsudharsan https://github.com/sonic-net/sonic-sairedis/blob/d62ac0d57efbe3b1970dae697151adc335f3f702/syncd/scripts/syncd_init_common.sh#L225 this part appends more entries to sai.profile,
which results in the following (SAI_DUMP_STORE_PATH should be on a new line):

root@str3-msn4700-03:/# cat /tmp/sai.profile
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x200g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps
SAI_DUMP_STORE_AMOUNT=10
SAI_ASYNC_ROUTING_ENABLED=1
DEVICE_MAC_ADDRESS=9c:05:91:b9:ea:00
SAI_WARM_BOOT_WRITE_FILE=/var/warmboot/
root@str3-msn4700-03:/#

Below is the error message from syslog:
ERR pmon#xcvrd: Failed to read from file /tmp/sai.profile - ValueError('too many values to unpack (expected 2)')
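
The ValueError is consistent with a reader that splits each line on "=" and expects exactly two fields; that is an assumption about xcvrd, not something confirmed in this thread. Under that assumption, a quick way to spot the offending line is to flag any non-empty line that does not contain exactly one "=". The helper name below is hypothetical.

```shell
# Print "<line number>: <line>" for every non-empty line that does not
# split into exactly two fields on '='.
find_malformed_kv_lines() {
    awk -F'=' 'length($0) > 0 && NF != 2 { printf "%d: %s\n", NR, $0 }' "$1"
}

# Demo on a scratch copy of the merged content shown above.
kv_file="$(mktemp)"
printf '%s\n' \
    'SAI_INDEPENDENT_MODULE_MODE=1' \
    'SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps' \
    'SAI_DUMP_STORE_AMOUNT=10' > "$kv_file"

malformed="$(find_malformed_kv_lines "$kv_file")"
printf '%s\n' "$malformed"
rm -f "$kv_file"
```

A value that legitimately contains "=" would also be flagged, so this is a diagnostic aid rather than a validator.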

@yanmo96
Author

yanmo96 commented Oct 14, 2024

@dgsudharsan This causes show int status to report all links as down, and no transceiver information can be retrieved.

@bingwang-ms
Contributor

Hi @dgsudharsan, per the comment added by Yan, the line below mixes two KV items. The issue can be reproduced easily by installing a 202405 image with HWSKU Mellanox-SN4700-O8V48.

SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps

@dgsudharsan
Collaborator

@bingwang-ms Was it reproduced with the fix I mentioned? The fix for this issue is sonic-net/sonic-sairedis#1412.

@dgsudharsan
Collaborator

@dgsudharsan https://github.com/sonic-net/sonic-sairedis/blob/d62ac0d57efbe3b1970dae697151adc335f3f702/syncd/scripts/syncd_init_common.sh#L225 this part appends more entries to sai.profile, which results in the following (SAI_DUMP_STORE_PATH should be on a new line):

My question was: are you still seeing the issue with the fix sonic-net/sonic-sairedis#1412?

The fix should resolve the issue.

@dgsudharsan
Collaborator

@dgsudharsan sonic-net/sonic-sairedis#1412 only removes redundant newlines, but in this case we are missing an end-of-line indicator.

Below is the result after manually testing on the faulty file; the line feed at the end is still missing:

admin@str3-msn4700-03:~$ cat sai.profile 
SAI_INIT_CONFIG_FILE=/usr/share/sonic/hwsku/sai_4700_8x400g_48x200g.xml
SAI_DEFAULT_SWITCHING_MODE_STORE_FORWARD=1
SAI_INDEPENDENT_MODULE_MODE=1
SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1admin@str3-msn4700-03:~$ 

And out of all the sai.profile files, sonic-buildimage\device\mellanox\x86_64-mlnx_msn4700-r0\Mellanox-SN4700-O8V48\sai.profile is the only one with this issue.

Please check that this PR also adds a newline between the SAI common profile and the SAI-specific profile. I verified it in my setup.

Below is the line where the newline is introduced:

https://github.com/sonic-net/sonic-sairedis/blob/d62ac0d57efbe3b1970dae697151adc335f3f702/syncd/scripts/syncd_init_common.sh#L228

@yanmo96
Author

yanmo96 commented Oct 15, 2024

@dgsudharsan Got it! Thanks! I will give it a try.

@bingwang-ms
Contributor

bingwang-ms commented Oct 15, 2024

Hi @dgsudharsan, I think the file device/mellanox/x86_64-mlnx_msn4700-r0/Mellanox-SN4700-O8V48/sai.profile still needs to be fixed. It is not reliable enough to depend on PR #1412. There can be other code that appends new lines to sai.profile and triggers the same issue.
The file in the master branch has the same issue.
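
For code that appends to sai.profile, a defensive pattern is to terminate the existing last line before writing. The sketch below illustrates that idea only; append_kv is a hypothetical helper and is not what syncd_init_common.sh or PR #1412 actually does.

```shell
# Append one KEY=VALUE entry ($2) to file $1, first adding a line feed if
# the file is non-empty and its last byte is not already 0x0a.
append_kv() {
    append_target="$1"
    append_entry="$2"
    if [ -s "$append_target" ] && [ -n "$(tail -c 1 "$append_target")" ]; then
        printf '\n' >> "$append_target"
    fi
    printf '%s\n' "$append_entry" >> "$append_target"
}

# Demo: a scratch profile whose last line is unterminated.
scratch_profile="$(mktemp)"
printf 'SAI_NOT_DROP_SIP_DIP_LINK_LOCAL=1' > "$scratch_profile"
append_kv "$scratch_profile" 'SAI_DUMP_STORE_PATH=/var/log/mellanox/sdk-dumps'
append_kv "$scratch_profile" 'SAI_DUMP_STORE_AMOUNT=10'
cat "$scratch_profile"
```

Because the helper only adds a line feed when one is missing, calling it on a correctly terminated file does not introduce blank lines. Fixing the checked-in file remains necessary for any appender that does not take this precaution.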

qiluo-msft pushed a commit that referenced this issue Oct 16, 2024
### Why I did it
Fixing #20466. Adding newline at the end of sai.profile

### How I did it
Added missing newline.

#### How to verify it
Load an image with this SKU and check that everything works.
mssonicbld pushed a commit to mssonicbld/sonic-buildimage that referenced this issue Oct 16, 2024
mssonicbld pushed a commit that referenced this issue Oct 17, 2024