sxCoreAsync continuously crash on Mellanox 2700 platform #7915

Closed
yxieca opened this issue Jun 18, 2021 · 0 comments · Fixed by #7913

yxieca commented Jun 18, 2021

Description

Starting from public Mellanox cache build 857, sxCoreAsync crashes continuously on the Mellanox 2700 platform.

The issue was triggered by sonic-net/sonic-swss#1749.

Jun 17 23:16:51.516475 str2-msn2700-spy-1 NOTICE syncd#SDK: :- threadFunction: span < 0 = -6 at 244890323
Jun 17 23:16:51.516475 str2-msn2700-spy-1 NOTICE syncd#SDK: :- threadFunction: new span = 140
Jun 17 23:16:51.671873 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.
Jun 17 23:16:51.679274 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.
Jun 17 23:16:51.679344 str2-msn2700-spy-1 ERR syncd#SDK: [SX_API_BULK_COUNTER.ERR] Invalid value of port counter-groups bitmap [0x0]
Jun 17 23:16:51.679646 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[5448]- mlnx_port_stats_allocate_sx_bulk_buffer: Failed to create buffer: Parameter Error.
Jun 17 23:16:51.679864 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6434]- mlnx_get_port_stats_ext: Failed to allocate SDK buffer.
Jun 17 23:16:51.682573 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.
Jun 17 23:16:51.683873 str2-msn2700-spy-1 ERR syncd#SDK: message repeated 25 times: [ [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.]
Jun 17 23:16:51.684010 str2-msn2700-spy-1 ERR syncd#SDK: [SX_API_BULK_COUNTER.ERR] Invalid value of port counter-groups bitmap [0x0]
Jun 17 23:16:51.684203 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[5448]- mlnx_port_stats_allocate_sx_bulk_buffer: Failed to create buffer: Parameter Error.
Jun 17 23:16:51.684400 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6434]- mlnx_get_port_stats_ext: Failed to allocate SDK buffer.
Jun 17 23:16:51.684603 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.
Jun 17 23:16:51.684804 str2-msn2700-spy-1 ERR syncd#SDK: [SX_API_BULK_COUNTER.ERR] Invalid value of port counter-groups bitmap [0x0]
Jun 17 23:16:51.684995 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[5448]- mlnx_port_stats_allocate_sx_bulk_buffer: Failed to create buffer: Parameter Error.
Jun 17 23:16:51.685208 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6434]- mlnx_get_port_stats_ext: Failed to allocate SDK buffer.
Jun 17 23:16:51.688662 str2-msn2700-spy-1 ERR syncd#SDK: [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.
Jun 17 23:16:51.761922 str2-msn2700-spy-1 ERR syncd#SDK: message repeated 43 times: [ [SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to convert counter IDs to counter types bitmap.]
Jun 17 23:16:51.762151 str2-msn2700-spy-1 ERR syncd#SDK: :- addCounter: Object type and field combination is not supported, object type SAI_OBJECT_TYPE_NULL, field PORT_COUNTER_ID_LIST
packet_write_wait: Connection to 10.3.146.66 port 22: Broken pipe
… str2-msn2700-spy-1 INFO kernel: [ 245.068240] traps: sxCoreAsync[5994] trap stack segment ip:7f189c507dc8 sp:7f1893a23370 error:0 in libsxbulk_counter.so.1.0.0[7f
… str2-msn2700-spy-1 ERR syncd#SDK: message repeated 4 times: [ :- addCounter: Object type and field combination is not supported, object type SAI_OBJECT_TYPE_NULL, field PORT_COUNTER_ID_LIST]
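
To get an overview of a log like the one above, the `SAI_PORT.ERR` lines can be grouped by the function and source line that emitted them. The following is a minimal sketch, assuming the syslog line format shown in this report. The helper name `tally_sdk_errors` and both regular expressions are illustrative and not part of SONiC or the Mellanox SDK.

```python
import re
from collections import Counter

# Assumption: SAI port errors look like the lines quoted in this report, e.g.
# "[SAI_PORT.ERR] mlnx_sai_port.c[6410]- mlnx_get_port_stats_ext: Failed to ..."
SAI_ERR = re.compile(r"\[SAI_PORT\.ERR\] \S+?\[(\d+)\]- (\w+):")

# syslog compresses consecutive duplicates into "message repeated N times: [ ... ]".
REPEATED = re.compile(r"message repeated (\d+) times: \[ ")


def tally_sdk_errors(lines):
    """Count SAI_PORT errors per (function, source line).

    A "message repeated N times" line is counted as N occurrences.
    """
    counts = Counter()
    for line in lines:
        weight = 1
        rep = REPEATED.search(line)
        if rep:
            weight = int(rep.group(1))
        err = SAI_ERR.search(line)
        if err:
            src_line, func = err.groups()
            counts[(func, int(src_line))] += weight
    return counts
```

Passing an open log file (for example `tally_sdk_errors(open("/var/log/syslog"))`, assuming that is where the log lives) returns a `Counter` whose `most_common()` shows which call site dominates.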

Steps to reproduce the issue:

  1. Boot image https://sonic-jenkins.westus2.cloudapp.azure.com/job/mellanox/job/buildimage-mlnx-cache/857/ or a later build.
  2. Wait about 5 minutes. The crashes then start.
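
One way to confirm the reproduction is to scan the syslog for kernel `traps:` lines that name `sxCoreAsync`. This is a rough sketch, assuming the kernel message format quoted in the log in this report; the function `find_traps` and the pattern `TRAP` are hypothetical helpers, not an existing SONiC tool.

```python
import re

# Assumption: kernel trap lines look like the one in this report:
# "kernel: [ 245.068240] traps: sxCoreAsync[5994] trap stack segment
#  ip:7f189c507dc8 sp:7f1893a23370 error:0 in libsxbulk_counter.so.1.0.0[..."
TRAP = re.compile(
    r"traps: (?P<proc>[^\[\s]+)\[(?P<pid>\d+)\] trap (?P<kind>.+?) "
    r"ip:(?P<ip>[0-9a-f]+) sp:(?P<sp>[0-9a-f]+) error:(?P<error>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+))?"
)


def find_traps(lines, process="sxCoreAsync"):
    """Return one dict per kernel trap line reported for `process`."""
    hits = []
    for line in lines:
        match = TRAP.search(line)
        if match and match.group("proc") == process:
            hits.append(match.groupdict())
    return hits
```

Each returned dict carries the PID, trap kind, instruction pointer, and, when the kernel printed it, the shared object in which the trap occurred.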

Describe the results you received:

Describe the results you expected:

Output of show version:

(paste your output here)

Output of show techsupport:

The techsupport dump is too large to upload here. It has already been shared with the NVIDIA team.

Additional information you deem important (e.g. issue happens only occasionally):
