[RESTAPI] autorestart for restapi process inside the docker is disabled #5

Closed
wants to merge 2 commits into from

@vivekrnv vivekrnv commented Jun 29, 2021

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>

Why I did it

The critical process in the restapi docker restarts immediately after being killed, even though FEATURE|restapi|autorestart is set to False.

Critical processes in other dockers follow the same approach.

How I did it

How to verify it

admin@sonic:~$ sudo config feature autorestart restapi disabled 
admin@sonic:~$ docker exec restapi supervisorctl status
dependent-startup                EXITED    Jun 29 12:47 AM
restapi                          RUNNING   pid 21, uptime 0:02:55
rsyslogd                         RUNNING   pid 13, uptime 0:02:57
start                            EXITED    Jun 29 12:47 AM
supervisor-proc-exit-listener    RUNNING   pid 8, uptime 0:02:59

# Kill the critical process
admin@sonic:~$ docker exec restapi kill -SIGKILL 21 

admin@sonic:~$ show logging -f | grep restapi
Jun 29 00:51:04.293305 r-lionfish-16 NOTICE restapi#root: message repeated 2 times: [ Waiting for certificates...]
Jun 29 00:51:34.633684 r-lionfish-16 INFO restapi#supervisord 2021-06-29 00:51:34,633 INFO exited: restapi (terminated by SIGKILL; not expected)
Jun 29 00:52:04.295672 r-lionfish-16 INFO restapi#supervisord 2021-06-29 00:52:04,294 INFO reaped unknown pid 67 (exit status 0)
Jun 29 00:52:34.704237 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: Process 'restapi' is not running in namespace 'host' (1.0 minutes).
Jun 29 00:53:34.769837 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: Process 'restapi' is not running in namespace 'host' (2.0 minutes).
Jun 29 00:54:34.833258 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: Process 'restapi' is not running in namespace 'host' (3.0 minutes).
Jun 29 00:55:34.895249 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: Process 'restapi' is not running in namespace 'host' (4.0 minutes).

After the change, the killed critical process was not restarted, so the update made in the FEATURE table is honored.
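
The check above can also be done mechanically. The following is a minimal sketch, not part of this PR: it parses the `supervisorctl status` listing and the exit-listener log lines shown above. The function names (`parse_status`, `down_minutes`, `stayed_down`) are hypothetical, and treating a strictly increasing downtime as "not restarted" is an assumption about how the exit listener reports, which the source does not state.

```python
import re

# Sample data copied from the session above.
STATUS_OUTPUT = """\
dependent-startup                EXITED    Jun 29 12:47 AM
restapi                          RUNNING   pid 21, uptime 0:02:55
rsyslogd                         RUNNING   pid 13, uptime 0:02:57
start                            EXITED    Jun 29 12:47 AM
supervisor-proc-exit-listener    RUNNING   pid 8, uptime 0:02:59
"""

LOG_LINES = [
    "Jun 29 00:52:34.704237 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: "
    "Process 'restapi' is not running in namespace 'host' (1.0 minutes).",
    "Jun 29 00:53:34.769837 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: "
    "Process 'restapi' is not running in namespace 'host' (2.0 minutes).",
    "Jun 29 00:54:34.833258 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: "
    "Process 'restapi' is not running in namespace 'host' (3.0 minutes).",
    "Jun 29 00:55:34.895249 r-lionfish-16 ERR restapi#supervisor-proc-exit-listener: "
    "Process 'restapi' is not running in namespace 'host' (4.0 minutes).",
]

# Matches the exit-listener message format seen in the log above.
NOT_RUNNING = re.compile(
    r"Process '(?P<name>[^']+)' is not running in namespace "
    r"'(?P<ns>[^']+)' \((?P<minutes>[\d.]+) minutes\)"
)


def parse_status(output):
    """Map each program name to (state, pid), with pid None when not shown."""
    result = {}
    for line in output.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        name, state = fields[0], fields[1]
        detail = fields[2] if len(fields) > 2 else ""
        match = re.match(r"pid (\d+),", detail)
        result[name] = (state, int(match.group(1)) if match else None)
    return result


def down_minutes(lines, process):
    """Return the downtime values reported for `process`, in log order."""
    values = []
    for line in lines:
        match = NOT_RUNNING.search(line)
        if match and match.group("name") == process:
            values.append(float(match.group("minutes")))
    return values


def stayed_down(lines, process):
    """Heuristic (assumption): reports exist and the downtime only grows."""
    minutes = down_minutes(lines, process)
    return bool(minutes) and all(a < b for a, b in zip(minutes, minutes[1:]))


if __name__ == "__main__":
    print(parse_status(STATUS_OUTPUT)["restapi"])
    print(down_minutes(LOG_LINES, "restapi"))
    print(stayed_down(LOG_LINES, "restapi"))
```

In practice the two inputs would come from `docker exec restapi supervisorctl status` and `show logging`, as in the session above; the pid found by `parse_status` is the one to pass to `kill`.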

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012

Description for the changelog

A picture of a cute animal (not mandatory but encouraged)

@vivekrnv vivekrnv closed this Jun 29, 2021
vivekrnv pushed a commit that referenced this pull request Nov 11, 2021
Also add an out-of-tree pca9548 mux driver that uses platform data to map the i2c bus to the front panel port.

Signed-off-by: Jakkapan Jangmuang <jjangmua@celestica.com>

Co-authored-by: Saikrishna Arcot <sarcot@microsoft.com>
vivekrnv pushed a commit that referenced this pull request Dec 10, 2021
#### What I did 

[sonic-linkmgrd][master] submodule update

6c6151b Fix unstable unit tests (state change handler wasn't invoked) (#8)
2f7dc0a support code diff coverage (#5)
83f0002 Force mux state switch to standby if triggered from Cli (#6)

signed-off-by: Jing Zhang zhangjing@microsoft.com
vivekrnv pushed a commit that referenced this pull request Apr 4, 2022
6c6151b Fix unstable unit tests (state change handler wasn't invoked) (#8)
2f7dc0a support code diff coverage (#5)
83f0002 Force mux state switch to standby if triggered from Cli (#6)

signed-off-by: Jing Zhang zhangjing@microsoft.com
vivekrnv pushed a commit that referenced this pull request Oct 18, 2022
… URL support "not to use cache" (#45) (sonic-net#12394)

* 4f45e3a Update gnmi_cli (#5) (#44)
vivekrnv added a commit that referenced this pull request Dec 6, 2022
2fbe729 disable cfg dynamic change (#25)
13d0805 Use github code scanning instead of LGTM (#26)
1e846f6 Fix packet range check for relay-reply packets (#21)
4d19e13 (work/master, master) Add unittest infrastructure (#5)
7f4fdab fix packet range check issue (#20)
257ecdf Add client packet UDP header length check (#19)

Signed-off-by: Vivek Reddy Karri <vkarri@nvidia.com>
vivekrnv added a commit that referenced this pull request Dec 6, 2022
2fbe729 disable cfg dynamic change (#25)
13d0805 Use github code scanning instead of LGTM (#26)
1e846f6 Fix packet range check for relay-reply packets (#21)
4d19e13 Add unittest infrastructure (#5)
7f4fdab fix packet range check issue (#20)
257ecdf Add client packet UDP header length check (#19)

Signed-off-by: Vivek Reddy <vkarri@nvidia.com>
vivekrnv pushed a commit that referenced this pull request Dec 19, 2022
Added below commits:
9b30690 jcaiMR Fri Dec 16 fix handleSwssNotification crash in dhcp6relay (#28)
047afb7 jcaiMR Wed Dec 14 14:08:58 2022 +0800 Fix multiple vlan issue (#27)
ff6bec3 Vivek Thu Dec 8 09:44:15 2022 -0800 Made the Error log informative (#22)
2fbe729 jcaiMR Wed Nov 30 14:41:53 2022 +0800 disable cfg dynamic change (#25)
13d0805 Liu Shilong Wed Nov 30 10:54:11 2022 +0800 Use github code scanning instead of LGTM (#26)
1e846f6 kellyyeh Wed Nov 23 14:36:02 2022 -0800 Fix packet range check for relay-reply packets (#21)
4d19e13 kellyyeh Thu Nov 17 10:04:53 2022 -0800 Add unittest infrastructure (#5)
7f4fdab jcaiMR Fri Nov 11 14:47:51 2022 +0800 fix packet range check issue (#20)
257ecdf kellyyeh Thu Nov 3 11:34:11 2022 -0700 Add client packet UDP header length check (#19)
vivekrnv pushed a commit that referenced this pull request Mar 13, 2023


advance dhcp relay for 202211

4bf1868 - (HEAD, origin/master, origin/HEAD, master) fix relay-reply dhcpv6 packet counter issue (#29) (2 weeks ago) [jcaiMR]
9b30690 - fix handleSwssNotification crash in dhcp6relay (#28) (4 weeks ago) [jcaiMR]
047afb7 - Fix multiple vlan issue (#27) (4 weeks ago) [jcaiMR]
ff6bec3 - Made the Error log informative (#22) (5 weeks ago) [Vivek]
2fbe729 - disable cfg dynamic change (#25) (6 weeks ago) [jcaiMR]
13d0805 - Use github code scanning instead of LGTM (#26) (6 weeks ago) [Liu Shilong]
1e846f6 - Fix packet range check for relay-reply packets (#21) (7 weeks ago) [kellyyeh]
4d19e13 - Add unittest infrastructure (#5) (8 weeks ago) [kellyyeh]
7f4fdab - fix packet range check issue (#20) (9 weeks ago) [jcaiMR]
257ecdf - Add client packet UDP header length check (#19) (2 months ago) [kellyyeh]
vivekrnv pushed a commit that referenced this pull request Dec 6, 2024
This fixes a statistical issue. The original fix was made in FRRouting/frr#17297; to accommodate 8.5.4, the patch from that PR was added here.

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478
vivekrnv pushed a commit that referenced this pull request Dec 18, 2024
…et#21095)

Adding the below fix from FRR FRRouting/frr#17297

This fixes the following crash, which is a statistical issue:

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M dplane_fpm_nl -M snmp'.
Program terminated with signal SIGABRT, Aborted.
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
[Current thread is 1 (Thread 0x7fccd6faf7c0 (LWP 36))]
(gdb) bt
#0  0x00007fccd7351e2c in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fccd7302fb2 in raise () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fccd72ed472 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#3  0x00007fccd75bb3a9 in _zlog_assert_failed (xref=xref@entry=0x7fccd7652380 <_xref.16>, extra=extra@entry=0x0) at ../lib/zlog.c:678
#4  0x00007fccd759b2fe in route_node_delete (node=<optimized out>) at ../lib/table.c:352
#5  0x00007fccd759b445 in route_unlock_node (node=0x0) at ../lib/table.h:258
#6  route_next (node=<optimized out>) at ../lib/table.c:436
#7  route_next (node=node@entry=0x56029d89e560) at ../lib/table.c:410
#8  0x000056029b6b6b7a in if_lookup_by_name_per_ns (ns=ns@entry=0x56029d873d90, ifname=ifname@entry=0x7fccc0029340 "PortChannel1020")
    at ../zebra/interface.c:312
#9  0x000056029b6b8b36 in zebra_if_dplane_ifp_handling (ctx=0x7fccc0029310) at ../zebra/interface.c:1867
#10 zebra_if_dplane_result (ctx=0x7fccc0029310) at ../zebra/interface.c:2221
#11 0x000056029b7137a9 in rib_process_dplane_results (thread=<optimized out>) at ../zebra/zebra_rib.c:4810
#12 0x00007fccd75a0e0d in thread_call (thread=thread@entry=0x7ffe8e553cc0) at ../lib/thread.c:1990
#13 0x00007fccd7559368 in frr_run (master=0x56029d65a040) at ../lib/libfrr.c:1198
#14 0x000056029b6ac317 in main (argc=9, argv=0x7ffe8e5540d8) at ../zebra/main.c:478