kernel log spam after setting enable_depth= false #10158

Closed
gc-91 opened this issue Jan 17, 2022 · 24 comments

Comments

@gc-91

gc-91 commented Jan 17, 2022


Required Info
Camera Model: D435
Firmware Version: 05.12.15.50
Operating System & Version: Linux (Debian 4.19.152-1)
Kernel Version (Linux Only): 4.19.0-12-rt-amd64
Platform: PC
SDK Version: 2.50.0
Language:
Segment: Robot

Issue Description

Hello,
my kernel.log is flooded with the following messages:

input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3.3/2-3.3.2/2-3.3.2:1.0/input/input490
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-12 directory.
uvcvideo 2-3.3.2:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-3.3.2:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-3.3.2:1.0: Entity type for entity Camera 1 was not initialized!

The repetition rate is 2 Hz.
I noticed that this behaviour only occurs when I set the option enable_depth = false.
In rare cases my system freezes and the following message is written to the kernel log:

BUG: unable to handle kernel paging request at fffff23aa82f1224
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 23636 Comm: nodelet Tainted: G OE 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:__kmalloc_track_caller+0xa5/0x250

How can this behaviour be explained?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 17, 2022

Hi @gc-91 I note that you are using kernel 4.19. The librealsense SDK does not have official support for 4.19.


Whilst librealsense can work with kernels that are not officially supported, there may be unpredictable consequences in regard to stability.

If librealsense is built from source code with CMake using the RSUSB backend installation method then you should be able to continue using 4.19, as an RSUSB-based librealsense build is not dependent on Linux versions or kernel versions and does not require patching. Instructions for building librealsense with the RSUSB method can be found at #9931 (comment)


Alternatively, if you are using ROS and it works correctly for you already so long as depth is not disabled then a simpler workaround may be to allow depth to be published by having it set to True but configure it to a low setting such as 424x240 at 15 FPS so that it adds a minimal burden to overall processing.
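The low-setting workaround described above could look roughly like the sketch below. It assumes the launch file exposes the enable_depth, depth_width, depth_height and depth_fps arguments, as rs_camera.launch in the ROS1 wrapper does. The helper name low_depth_cmd is hypothetical, and it only prints the command so that it can be checked before it is run.

```shell
# Hypothetical helper: print (not run) a roslaunch command that keeps depth
# enabled at a low resolution and frame rate.
# Assumption: the launch file accepts the arguments used below.
low_depth_cmd() {
    launch_file=${1:-rs_camera.launch}
    echo "roslaunch realsense2_camera $launch_file enable_depth:=true depth_width:=424 depth_height:=240 depth_fps:=15"
}

# Usage: low_depth_cmd            (prints the command for rs_camera.launch)
```

A custom launch file can be passed as the first argument, provided it forwards the same arguments.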

@gc-91
Author

gc-91 commented Jan 18, 2022

Hello MartyG,

Thanks for your message.
I had already built the librealsense from source, including RSUSB backend option.

You are right, I use the camera in ROS. So I will implement your low-setting suggestion.

However, I am still concerned that the system can freeze.
Can you deduce anything from the kernel message that was dropped just before the freeze?
Here I have the complete excerpt for you:

BUG: unable to handle kernel paging request at fffff23aa82f1224
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 3 PID: 23636 Comm: nodelet Tainted: G OE 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:__kmalloc_track_caller+0xa5/0x250
Code: 48 8b 70 08 48 39 f2 75 e7 48 83 78 10 00 48 8b 28 0f 84 39 01 00 00 48 85 ed 0f 84 30 01 00 00 41 8b 47 20 49 8b 3f 48 01 e8 <48> 8b 18 48 89 c1 49 33 9f 30 01 00 00 48 89 e8 48 0f c9 48 31 cb
RSP: 0018:ffffb32e8212f9b0 EFLAGS: 00010286
RAX: fffff23aa82f1224 RBX: 0000000000000000 RCX: 0000000000026660
RDX: 000000071d090a03 RSI: 000000071d090a03 RDI: 0000000000026660
RBP: fffff23aa82f1224 R08: 0000000000000000 R09: 0000000000000004
R10: ffff96cde0356910 R11: 0000000000000010 R12: 00000000006000c0
R13: 000000000000000c R14: ffff96cde5403b00 R15: ffff96cde5403b00
FS: 00007f5c1affd700(0000) GS:ffff96cde5b80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff23aa82f1224 CR3: 00000001dcb5e005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? __kernfs_new_node+0x41/0x1d0
kstrdup+0x2d/0x60
__kernfs_new_node+0x41/0x1d0
? usb_start_wait_urb+0xa2/0x160 [usbcore]
kernfs_new_node+0x21/0x40
kernfs_create_link+0x32/0xa0
sysfs_do_create_link_sd.isra.2+0x61/0xc0
driver_sysfs_add+0x4e/0xd0
device_bind_driver+0xf/0x50
usb_driver_claim_interface+0x99/0x110 [usbcore]
uvc_probe+0x9ae/0x1db0 [uvcvideo]
? __kernfs_new_node+0xd3/0x1d0
? usb_probe_interface+0xea/0x300 [usbcore]
? uvc_register_video_device+0x120/0x120 [uvcvideo]
usb_probe_interface+0xea/0x300 [usbcore]
really_probe+0x20a/0x3b0
? __driver_attach+0x110/0x110
driver_probe_device+0xb3/0xf0
? __driver_attach+0x110/0x110
bus_for_each_drv+0x79/0xc0
__device_attach+0xe5/0x160
proc_ioctl+0x1b6/0x230 [usbcore]
usbdev_do_ioctl+0x270/0x1110 [usbcore]
? hrtimer_try_to_cancel+0x29/0x150
? hrtimer_cancel+0x1b/0x30
? do_nanosleep+0x42/0x160
usbdev_ioctl+0xa/0x10 [usbcore]
do_vfs_ioctl+0xa4/0x630
? __hrtimer_init_sleeper+0x60/0x60
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x53/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f5c525cd427
Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007f5c1affcb78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f5c525cd427
RDX: 00007f5c1affcb80 RSI: 00000000c0105512 RDI: 0000000000000019
RBP: 00007f5c10001f10 R08: 00007f5c10001f10 R09: 00007f5c080a41d0
R10: 0000000000000006 R11: 0000000000000246 R12: 0000000000000000
R13: 00000000fffffffb R14: 00007f5c080a41f0 R15: 00007f5c529fa260
Modules linked in: can_bcm can xt_conntrack ipt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink br_netfilter bridge stp llc aufs(OE) overlay cpufreq_userspace cpufreq_conservative cpufreq_powersave snd_hda_codec_hdmi snd_hda_codec_realtek intel_rapl snd_hda_codec_generic x86_pkg_temp_thermal ftdi_sio intel_powerclamp usbserial coretemp kvm_intel snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc kvm snd_soc_sst_dsp snd_hda_ext_core irqbypass snd_soc_acpi_intel_match crct10dif_pclmul snd_soc_acpi crc32_pclmul ghash_clmulni_intel snd_soc_core snd_compress intel_cstate snd_hda_intel snd_hda_codec intel_uncore snd_hda_core snd_hwdep intel_rapl_perf
snd_pcm snd_timer pcspkr iTCO_wdt snd i915 mei_wdt iTCO_vendor_support idma64 soundcore peak_pci sja1000 sg can_dev mei_me drm_kms_helper mei intel_pch_thermal drm wmi evdev pcc_cpufreq video acpi_pad button uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev media ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb sd_mod crc32c_intel aesni_intel aes_x86_64 xhci_pci crypto_simd cryptd ahci glue_helper xhci_hcd libahci libata intel_lpss_pci igb e1000e intel_lpss i2c_i801 usbcore i2c_algo_bit mfd_core scsi_mod dca usb_common fan thermal
CR2: fffff23aa82f1224
---[ end trace 0000000000000002 ]---
RIP: 0010:__kmalloc_track_caller+0xa5/0x250
Code: 48 8b 70 08 48 39 f2 75 e7 48 83 78 10 00 48 8b 28 0f 84 39 01 00 00 48 85 ed 0f 84 30 01 00 00 41 8b 47 20 49 8b 3f 48 01 e8 <48> 8b 18 48 89 c1 49 33 9f 30 01 00 00 48 89 e8 48 0f c9 48 31 cb
RSP: 0018:ffffb32e8212f9b0 EFLAGS: 00010286
RAX: fffff23aa82f1224 RBX: 0000000000000000 RCX: 0000000000026660
RDX: 000000071d090a03 RSI: 000000071d090a03 RDI: 0000000000026660
RBP: fffff23aa82f1224 R08: 0000000000000000 R09: 0000000000000004
R10: ffff96cde0356910 R11: 0000000000000010 R12: 00000000006000c0
R13: 000000000000000c R14: ffff96cde5403b00 R15: ffff96cde5403b00
FS: 00007f5c1affd700(0000) GS:ffff96cde5b80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: fffff23aa82f1224 CR3: 00000001dcb5e005 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

You can see that the tainted flag is set with the following options:

  • G = no proprietary module was loaded (a proprietary module would be shown as P)
  • O = externally-built (“out-of-tree”) module was loaded
  • E = unsigned module was loaded
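The same flags can also be read from the taint bitmask in /proc/sys/kernel/tainted. The following is a minimal sketch that decodes only the three flags listed above. The bit positions (0, 12 and 13) follow the kernel's tainted-kernels documentation, and the function name decode_taint is hypothetical.

```shell
# Decode the P/G, O and E taint flags from a taint bitmask.
# With no argument, the mask is read from /proc/sys/kernel/tainted.
decode_taint() (
    mask=${1:-$(cat /proc/sys/kernel/tainted)}
    if [ "$mask" -eq 0 ]; then
        echo "Not tainted"
        exit 0
    fi
    # Bit 0: set means P (proprietary module loaded), clear means G.
    if [ $(( mask & 1 )) -ne 0 ]; then out="P"; else out="G"; fi
    # Bit 12: an out-of-tree module was loaded.
    if [ $(( (mask >> 12) & 1 )) -ne 0 ]; then out="$out O"; fi
    # Bit 13: an unsigned module was loaded.
    if [ $(( (mask >> 13) & 1 )) -ne 0 ]; then out="$out E"; fi
    echo "$out"
)

# Usage: decode_taint          (current kernel)
#        decode_taint 12288    (bits 12 and 13 set)
```

Other taint bits are ignored by this sketch, so a mask with additional flags set is reported only partially.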

@MartyG-RealSense
Collaborator

I cannot see anything obviously wrong in the log in regard to RealSense. As the freeze is occurring rarely for you when launching ROS with depth disabled, you could try adding initial_reset:=true to your roslaunch instruction if you are not doing so already in order to see whether this resolves the problem. Using this command resets the camera automatically during ROS launch.

@gc-91
Author

gc-91 commented Jan 18, 2022

I already use the initial_reset option.
In the freeze log, the 4th line says "Comm: nodelet...".
I use nodelet exclusively for the operation of the Realsense.
Furthermore, you can see in the call trace that the modules usbcore and uvcvideo are involved. Therefore, I believe that there must be a connection to Realsense.

After a bit of research, I came across the following article:
https://googleprojectzero.blogspot.com/2021/10/how-simple-linux-kernel-memory.html

The ioctl system call also appears frequently in my call trace.

@MartyG-RealSense
Collaborator

If librealsense is built with the RSUSB backend, which bypasses video4linux and communicates with the device using a generic USB driver, then it should not be experiencing kernel-related conflicts.

usbcore and uvcvideo are general Linux kernel modules and not specific to RealSense.

Can you confirm please that you did not patch the kernel before building librealsense from source code with RSUSB, since - unlike a non-RSUSB source code build - kernel patching is not required with this method.

@gc-91
Author

gc-91 commented Jan 19, 2022

Hello,
thank you for your message.

In the meantime, I have checked the installation.
However, in order to be able to perform a build as described by you under #9931, I have to install the following packages beforehand:

  • libusb-1.0-0-dev
  • libglu1-mesa-dev
  • glfw

Otherwise the code cannot be compiled. Apart from that,
I have not installed any kernel patches.

I did try compiling the code with RSUSB-backend = false, and the D435 still worked.
If I understand your last message correctly, video4linux is now used, is that right?
How can I find out which kernel modules the camera uses? I have already checked all modules listed by lsmod for a connection with the Realsense, unfortunately without result.

For your information, I only use my system via the cli, so I cannot use applications such as the realsense viewer.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 19, 2022

If -DFORCE_RSUSB_BACKEND=OFF is used in the CMake build instruction then the V4L2 backend (which makes use of video4linux) will be used in the librealsense build.
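To check which setting an existing build directory was configured with, the CMake cache can be inspected. This is a minimal sketch that assumes the build directory still contains its CMakeCache.txt. The function name rsusb_backend_setting is hypothetical.

```shell
# Print the value of FORCE_RSUSB_BACKEND recorded in a CMake cache file.
# Cache entries have the form NAME:TYPE=VALUE.
rsusb_backend_setting() (
    cache=${1:-CMakeCache.txt}
    value=$(sed -n 's/^FORCE_RSUSB_BACKEND:[A-Z]*=//p' "$cache")
    # Report "unset" when the option does not appear in the cache.
    echo "${value:-unset}"
)

# Usage (from the librealsense build directory): rsusb_backend_setting
```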

If you go to #5212 (comment) and scroll down through the linked-to comment to the section headed What are the advantages and disadvantages of using libuvc vs patched kernel modules then there is a comparison of the advantages and disadvantages of RSUSB backend (formerly known as libuvc backend) versus a patched-kernel build.

Regarding the precise kernel modules that the camera uses, I am uncertain about this question (I am not an SDK developer) but the list of module 'inserts' in the SDK's kernel patch script may be a useful reference.

https://github.com/IntelRealSense/librealsense/blob/master/scripts/patch-realsense-ubuntu-lts.sh#L292-L311
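As a complement to the patch script, sysfs shows which kernel driver is currently bound to each USB interface of a device, which works without a GUI. The sketch below is one possible way to read that information and is not taken from the SDK. The function name usb_interface_drivers is hypothetical, and 8086:0b07 is the ID that the kernel log above reports for the D435.

```shell
# List the kernel driver bound to each USB interface of a given device.
# Arguments: vendor ID, product ID, optional sysfs directory
# (defaults to /sys/bus/usb/devices).
usb_interface_drivers() (
    vid=$1; pid=$2; root=${3:-/sys/bus/usb/devices}
    for dev in "$root"/*; do
        # Only device entries carry idVendor/idProduct files.
        [ -r "$dev/idVendor" ] || continue
        [ "$(cat "$dev/idVendor")" = "$vid" ] || continue
        [ "$(cat "$dev/idProduct")" = "$pid" ] || continue
        # Interfaces sit next to the device as <device>:<config>.<interface>.
        for intf in "$dev":*; do
            [ -e "$intf" ] || continue
            if [ -L "$intf/driver" ]; then
                drv=$(basename "$(readlink "$intf/driver")")
            else
                drv="(none)"
            fi
            echo "$(basename "$intf") $drv"
        done
    done
)

# Usage: usb_interface_drivers 8086 0b07
```

An interface bound to uvcvideo is being handled by the kernel's UVC driver, while an interface with no driver link is currently unclaimed.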

@gc-91
Author

gc-91 commented Jan 21, 2022

Hello Marty,

thanks for the information. I now have a better understanding of the RSUSB_BACKEND flag.

I have now freshly set up my system and installed librealsense according to your instructions (#9931). Before that, of course, I provided the necessary packages (libusb-1.0-0-dev, libglu1-mesa-dev, glfw). The compile flag RSUSB_BACKEND is ON.
I also adjusted the configuration and set enable_depth = true, including the lower resolution (424x240) and 15 fps.
Furthermore, I managed to clean up the tainted kernel; the taint was caused by Docker's aufs storage driver. After switching to overlay2, the kernel is no longer tainted.

Nevertheless, I had a system freeze this morning when I started the realsense ROS node with nodelet. Here is an excerpt from /var/log/messages:

uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input773
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input774
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input775
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input776
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input777
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!
input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input778
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
general protection fault: 0000 [#1] PREEMPT SMP PTI
CPU: 2 PID: 7515 Comm: nodelet Not tainted 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:__kmalloc+0xaa/0x250
Code: 48 8b 70 08 48 39 f2 75 e7 48 83 78 10 00 48 8b 28 0f 84 3d 01 00 00 48 85 ed 0f 84 34 01 00 00 41 8b 47 20 49 8b 3f 48 01 e8 <48> 8b 18 48 89 c1 49 33 9f 30 01 00 00 48 89 e8 48 0f c9 48 31 cb
RSP: 0018:ffffa567421afad0 EFLAGS: 00010286
RAX: ffff6af652604e87 RBX: 0000000000000000 RCX: 0000000000026660
RDX: 00000000bec69802 RSI: 00000000bec69802 RDI: 0000000000026660
RBP: ffff6af652604e87 R08: ffffffffc04cddc0 R09: ffff9474db578ea0
R10: ffff94755e863b94 R11: 0000000000000001 R12: 00000000006080c0
R13: 000000000000000d R14: ffff947565403b00 R15: ffff947565403b00
FS: 00007f0993fff700(0000) GS:ffff947565b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055fb8cf5f280 CR3: 00000001fe590001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
? uvc_ctrl_add_info+0x57/0xa0 [uvcvideo]
uvc_ctrl_add_info+0x57/0xa0 [uvcvideo]
uvc_ctrl_init_device+0x2c0/0x360 [uvcvideo]
uvc_probe.cold.18+0x198/0xc59 [uvcvideo]
? usb_probe_interface+0xea/0x300 [usbcore]
? uvc_register_video_device+0x120/0x120 [uvcvideo]
usb_probe_interface+0xea/0x300 [usbcore]
really_probe+0x20a/0x3b0
? __driver_attach+0x110/0x110
driver_probe_device+0xb3/0xf0
? __driver_attach+0x110/0x110
bus_for_each_drv+0x79/0xc0
__device_attach+0xe5/0x160
proc_ioctl+0x1b6/0x230 [usbcore]
usbdev_do_ioctl+0x270/0x1110 [usbcore]
? hrtimer_try_to_cancel+0x29/0x150
? hrtimer_cancel+0x1b/0x30
? do_nanosleep+0x42/0x160
usbdev_ioctl+0xa/0x10 [usbcore]
do_vfs_ioctl+0xa4/0x630
? __hrtimer_init_sleeper+0x60/0x60
ksys_ioctl+0x60/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x53/0x110
entry_SYSCALL_64_after_hwframe+0x44/0xa9

The messages:

input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-4/2-4.3/2-4.3.3/2-4.3.3:1.0/input/input776
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-18 directory.
uvcvideo 2-4.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-4.3.3:1.0: Entity type for entity Camera 1 was not initialized!

are no longer written to the system logs continuously at 2 Hz after switching from enable_depth = false to enable_depth = true, but only during the startup process of the realsense node.

I suspect that something goes wrong while the logs are being written, and that this can lead to a freeze in rare cases.
Since the number of messages has now been reduced by the above-mentioned parameter adjustment, the probability of a freeze also decreases. However, since a freeze can still occur, I would like to know exactly what the logs say. Can you send me more information on this?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 21, 2022

You could test the theory about whether logging causes the freeze by disabling all ROS logging using the information in the link below.

http://wiki.ros.org/rosconsole#Disable_all_logging

Have you previously built librealsense from packages instead of source code on the same computer? If you have then I wonder whether there may be multiple udev rules installed on your computer that are causing a conflict. A warning message about multiple udev rules would normally be displayed in the RealSense Viewer, but you are not using the Viewer because of your CLI environment.

If there is a udev rule present on your computer at the location /etc/udev/rules.d/ then deleting it from your computer automatically corrects the multiple udev rules conflict without you needing to check in the RealSense Viewer.

Only delete the /etc rule though if there is also a udev rule present at /lib/udev/rules.d/60-librealsense2-udev-rules.rules. If you have rules at both /etc and /lib then you have multiple udev rules installed, and /etc is the correct one of those two rules to delete.
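On a CLI-only system, the check for rules in both locations can be done from the shell. This is a minimal sketch. The function name find_realsense_rules is hypothetical, and the file-name pattern *realsense*.rules is an assumption chosen to also match rule files whose names differ from the one given above.

```shell
# Print every udev rule file whose name contains "realsense" in the two
# rule directories. Directory arguments are optional and default to the
# standard locations.
find_realsense_rules() (
    etc_dir=${1:-/etc/udev/rules.d}
    lib_dir=${2:-/lib/udev/rules.d}
    for dir in "$etc_dir" "$lib_dir"; do
        for f in "$dir"/*realsense*.rules; do
            # An unmatched glob stays literal, so check that the file exists.
            if [ -e "$f" ]; then echo "$f"; fi
        done
    done
)

# Usage: find_realsense_rules
```

If the output lists files from both directories, that corresponds to the multiple-rules situation described above.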

@gc-91
Author

gc-91 commented Jan 24, 2022

Hi Marty,

thanks for the suggestion to disable the ROS logs; however, I wanted to know the cause of the posted kernel logs.

I have NOT installed librealsense from the packages before, but built it directly from source. Nevertheless, I found udev rules in both places (/etc.. and /lib/..). However, the one under /lib/... is called 60-ros-noetic-realsense2-camera.rules.
I load the Realsense-ROS-Node from the packages, could this be the cause?
I removed the udev rule from /etc/... and got a freeze immediately after rebooting and starting realsense2_rgb.launch, but this time with a different error pattern in the kernel logs:

general protection fault: 0000 [#1] PREEMPT SMP PTI
CPU: 2 PID: 2614 Comm: systemd-udevd Not tainted 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:__kmalloc_track_caller+0xa5/0x250

After another reboot, the system ran without the udev rule under /etc/..

@MartyG-RealSense
Collaborator

The RealSense ROS packages in the wrapper's Method 1 instructions install both librealsense and the ROS wrapper together. So if you installed librealsense from source code and then afterwards installed the wrapper from packages with Method 1 then you could end up with a potentially conflicting setup of two librealsense versions on the same computer that were built with different methods (source code and packages).

You could deal with the conflict by uninstalling all RealSense related packages on your computer using the instruction below.

dpkg -l | grep "realsense" | cut -d " " -f 3 | xargs sudo dpkg --purge

This instruction would only affect packages, so the source-code build of librealsense should be left in place afterwards.
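Before purging, it may be useful to preview which packages would be matched. The sketch below reads dpkg -l output and prints the names of fully installed packages ("ii" state) whose name contains "realsense". The function name list_realsense_packages is hypothetical. Unlike the grep in the instruction above, it matches on the package name only, so packages in other states or with "realsense" only in the description are not listed.

```shell
# Print the names of installed packages whose name contains "realsense".
# Reads `dpkg -l` output on stdin; column 1 is the state, column 2 the name.
list_realsense_packages() {
    awk '$1 == "ii" && $2 ~ /realsense/ { print $2 }'
}

# Usage: dpkg -l | list_realsense_packages
```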

If installing the ROS wrapper from the Method 1 packages is the most familiar and comfortable installation method for you, then you could try the following procedure.

  1. Delete the librealsense source-code build by going to the build directory of the librealsense folder and running the command below to uninstall librealsense and remove the built files.

sudo make uninstall && make clean

  2. Install librealsense and the ROS wrapper together from packages with Method 1.

@gc-91
Author

gc-91 commented Jan 25, 2022

Thanks for the tip, I was not aware that librealsense is also installed from the packages when installing the ROS wrapper from the packages.

I have now rebuilt my system and built both librealsense and the ROS wrapper from source. (Since my kernel version is not supported, I'm not allowed to use method 1, am I?).
The startup process of the wrapper is now much faster, but when I exit the wrapper and restart, I still get the following entries in the kernel log:

WARNING: CPU: 3 PID: 1246 at drivers/media/v4l2-core/v4l2-ioctl.c:1342 v4l_enum_fmt+0xfd5/0x13a0 [videodev]
Modules linked in: can_bcm can xt_conntrack ipt_MASQUERADE nf_conntrack_netlink xfrm_user xfrm_algo nft_counter xt_addrtype nft_compat nft_chain_nat_ipv4 nf_nat_ipv4 nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nf_tables nfnetlink br_netfilter bridge stp llc sr_mod cdrom overlay cpufreq_userspace cpufreq_conservative cpufreq_powersave snd_hda_codec_hdmi uas usb_storage snd_hda_codec_realtek intel_rapl snd_hda_codec_generic x86_pkg_temp_thermal intel_powerclamp ftdi_sio usbserial coretemp kvm_intel snd_soc_skl snd_soc_skl_ipc kvm snd_soc_sst_ipc snd_soc_sst_dsp irqbypass snd_hda_ext_core crct10dif_pclmul snd_soc_acpi_intel_match crc32_pclmul snd_soc_acpi ghash_clmulni_intel snd_soc_core intel_cstate snd_compress snd_hda_intel intel_uncore snd_hda_codec snd_hda_core intel_rapl_perf
snd_hwdep snd_pcm snd_timer iTCO_wdt pcspkr i915 snd mei_wdt iTCO_vendor_support soundcore idma64 uvcvideo peak_pci sg videobuf2_vmalloc sja1000 drm_kms_helper videobuf2_memops evdev can_dev videobuf2_v4l2 mei_me videobuf2_common drm mei intel_pch_thermal videodev media pcc_cpufreq video wmi acpi_pad button ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb sd_mod crc32c_intel xhci_pci ahci xhci_hcd libahci aesni_intel igb aes_x86_64 crypto_simd libata intel_lpss_pci i2c_algo_bit cryptd usbcore intel_lpss glue_helper dca e1000e i2c_i801 scsi_mod fan mfd_core usb_common thermal
CPU: 3 PID: 1246 Comm: nodelet Not tainted 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:v4l_enum_fmt+0xfd5/0x13a0 [videodev]

This entry is output 16 times in a row and is followed by the following message, which appears about 100 times in a row:

input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb1/1-1/1-1.3/1-1.3.3/1-1.3.3:1.0/input/input13
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 1-7 directory.
uvcvideo 1-1.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 1-1.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 1-1.3.3:1.0: Entity type for entity Camera 1 was not initialized!

Therefore, I suspect that I can expect a freeze again after a few reboots.

The ROS wrapper also writes the following warnings during the start-up process and during operation:

(messenger-libusb.cpp:42) control_transfer returned error, index: 768, error: Resource temporarily unavailable, number: 11

edit: the first freeze has just occurred.

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 25, 2022

Correct, kernel 4.19 is not listed as an officially supported kernel as mentioned earlier in this discussion at #10158 (comment)

The messages look to me like they might be caused by an unpatched kernel. When you built librealsense from source code this time, did you build using the RSUSB backend method in order to make the build non-dependent on the kernel version?

@gc-91
Author

gc-91 commented Jan 26, 2022

Yes I built it with RSUSB-BACKEND = ON, just like you described it here #9931 (comment).

@MartyG-RealSense
Collaborator

If you built it with RSUSB-BACKEND = ON and librealsense 2.50.0, you may have greater stability if you build with 2.48.0 and ROS wrapper 2.3.1. This is because changes to RSUSB were made in 2.50.0 and a few RealSense users have since reported stability problems with their RSUSB build that they do not experience on earlier versions.

@gc-91
Author

gc-91 commented Jan 26, 2022

Hello, I have tested 2.50.0 with wrapper 2.3.2 further; on almost every third start of the wrapper my system freezes.
I see the same behaviour with librealsense 2.48.0 and wrapper 2.3.1.

The behaviour is always the same, during the start process the following log is written cyclically 91 times:

input: Intel(R) RealSense(TM) Depth Ca as /devices/pci0000:00/0000:00:14.0/usb2/2-1/2-1.3/2-1.3.3/2-1.3.3:1.0/input/input91
uvcvideo: Unknown video format 00000050-0000-0010-8000-00aa00389b71
uvcvideo: Found UVC 1.50 device Intel(R) RealSense(TM) Depth Camera 435 (8086:0b07)
uvcvideo: Unable to create debugfs 2-8 directory.
uvcvideo 2-1.3.3:1.0: Entity type for entity Intel(R) RealSense(TM) Depth Ca was not initialized!
uvcvideo 2-1.3.3:1.0: Entity type for entity Processing 2 was not initialized!
uvcvideo 2-1.3.3:1.0: Entity type for entity Camera 1 was not initialized!

Can you please give me hints about these logs? Is it usual that these messages occur? Is it also usual that they occur so frequently? What exactly do the messages say?

After that I can receive pictures from the camera.
Sometimes the start does not succeed and the system freezes. The last thing in the log are numerous RIP messages. The first one looks like this:

general protection fault: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 3807 Comm: nodelet Not tainted 4.19.0-12-rt-amd64 #1 Debian 4.19.152-1
Hardware name: CINCOZE DI-1000/DI-1000, BIOS 5.11 01/19/2019
RIP: 0010:__kmalloc_node_track_caller+0x1ca/0x2f0
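The number of probe cycles reported above can be counted directly from the kernel log rather than by hand. This is a minimal sketch that counts the "Found UVC" line for the camera ID shown in the log (8086:0b07), on the assumption that this line appears once per probe cycle. The function name count_uvc_probes is hypothetical.

```shell
# Count how many times uvcvideo probed the camera, based on kernel log
# text supplied on stdin.
count_uvc_probes() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # the count is 0, which "|| true" turns into a successful return.
    grep -c 'uvcvideo: Found UVC .* (8086:0b07)' || true
}

# Usage: dmesg | count_uvc_probes
```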

@MartyG-RealSense
Collaborator

Do you see any improvement if you reload the udev rules by entering the commands below into your Ubuntu terminal?

sudo udevadm control --reload-rules
sudo udevadm trigger

@gc-91
Author

gc-91 commented Jan 28, 2022

Hello, no, unfortunately the commands do not help.

@MartyG-RealSense
Collaborator

If you have tried most other possible fixes without success, a complete wipe and reinstall of the computer (including the OS) may sometimes clear a problem with a RealSense installation without finding out what was causing the problem.

@gc-91
Author

gc-91 commented Jan 28, 2022

Hi Marty,

since I use the librealsense and ros-wrapper inside a Docker container, I have tested each of your suggestions in a fresh build container.
For info, my container is running with the options privileged: true and network_mode: host

Unfortunately you didn't address the question about kernel logs in my penultimate post, could you please answer my questions?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Jan 28, 2022

It is difficult to provide an answer about the meaning of the 'entity type' kernel messages, since their cause tends to be vague, and if they go away you may never find out for sure what caused them in the first place. I asked Intel about them in 2020 and received the following explanation.


The message “Entity camera ... not initialized" is related to UVC topology defined in the firmware descriptor for USB device.
Those are protocol-specific messages and are apparently stemmed from firmware, but since there are no observed misbehaviors reported and we have them from day 1 with D400, it is likely to remain that way for a while.

@MartyG-RealSense
Collaborator

Hi @gc-91 Do you require further assistance with this case, please? Thanks!

@gc-91
Author

gc-91 commented Feb 3, 2022

Hi Marty,

In the meantime I have carried out further tests on my system and have been able to locate the reason for my freezes.

Basically, it is as you had already suspected: several variants/versions of librealsense had been installed on my system. The reason for the multiple installations was the use of rosdep install.
I had used rosdep install because the wrapper needs additional dependencies to build from source. Unfortunately, rosdep install cannot detect that a librealsense built from source is already installed on the system, and simply installs the latest packaged version alongside it.
Therefore, I recommend either installing the necessary dependencies for the ROS wrapper manually or adjusting the rosdep install call to exclude librealsense.
The previously installed librealsense is then used, and there is no spam in the kernel log and consequently no system freeze.
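One way to exclude librealsense from rosdep install is the --skip-keys option. The sketch below assumes the wrapper declares its SDK dependency under the rosdep key librealsense2 and that the workspace sources are in a src directory. The function name rosdep_without_librealsense is hypothetical. By default it only prints the command; set DRY_RUN=0 to execute it.

```shell
# Install the wrapper's dependencies with rosdep while skipping the
# librealsense2 key, so that a source-built SDK is not shadowed by a package.
# Assumption: the dependency is declared under the key "librealsense2".
rosdep_without_librealsense() {
    src_dir=${1:-src}
    set -- rosdep install --from-paths "$src_dir" --ignore-src -y --skip-keys=librealsense2
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Dry run: show the command without executing it.
        echo "$*"
    else
        "$@"
    fi
}

# Usage (from the catkin workspace root): rosdep_without_librealsense src
```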

@MartyG-RealSense
Collaborator

It's excellent to hear that you found an answer. Thanks very much for the update and for sharing the details with the RealSense community :)
