Reset camera with unstable USB connection #8393

Closed
maltetoelle opened this issue Feb 19, 2021 · 9 comments

Comments

@maltetoelle


Required Info
Camera Model D400
Firmware Version 05.12.03.00
Operating System & Version Debian Buster
Kernel Version (Linux Only) 5.8.0-41-generic
Platform PC
SDK Version 2.41.0
Language python
Segment Robot

Issue Description

I am testing the reconnection capability of my RealSense camera in an environment with an unstable USB connection. To simulate the connection issues I use two bash scripts/tools:

#!/bin/bash
echo -n "0000:00:14.0" | sudo tee /sys/bus/pci/drivers/xhci_hcd/unbind
echo "delay"
sleep 1
echo -n "0000:00:14.0" | sudo tee /sys/bus/pci/drivers/xhci_hcd/bind
echo "Reset finished"

and usbreset. With either tool the stream freezes. I fixed that with a simple try ... except block around wait_for_frames:

try:
    self.frames = self.pipeline.wait_for_frames()
except RuntimeError:
    print("New frames did not arrive within 5000.")
    self.reset()

and in the reset function a simple hardware_reset is called:

def reset(self):
    self.pipeline.stop()
    dev_list = self.ctx.query_devices()
    for dev in dev_list:
        dev.hardware_reset()
    time.sleep(0.5)

This works fine for resetting the whole USB controller in a loop and for resetting the camera after one usbreset. However, if the pauses between consecutive usbresets get too short, it fails. I believe this happens when a usbreset is called while the camera is still resetting itself through hardware_reset. I first tried to tackle that with a try ... except around the hardware_reset, but this just gives a segmentation fault:

try:
    dev.hardware_reset()
except:
    print("hardware reset failed")

If I do not use the try ... except around the hardware_reset I get the following error:

Traceback (most recent call last):
  File "usb_resetloop.py", line 86, in <module>
    color_frame, depth_frame = stream.get_frames_np()
  File "/home/malte/Documents/camera_testing/camera.py", line 201, in get_frames_np
    self._get_frames()
  File "/home/malte/Documents/camera_testing/camera.py", line 195, in _get_frames
    self.reset()
  File "/home/malte/Documents/camera_testing/camera.py", line 124, in reset
    dev = dev_list[0]
RuntimeError: failed to set power state

I also tried set_devices_changed_callback and set_notifications_callback, but they were only triggered by the hardware_reset. So if the hardware_reset fails, my program fails, although the failure does trigger the notifications callback with the notification USB SCP overflow or USB CAM overflow.

I do have a few questions now:

  1. Am I right that the hardware_reset fails, when a usbreset is called while the camera is resetting?
  2. How can I circumvent that?
  3. Is there any function that I can call before the hardware_reset to check whether a USB SCP/CAM overflow occurs?
@MartyG-RealSense
Collaborator

MartyG-RealSense commented Feb 19, 2021

Hi @maltetoelle

  1. I have not got specific information about that situation, but if the USB port was reset then it is likely that the camera would be reset too, similar to unplugging it from the USB port and re-inserting it. So in all probability the camera would be undetectable until its reset had completed.

  2. There was a case where a RealSense user was having problems with accessing the camera streams if the computer was reset unexpectedly due to events such as power failures. That user was advised to combine a hardware_reset instruction with the rs2::device_hub instruction to get a synchronous flow.

#1086 (comment)

That RealSense user also added a 5 second sleep on the script line after initiating the hardware_reset, to give the reset time to fully complete.

  3. If you wanted to check whether a particular error had already occurred before requesting hardware_reset, you could perhaps set up a logic variable that is set to 'true' if the error is detected by a catch and then set to 'false' at the point when the hardware_reset is called so that the variable is ready to be set to 'true' again next time that it is caught.

@maltetoelle
Author

Hi @MartyG-RealSense

thank you very much for your answer!

  1. Yes, you are right. This is the behaviour if only a single usbreset is fired. The problem only appears if several are fired with very short pauses in between.
  2. I know about the synchronous flow that can be used in C++ and also wanted to try that but for now I am using Python and I could not find the rs2::device_hub in the python bindings. How can I use it in Python?
  3. I did try that but I could definitely invest more time into that.

I did, however, track the problem down further and believe I found an option that works. The problem appears when I access my device_list:

try:
    dev = dev_list[0]
except RuntimeError:
    print("Failed to set power state")
    return

If that fails, the hardware_reset will also fail. Why is that the case? Why does accessing the device_list (making an rs2::device from an rs2::device_list that was returned from rs2::context::query_devices()) raise an error when the device was found beforehand? If I do that check first, I can put a try ... except block around my hardware_reset, which still fails sometimes, but at least the program does not crash anymore.

I believe the segmentation fault appears when I try to start the stream while it is already running. Is that possible?

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Feb 20, 2021

When I am writing code where there is a risk of an instruction being multi-triggered in a loop, I use the logic-statement approach to set up a system where the code inside an "If" statement is only allowed to activate if all logic conditions of the If statement are met. I create a secondary logic condition such as 'allowreset' and set the If statement up as something like this:

if (x_condition == true && allowreset == true)
{
    allowreset = false;
    // [Rest of the If statement code]
}

When allowreset is true, the conditions of the If statement are met and the code within is allowed to activate for the first time. The first line of that code sets allowreset to False. This prevents the If statement from having its conditions met when it loops round to the start of the If instruction to test whether its conditions can be met again, so the code inside the If block cannot multi-trigger.

At some point (such as when reset has successfully completed), I set allowreset back to True so that the code inside the If block is able to be run again next time that a reset is required. When the code is activated, it again sets allowreset to False and prevents multi-triggering. Later, allowreset is set back to True. And so the cycle goes on.
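Since the original poster is working in Python, the allowreset idea above could be sketched roughly as follows. This is only an illustration: `ResetGuard` and `guarded_reset` are hypothetical names, not part of pyrealsense2, and the lock is an added assumption for the case where a reset can be requested from a callback thread.

```python
import threading


class ResetGuard:
    """Lets only one reset run at a time (the 'allowreset' flag)."""

    def __init__(self):
        self._allow_reset = True
        # Assumption: resets may be requested from more than one thread.
        self._lock = threading.Lock()

    def try_begin(self):
        """Return True and close the gate if a reset may start, else False."""
        with self._lock:
            if not self._allow_reset:
                return False
            self._allow_reset = False
            return True

    def finish(self):
        """Re-open the gate once the reset has completed or failed."""
        with self._lock:
            self._allow_reset = True


def guarded_reset(guard, do_reset):
    """Run do_reset() unless another reset is already in progress."""
    if not guard.try_begin():
        return False  # a reset is already running; skip this request
    try:
        do_reset()
    finally:
        guard.finish()  # allow the next reset even if do_reset() raised
    return True
```

In the reset loop from the first post this would be called as something like `guarded_reset(guard, self.reset)`, so a second request that arrives while the camera is still resetting is dropped instead of triggering another hardware_reset.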

In regard to rs2::device_hub, I was unable to locate a reference to an equivalent instruction for Python.
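As a rough stand-in for the blocking wait that rs2::device_hub provides in C++, one option in Python is to poll `query_devices()` until the camera shows up again. The sketch below assumes `ctx` is a `pyrealsense2.context` (or anything with a compatible `query_devices()`); `wait_for_device`, its timeout values and the injectable `sleep`/`clock` parameters are hypothetical choices made for illustration.

```python
import time


def wait_for_device(ctx, timeout_s=10.0, poll_s=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll ctx.query_devices() until a device can be opened.

    Returns the first device, or None if timeout_s elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        try:
            devices = ctx.query_devices()
            if len(devices) > 0:
                # Indexing can still raise while the camera re-enumerates.
                return devices[0]
        except RuntimeError:
            # e.g. "failed to set power state"; retry on the next poll
            pass
        if clock() >= deadline:
            return None
        sleep(poll_s)
```

After `dev.hardware_reset()`, calling `wait_for_device(ctx)` replaces a fixed sleep with a wait that ends as soon as the device is listed again. It only handles Python exceptions; it cannot protect against a segmentation fault inside the native library.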

A Failed to set power state error can sometimes leave the camera in a state where hardware_reset() is not enough, because hardware_reset() depends on the camera being detected. In that case a complete reset by other means may be necessary (e.g. a USB unplug-replug of the camera or a computer reboot).

It is possible for a segmentation fault to occur after the stream has started. Below is an example link to such a case. The amount of time before the fault appears varies though.

#1821

In past cases of checking if a stream has started, RealSense users have tried to catch an exception in order to detect it. This subject is discussed in regard to Python in the official Python documentation link below.

https://intelrealsense.github.io/librealsense/python_docs/_generated/pyrealsense2.pipeline.html#pyrealsense2.pipeline.start

@MartyG-RealSense
Collaborator

Hi @maltetoelle Do you require further assistance with this case, please? Thanks!

@maltetoelle
Author

Sorry for not answering for so long. I tried the different approaches, but I am still not totally happy with the result. If I get a RuntimeError when accessing the device, I now unbind and bind the USB controller, which brings back my camera. However, the acquisition process sometimes gets stuck in the call

depth_sensor, color_sensor = dev.query_sensors()

which means the application just stops without throwing an error. When using rs2::device_hub in C++ the application runs more stably, but it still crashes sometimes.
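One generic way to keep the main loop from blocking on such a call is to run it in a daemon thread and give up after a timeout. This is a plain Python sketch, not a librealsense feature; `call_with_timeout` is a hypothetical helper. It assumes the hung native call releases the GIL (otherwise the timeout cannot fire), and it abandons the stuck thread rather than stopping it.

```python
import threading


def call_with_timeout(func, timeout_s):
    """Run func() in a daemon thread.

    Returns (True, result) on success or (False, None) on timeout.
    A hung call is abandoned, not killed: its thread keeps running.
    """
    outcome = {}

    def worker():
        try:
            outcome["result"] = func()
        except Exception as err:  # hand the exception back to the caller
            outcome["error"] = err

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)
    if thread.is_alive():
        return False, None  # still stuck; caller decides how to recover
    if "error" in outcome:
        raise outcome["error"]
    return True, outcome["result"]
```

Used as `ok, sensors = call_with_timeout(dev.query_sensors, 2.0)`, a `False` result could trigger the same unbind/bind recovery described above. Whether the library can be used safely again while an abandoned call is still pending is not established in this thread.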

@MartyG-RealSense
Collaborator

MartyG-RealSense commented Feb 26, 2021

Some multicam programs that make use of a ctx device list set up their code to use the instruction rs2::device dev = list.front(); if no devices are currently detected in the list. Here is an example of such a routine:

[image: screenshot of the routine; not recoverable here]

The script that uses the above code can be found at this link:

#4158

@MartyG-RealSense
Collaborator

Hi @maltetoelle Do you require further assistance with this case, please? Thanks!

@maltetoelle
Author

No, thank you very much! The problem is solved by checking if the list is empty.
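The fix reported here, checking whether the device list is empty before indexing it, might look roughly like this in Python. It is a sketch under assumptions, not the poster's actual code: `reset_first_device` is a hypothetical name, `ctx` is assumed to be a `pyrealsense2.context`, and the 5 second settle time is taken from the suggestion earlier in the thread.

```python
import time


def reset_first_device(ctx, settle_s=5.0, sleep=time.sleep):
    """Hardware-reset the first detected device.

    Returns False if no device could be reset, True otherwise.
    """
    try:
        devices = ctx.query_devices()
        if len(devices) == 0:
            # The camera has dropped off the bus: nothing to index.
            return False
        dev = devices[0]  # can still raise "failed to set power state"
        dev.hardware_reset()
    except RuntimeError as err:
        print(f"reset skipped: {err}")
        return False
    sleep(settle_s)  # give the camera time to re-enumerate
    return True
```

A `False` return tells the caller to retry later or fall back to unbinding and rebinding the USB controller, instead of crashing on `dev_list[0]`.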

@MartyG-RealSense
Collaborator

Great news @maltetoelle - thanks for the update!
