Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ROS2 Iron: Action Servers Become Unresponsive After Modifying Collision Checking #1337

Open
Infraviored opened this issue Aug 19, 2024 · 1 comment

Comments

@Infraviored
Copy link

Infraviored commented Aug 19, 2024

I'm working on a ROS2 Iron project and have encountered a perplexing issue with my action servers. I've implemented collision checking modification without using MoveIt Task Constructor (which isn't available for Iron). The core functionality works, but I'm facing a critical problem:

The Issue:
After executing a collision checking modification, ALL action servers in my main node become unresponsive. Here's what I've observed:

  1. The modify_collision_checking function executes successfully and completes.
  2. I receive a success message: [INFO] Completed modify_collision_checking: test_box, ['r1_gripper'], True, telling me the funciton executed all the way.
  3. After this, no action servers in the main node respond to messages.

What I've Tried:

  • Other action servers work fine when used independently.
  • If I replace the content of modify_collision_checking with pass, it works repeatedly without issues.

The Code:
Here's a simplified version of my CollisionObjectHandler class:

class CollisionObjectHandler:
   def __init__(self, node: Node):
       self.node = node

   def modify_collision_checking(
       self, object_id: str, target_names: list, allow: bool
   ):
       try:
           get_planning_scene = self.node.get_planning_scene
           while not get_planning_scene.wait_for_service(timeout_sec=1.0):
               self.node.logger.info(
                   "get_planning_scene service not available, waiting..."
               )


           req = GetPlanningScene.Request()
           req.components.components = PlanningSceneComponents.ALLOWED_COLLISION_MATRIX


           future = get_planning_scene.call_async(req)
           rclpy.spin_until_future_complete(self.node, future)
           response = future.result()


           if response is None:
               self.node.logger.error("Failed to get current planning scene")
               return


           current_scene = response.scene


           if current_scene.allowed_collision_matrix.entry_names:
               acm = current_scene.allowed_collision_matrix
           else:
               acm = AllowedCollisionMatrix()


           self.node.logger.debug(f"Current ACM entries: {acm.entry_names}")


           for name in [object_id] + target_names:
               if name not in acm.entry_names:
                   acm.entry_names.append(name)
                   new_entry = AllowedCollisionEntry()
                   new_entry.enabled = [False] * len(acm.entry_names)
                   acm.entry_values.append(new_entry)


           for entry in acm.entry_values:
               while len(entry.enabled) < len(acm.entry_names):
                   entry.enabled.append(False)


           obj_index = acm.entry_names.index(object_id)
           for target_name in target_names:
               target_index = acm.entry_names.index(target_name)
               acm.entry_values[obj_index].enabled[target_index] = allow
               acm.entry_values[target_index].enabled[obj_index] = allow
               self.node.logger.debug(
                   f"Modified collision between {object_id} and {target_name}: {allow}"
               )


           planning_scene_msg = PlanningScene()
           planning_scene_msg.is_diff = True
           planning_scene_msg.allowed_collision_matrix = acm


           self.node.scene_pub.publish(planning_scene_msg)


           action = "Allowed" if allow else "Disallowed"
           self.node.logger.debug(
               f"{action} collisions between {object_id} and {target_names}"
           )


           self.node.logger.debug("Updated ACM:")
           for i, entry_name in enumerate(acm.entry_names):
               self.node.logger.debug(f"{entry_name}: {acm.entry_values[i].enabled}")


       except Exception as e:
           self.node.logger.error(f"Error modifying collision checking: {str(e)}")


       self.node.logger.info(
           f"Completed modify_collision_checking: {object_id}, {target_names}, {allow}"
       )





   def change_object_collision_callback(self, goal_handle):
       self.node.logger.info(
           f"Received new change_object_collision request: {goal_handle.request}"
       )
       object_id = goal_handle.request.object_id
       allow_collision = goal_handle.request.allow
       gripper_name = "r1_gripper"


       if allow_collision:
           self.modify_collision_checking(object_id, [gripper_name], True)
       else:
           self.modify_collision_checking(object_id, [gripper_name], False)

       success = True
       message = f"{'Allowed' if allow_collision else 'Disallowed'} collisions between {object_id} and gripper"
       goal_handle.succeed()
       return ChangeObjectCollision.Result(success=success, message=message)

And I'm setting up the action server like this in my main node (which imports CollisionObjectHandler:

from rclpy.action import ActionServer
from rclpy.node import Node
from script1 import CollisionObjectHandler

class PlanningAndSceneManagerServer(Node):
    """
    A ROS2 node for managing the scene, including mesh acquisition and object placement.
    """

    def __init__(self):
        super().__init__("scene_manager_server")
        self.collision_object_handler = CollisionObjectHandler(self)
        self.scene_pub = self.create_publisher(
            PlanningScene, "/planning_scene", qos_profile
        )
        self.get_planning_scene = self.create_client(
            GetPlanningScene, "/get_planning_scene"
        )

        self._change_collision_server = ActionServer(
            self,
            ChangeObjectCollision,
            "change_object_collision",
            self
        self._change_attachment_server = ActionServer(
            self,...

The Question:
What could be causing all action servers to become unresponsive after modify_collision_checking executes? Is there something in this function that might be interfering with the ROS2 communication infrastructure?

Again, the function itself does not get stuck. It terminates successfully and the ActionServer sends a Success message. Only afterwards, all action servers (also these fully independent of CollistionObjectHandler) of the main script no longer receive.

Any insights or suggestions for debugging this issue would be greatly appreciated. I'm open to alternative approaches or workarounds if necessary.

@fujitatomoya
Copy link
Collaborator

Thanks for posting issue.

To be sure, I would want to see the complete application but guessing like below.

So i am expecting that modify_collision_checking is called as ActionServer.execute_callback, and modify_collision_checking calls service request get_planning_scene.call_async(req) and waiting on the result via rclpy.spin_until_future_complete in the execute_callback in the executer.

This is not recommended design, that calling the blocking call in the callback of executor.

What happens here is after rclpy.spin_until_future_complete completed, self.node is removed from the executor.
That said, modify_collision_checking function can be completed as you mentioned, but no more executable entity cannot be taken or executed in this executor anymore since the node has been removed from the executer.

This behavior is originally bug, reported #1313 and fixed with #1316.
downstream distro backports are still under review, so your distribution iron does not have this fix yet.

besides above issue and bug fix, i would like you to try async & await design for modify_collision_checking.
the followings are pseudo code, but i believe it should fix your blocking problem in the application.

   async def modify_collision_checking(
       self, object_id: str, target_names: list, allow: bool
   ):
       try:
           get_planning_scene = self.node.get_planning_scene
           while not get_planning_scene.wait_for_service(timeout_sec=1.0):
               self.node.logger.info(
                   "get_planning_scene service not available, waiting..."
               )


           req = GetPlanningScene.Request()
           req.components.components = PlanningSceneComponents.ALLOWED_COLLISION_MATRIX


           future = get_planning_scene.call_async(req)
           # REMOVE THIS: rclpy.spin_until_future_complete(self.node, future)
           result = await future
           
           response = result

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants