Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OA Failsafe improvement #11187

Merged
merged 3 commits into from
Jan 16, 2019
Merged

OA Failsafe improvement #11187

merged 3 commits into from
Jan 16, 2019

Conversation

MaEtUgR
Copy link
Member

@MaEtUgR MaEtUgR commented Jan 10, 2019

Describe problem solved by the proposed pull request
Fit the safety requirements of obstacle avoidance testing #11058
To test OA you typically plan a mission through an obstacle. The OA "module" then processes the setpoints and adjusts if needed. If this module for some reason fails the drone should not continue the mission and crash into the obstacle.

Describe your preferred solution
In my eyes this error case should be handeled through the following path:

  • Obstacle Avoidance wrapper sees something went wrong, no connection anymore.
  • The flight task calling the OA wrapper gets to know via return value and returns false to its update() method to indicate the task can not be further pursued.
  • The flight task host (currently the postion control module) handles the failsafe case by switching to the appropriate reaction which is ideally the the Failsafe flight task.

EDIT: There is no OA wrapper/library yet and everything is called in the position control module class directly.

The last point is where I'm trying to find an interim solution to unblock currently unsafe tests and a future proof solution asap.

Right now I'm thinking about just informing the commander it should switch mode in that case.

Test data / coverage
SITL (I have no OA setup and just mock inputs)

@mrivi
Copy link
Contributor

mrivi commented Jan 10, 2019

@MaEtUgR what about handling the failsafe directly in commander since the information is already there to output warning to the user (#10933) ?

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 11, 2019

@mrivi I'm open for discussion but my opinion is we should work towards freeing commander from any unnecessary things because it's too complex and big and the concept doesn't scale. Now my suggestion is that in the module/library which consumes data checks are done if it's usable and the result gets passed along in a hierarchical way. This would benefit modularity a lot.

For this pr I confused obstacle avoidance with the prevention library, sorry for that. Since the data for OA is currently used in the position controller directly I'd make it send the commander a loiter (hold mode) command at the point where it's clear the necessary data is missing.

@mrivi
Copy link
Contributor

mrivi commented Jan 11, 2019

@MaEtUgR I understand your concerns about commander but then I would also free it from the warning logic about avoidance. To me it doesn't make sense that the information is duplicated in two places, one where we output a warning to the user and one where the actual action is taken. They would get out of sync very easily if changes are made.

There were accidents because when missions lead through an obstacle
and it should be avoided but the setpoints from the external obstacle
avoidance module are suddenly missing the mission is continued into the
obstacle which results in a crash.
@MaEtUgR MaEtUgR force-pushed the failsafe-extension branch from 867f4bb to 56161a4 Compare January 13, 2019 12:37
@MaEtUgR MaEtUgR requested review from mrivi and baumanta January 13, 2019 12:38
@MaEtUgR MaEtUgR self-assigned this Jan 13, 2019
@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 13, 2019

@mrivi I totally agree, the warning should come from where the decision is made.

Based on the chat with @baumanta last Friday I figured a new simpler suggestion now. When OA is handled in the flight task it should live in its own library which is instanciated exclusively there. For now it makes sense to extend the use_obstacle_avoidance() method because that's what currently decides and we can move that into a modular obstacle avoidance library in the next step. Feel free to add land as reaction during RTL if that's the requirement and a warning message.

@baumanta Sorry for force pushing, I still have the original branch.

@baumanta
Copy link
Contributor

@MaEtUgR this looks like a nice solution to me. It does not exactly do all I wished for, but I think it avoids most of the dangerous situations we had :)

What I tested in SITL:

  • Mission with OA, I killed the planner and the drone went to Loiter
  • Position control with MPC_OBS_AVOID = 1, loosing RC connection. Drone went to RTL (it was able to execute RTL without getting OA points)

What I'm still a bit concerned about/ would be nicer if different:

  • Once in Loiter, if RC connection is lost the drone will just stay there and there is no way to get it down (that actually happened to us once). I think for safety we should implement a timeout after which it goes to RTL.
  • Mission with OA, planner timout, switch to loiter. Now when the planner start sending points again, the drone stays in loiter and does not continue the mission. This has not first priority, but I might be worth discussing whether a mission continue would be desired.

@baumanta
Copy link
Contributor

I also think, that the send_vehicle_cmd_do(LOITER) is probably not the cleanest option. Because it overwrites the nav_state and the vehicle does not know anymore, that it is supposed to fly a mission. It is just happy staying there and loiter. Switching to a Failsafe Flightask would imo be the cleaner solution. But I do not know if that is possible in that function (maybe it will be overwritten in the start_flight_task function). Also I think a timeout is necessary to swith out of LOITER and to LAND/RTL.

@MaEtUgR
Copy link
Member Author

MaEtUgR commented Jan 14, 2019

Once in Loiter, if RC connection is lost the drone will just stay there and there is no way to get it down (that actually happened to us once). I think for safety we should implement a timeout after which it goes to RTL.

That is supported, you just need to configure according to https://docs.px4.io/en/config/safety.html#rc-loss-failsafe.

Summary: Set NAV_RCL_ACT = 2 and NAV_DLL_ACT = 0. Because loiter doesn't require RC but could also be used with QGC only but if you don't configure a failsafe for when QGC is lost it will failsafe because of rc loss. See the code doing that here: https://github.com/PX4/Firmware/blob/master/src/modules/commander/state_machine_helper.cpp#L539

With this configuration you should not need 0fa6eeb. Could you check?

Mission with OA, planner timout, switch to loiter. Now when the planner start sending points again, the drone stays in loiter and does not continue the mission. This has not first priority, but I might be worth discussing whether a mission continue would be desired.

Yeah, now you need to switch to mission again but I thought safety > convenience. We can still implement this in the next step. I don't plan to stop here.

Switching to a Failsafe Flightask would imo be the cleaner solution. But I do not know if that is possible in that function (maybe it will be overwritten in the start_flight_task function).

Exactly, solving this is the next step. Then it's also easy to continue again. But I think the overall failsafe strategy needs to be very clear for that to succeed in the long term. If we hack in every case in a different way I'm sure it would fall apart soon.

@baumanta
Copy link
Contributor

@MaEtUgR That's nice! I tested it in SITL and with the parameters you specified it goes to LOITER as long as an RC connection is available and as soon as the RC is disconnected it switches to RTL. This is fine with me so far, it solves the problems I had and is much cleaner than my suggestion :)

@MaEtUgR MaEtUgR merged commit 7b9cfac into master Jan 16, 2019
@MaEtUgR MaEtUgR deleted the failsafe-extension branch January 16, 2019 18:13
@MaEtUgR MaEtUgR changed the title [WIP] Failsafe improvement OA Failsafe improvement Jan 16, 2019
@MaEtUgR MaEtUgR mentioned this pull request Jan 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants