Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mavlink threadding fixes #7226

Merged
merged 11 commits into from
May 18, 2017
Merged

Mavlink threadding fixes #7226

merged 11 commits into from
May 18, 2017

Conversation

bkueng
Copy link
Member

@bkueng bkueng commented May 12, 2017

This fixes point 5 from #7223 by moving the mavlink mission, param manager, ftp & log downloader into the receiver thread.

Tested on:

  • SITL
  • Pixracer, USB & WiFi
  • Pixhawk with 3DR telemetry

Test cases & results:

  • Param loading: pretty much the same observable behavior as before
  • Large mission upload & download (ca. 400WP): transaction times are roughly the same before and after
  • Log download: download speed is a bit faster now via USB, unchanged via telemetry (at 1.5KB/s, not that I think someone would actually do that :). Looking at the mavlink inspector, the attitude rate goes down to ~45Hz from 50Hz during download, which is expected. I get the same numbers on current master.

I also tested ULog streaming + log download in parallel via USB, and it works well, the attitude was still received with ~40Hz.

@lovettchris can you test if you still have FTP issues with this?
@ecmnet @tops4u can you test as well?

@tops4u
Copy link
Contributor

tops4u commented May 12, 2017

I'll be flying Missions on the Weekend currently on an RC Build a few weeks old. I don't have my own Build infrastructure so I'll have to wait for a Master Build being available for Download.

@dagar
Copy link
Member

dagar commented May 12, 2017

@tops4u https://s3.amazonaws.com/px4-travis/Firmware/mavlink_threadding_fixes/px4fmu-v2_default.px4

@ecmnet
Copy link
Contributor

ecmnet commented May 12, 2017

@bkueng I just tested your branch. No significant change compared to master in my case. Lost package-rate with companion connection at 921600 baud is about 1,5 packages per 10 seconds. This increases significantly when reading a log via Mavlink (slightly faster, ~10 Packages per second, 5MB file read at a rate of 14kb/s). Streaming ULOG returns a ULOG package lost ratio of ~12% (similar with master). But I never experienced a MAVlink freeze with my setup. Px4fmu-V2.

Update: Transfer of logs seem to abort after about 10secs, causing many retry package requests and the complete file can finally be retrieved successfully.

2017-05-12 17:13:03.745: [6] [mgc] Request latest log
2017-05-12 17:13:03.792: [6] [mgc] Importing Log (4) - 5048 kb
2017-05-12 17:18:42.306: [6] [mgc] Import completed (14 kb/sec)

Update 2: I am a bit confused about the update rate of attitude, as QGC shows > 200Hz.

@bkueng
Copy link
Member Author

bkueng commented May 12, 2017

@ecmnet what is your mavlink rate set to (or SYS_COMPANION param)? What does mavlink status say with and without ULog streaming?

@ecmnet
Copy link
Contributor

ecmnet commented May 12, 2017

My companion baud rate is default at 921600 baud - default settings (no flow control).

Without ULOG streaming:

a

with ULOG streaming:

b

Note, this is MAVlinkShell. Second mavlink status reply seems to be messed up a bit.

Maybe it is good to know that several days ago, I reduced the rate of all major telemetry streams. But this had no influence on my package loss rate.

@bkueng
Copy link
Member Author

bkueng commented May 12, 2017

You can see txerr: 14.778 which explains the high drop rate. This either means the physical link is saturated (or limited by the mavlink rate, but from the output I conclude it's set to 80k), or more likely the NuttX TX buffer is full. You can try to increase it, it's one of the CONFIG_USARTx_TXBUFSIZE configs, for example here, I don't know which UART number it is in your case (do a make clean every time you change it).
Also the ULog rate is quite high, it could occasionally hit the limit of 70%, which would also explain ULog drops.

@ecmnet
Copy link
Contributor

ecmnet commented May 12, 2017

Ok thanks, I'll give that a try.

@ecmnet
Copy link
Contributor

ecmnet commented May 13, 2017

@bkueng I had a significant improvement switching the buffer sizes of USART3 (for Telem2 on Px4fmu-V2) to 600 for rx and 1500 for tx. ULOG has (nearly) no drops anymore. Reading a log via mavlink still has some drops, but much less than before (15kb/sec). Telemetry has no losses at all. MavlinkShell not messed up anymore while streaming ULOG. My tests were based on your branch:

c

Does the increase of the buffer sizes have any other (negative) implications?

Increasing the txBuffer to 2000 is even better. There are no ULOG streaming losses any more. Loading logs also was improved again and transfer band with was now 32kb/sec using a 5Mb log file.

d

What size of the buffer would you recommend? Would it be worth to include a buffer size increase to master?

Increasing the txBuffer again to 3000 does not have any significant effect on the loss-ratio while loading logs. The transfer rate however was increased again and is now at 37kb/sec.

Maybe it would also be an idea to stop telemetry streaming while transferring log files.

Comparing master including the buffer increase (txBuffer 2000 bytes) to your branch shows, that master is slightly worse in terms of package loss. Transfer rate of log files falls back to 20kb/sec.

@bkueng bkueng force-pushed the mavlink_threadding_fixes branch from bbed5fb to 63294b9 Compare May 15, 2017 12:24
@bkueng
Copy link
Member Author

bkueng commented May 15, 2017

Thanks for your extensive testing! Is increasing the RX buffer needed as well?
The thing is that increasing the buffers directly reduces the free RAM by the same amount (it's a fixed allocation), so I'd like to keep it as small as possible.
I think we could increase TX to something between 1500 and 2000, but I leave the last call on this up to @LorenzMeier.

@LorenzMeier What do you think about including this PR into the release?

@ecmnet
Copy link
Contributor

ecmnet commented May 15, 2017

@bkueng I increased the RX buffer to 600 (as it was set for USART2). I did not perform any tests on RX side.

@LorenzMeier
Copy link
Member

Probably makes sense to get this into the release for the stability sake.

Copy link
Member

@LorenzMeier LorenzMeier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost good to go. I just have a few minor questions.

@@ -101,20 +102,6 @@ MavlinkMissionManager::~MavlinkMissionManager()
orb_unadvertise(_offboard_mission_pub);
}

unsigned
MavlinkMissionManager::get_size()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you remove this? Doesn't this lead to buffer saturation when mission transfers are active?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the object is now entirely in the receiver thread, the main thread does not know about it anymore, and thus does not include it in the rate calculations.
I was a bit concerned about this too, but my tests showed that it still works well (with fast and slow links).
With regard to #7250, we could generally reduce the streams, when a mission/param transfer or log download is in progress.

@@ -86,8 +86,9 @@ MavlinkMissionManager::MavlinkMissionManager(Mavlink *mavlink) : MavlinkStream(m
_offboard_mission_sub(-1),
_mission_result_sub(-1),
_offboard_mission_pub(nullptr),
_slow_rate_limiter(_interval / 5.0f),
_verbose(false)
_slow_rate_limiter(100 * 1000), // Rate limit sending of the current WP sequence to 10 Hz
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not desirable as some links will get heartbeat timeouts if you fill them with mission items while other fast links will take many seconds for a couple hundred mission items even though they could transfer much faster.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This rate limiter is only applied for sending the current waypoint. i.e. during a mission, when the vehicle reaches a waypoing, this will change, and QGC can update the currently marked waypoint.


int i = 0;

while (i++ < max_num_to_send && send_one());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense.

@bkueng bkueng mentioned this pull request May 18, 2017
21 tasks
@LorenzMeier LorenzMeier merged commit 3d77102 into master May 18, 2017
@LorenzMeier LorenzMeier deleted the mavlink_threadding_fixes branch May 18, 2017 07:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants