insufficient performance in the QoS demo using default parameters #202
Comments
Another data point: a new ThinkPad X1 (i7-8550U) running Ubuntu 18.04. Qualitative performance is the same as reported by @dirk-thomas:
FastRTPS-FastRTPS
Connext-Connext
For people testing this today: if you can afford to rebuild a workspace before testing, please consider using the To use it:
Example of resulting output:
Subscriber:
So I can confidently say that for me Fast-RTPS is smooth and doesn't show latency at 30 Hz (3 ms of delay after 400 frames).
Do you see non-trivial jitter? I'm not sure how to quantify that other than gathering a lot of samples and showing the distribution in a log-log plot or something. We could compute first-order statistics (median, standard deviation, etc.) but those tend to not fully capture the things we worry about, like the worst-case or
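One way to look past first-order statistics is to report a tail percentile and the worst case next to the median. The following is only a sketch: it assumes the inter-arrival times have already been collected as a list of milliseconds, and the names `percentile` and `summarize_intervals` are made up for illustration.

```python
import math
import statistics


def percentile(sorted_values, fraction):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("need at least one sample")
    # Nearest rank: the smallest value with at least `fraction`
    # of all samples at or below it.
    rank = max(1, math.ceil(fraction * len(sorted_values)))
    return sorted_values[rank - 1]


def summarize_intervals(intervals_ms):
    """Summarize inter-arrival times, including the tail and worst case."""
    ordered = sorted(intervals_ms)
    return {
        "median": statistics.median(ordered),
        "stdev": statistics.pstdev(ordered),  # population standard deviation
        "p99": percentile(ordered, 0.99),
        "max": ordered[-1],
    }
```

With nine intervals of 33 ms and a single 165 ms outlier, the median stays at 33 ms while `p99` and `max` both report 165 ms, which is the kind of event the median alone hides.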
I tested Fast-RTPS on my Fedora machine as well (all ros2 repos on latest master). Fast-RTPS -> Fast-RTPS also stuttered badly, just like what @dirk-thomas mentioned in the initial comment. I'm having some trouble rebuilding on the
Seeing the same stuttering with fastrtps on my 16.04 desktop. These are some screen captures of the performance:
I finally got the timestamps compiling, and took some data on my machine. First, this is the default of 320x240, and the cam2image process took ~12% of a core and the showimage process took ~10% of a core (so I was not CPU bound). Second, once I took the data I post-processed it with a small Python script to calculate the delta between successive publishes/receives (I can share that script later if needed). The cam2image publishing data is here: https://gist.github.com/clalancette/def97b55da9c6c2af52f9a6195b4845d , and it shows about what we expect; we publish an image every ~33 ms, for a frame rate of around 30 Hz. The showimage subscribe data is here: https://gist.github.com/clalancette/1d6259b52e6679a96c8f3eafb2ffd9de . It shows that the data is being received in a bursty manner. Looking at lines 57-82, for instance, we see that it is receiving data at ~33 ms, and then at line 83 there is an 'event' that causes the next frame to take ~5 times longer. That causes a backlog of 5 items which are processed quickly (lines 84-88), at which point things get back to normal.
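The post-processing script itself is not included in the thread. The following is only a sketch of the kind of calculation described, assuming one timestamp per frame in a consistent unit; `successive_deltas`, `find_stalls`, and the default threshold factor of 3 are hypothetical.

```python
def successive_deltas(timestamps):
    """Return the time between each pair of consecutive timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def find_stalls(deltas, nominal_period, factor=3.0):
    """Return the indices of deltas longer than `factor` nominal periods."""
    return [
        index
        for index, delta in enumerate(deltas)
        if delta > factor * nominal_period
    ]
```

For receive times of `[0, 33, 66, 231, 232, 233, 234, 235, 268]` ms, the deltas are `[33, 33, 165, 1, 1, 1, 1, 33]` and `find_stalls(deltas, 33)` returns `[2]`: one stall of about five periods followed by a quickly drained backlog, the same shape as the pattern described above.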
We discussed this issue in our daily meeting. We are now working on some performance improvements, including in the rmw layer, so this is perfect timing to analyze this case. We will update you soon.
Result of the 640x480 test on the 3 rmw implementations with the following setup:
640x480:
Serious stuttering with FastRTPS, very smooth for the other ones.
I experienced the same behavior as @mikaelarguedas on my 2017 MacBook Pro (macOS 10.13.3). I noticed a few behaviors of interest to me:
I experienced similar behaviors on my Windows desktop: not exactly the same, but the same trend (enough to convince me that it's not OS dependent at this point). I can provide timestamped data if needed on any platform.
@andreaspasternak can you weigh in here please?
In this branch I've decreased the default values for some timers, which improves responsiveness at the RTPS level. I noticed the improvement using cam2image and showimage. Can anyone test with this branch?
Thanks @richiware for looking into it. The behavior on the hotfix branch is much better indeed! Could you briefly describe the tradeoffs being made with the new timer values? Result on my XP15 using the new branch:
With the hotfix and for this test, Fast-RTPS has less latency than the other rmw implementations using default values: Note: I haven't tested other frequencies or message sizes
The branch definitely shows a better performance with the burger demo. Thank you for the quick update 👍
Yeah, any information on what you expect as "side effects" for other use cases would be good to hear.
Hi all, Regarding this:
Next week we will release as open source a performance measurement tool for various communication systems. Its core features are:
We have been working together with eProsima (FastRTPS) to improve performance under real-time conditions for autonomous driving, with great success. In the following graphs, the y-axis is latency in ms and the x-axis is the duration of the experiment in seconds.
Using FastRTPS directly on a real-time system with real-time settings shows good results for medium-sized data at 1000 Hz:
Large data also works well at 30 Hz after applying the settings in http://docs.eprosima.com/en/latest/advanced.html#finding-out-system-maximum-values:
We found, for example, that busy waiting (meaning not using a waitset) leads to a very significant performance degradation due to expensive mutex locking/unlocking. Using the current dynamic type support of FastRTPS together with ROS 2 does not provide optimal performance, but this will very likely be resolved shortly by the upcoming new RMW implementation for FastRTPS with static type support. If you would like to test other RMW implementations, you can do that using the RMW_IMPLEMENTATION environment variable.
We took care to make the performance test tool real-time capable by:
Here is the latency at 100 Hz with real-time functionality enabled:
@richiprosima Can you please respond to the above questions? We would like to get this resolved (merged into master) sooner rather than later. Thanks.
@mikaelarguedas @dirk-thomas I've decreased two QoS values, one for publishers and the other for subscribers.
Decreasing these values improves the response to lost data: a lost sample takes less time to be forwarded. This is more noticeable in the video scenario because you can see the stuttering. Why is data lost? A video frame is too large for the default Linux socket buffer size, so some RTPS fragments are lost. This week I want to experiment more with these values, and the change will be merged at the end of the week.
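As a rough back-of-the-envelope check of this explanation, one can compare the size of a single uncompressed frame with a socket buffer size. The sketch below assumes 3 bytes per pixel (the thread does not state the encoding) and ignores all protocol headers. The buffer and fragment sizes are inputs to be taken from the system under test, not Fast-RTPS defaults; on Linux the default receive buffer size can be read from `/proc/sys/net/core/rmem_default`.

```python
import math


def frame_bytes(width, height, bytes_per_pixel=3):
    """Payload size of one uncompressed frame (3 bytes per pixel is assumed)."""
    return width * height * bytes_per_pixel


def fragments_needed(payload_bytes, fragment_bytes):
    """Number of fragments needed to carry a payload of the given size."""
    return math.ceil(payload_bytes / fragment_bytes)


def fits_in_buffer(payload_bytes, buffer_bytes):
    """True if a whole frame fits into the socket buffer at once."""
    return payload_bytes <= buffer_bytes
```

Under these assumptions a 320x240 frame is 230400 bytes and a 640x480 frame is 921600 bytes, so with a buffer of roughly 200 KB (an example value, check your own system) neither frame fits in one piece, and the larger frame overshoots by a wide margin.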
I've updated my comment ↑
Thanks @richiware for the updated comment! (sorry if it's obvious) Can you give examples of scenarios where a shorter As a made-up example of the kind of "scenarios" I'm thinking of: Is it possible that in network environments with significant latency, fragments will be considered lost while they may still be on the way, and with a shorter
It's concerning to me that these parameters are fixed times. It would probably help if they were related to network latency somehow so you don't have to tune them manually for the type of network you are using.
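To illustrate what "related to network latency" could look like, the sketch below derives a timeout from measured round-trip times using the smoothing scheme of TCP's retransmission timer (RFC 6298). This is not how Fast-RTPS computes its timers. The class name `AdaptiveTimeout` and the idea of feeding it round-trip samples from RTPS traffic are assumptions, and the clock granularity term and minimum timeout from the RFC are left out for brevity.

```python
class AdaptiveTimeout:
    """Latency-derived timeout, borrowing the estimator from RFC 6298."""

    ALPHA = 1 / 8  # gain for the smoothed round-trip time
    BETA = 1 / 4   # gain for the round-trip time variation
    K = 4          # how many variations to add on top of the smoothed value

    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Feed one round-trip measurement and return the new timeout."""
        if self.srtt is None:
            # First sample initializes both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # The variation is updated first, using the previous smoothed value.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        return self.timeout()

    def timeout(self):
        """Current timeout, in the same unit as the samples."""
        if self.srtt is None:
            raise ValueError("no round-trip sample yet")
        return self.srtt + self.K * self.rttvar
```

The point of such a scheme is that the timeout grows on slow or jittery links and shrinks on fast ones without manual tuning, which addresses the concern about fragments being declared lost while still in flight.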
@richiware As far as I can see the patch hasn't been merged yet. Can you clarify the timeline when that is planned to happen?
@dirk-thomas, @MiguelCompany and I have been working on a new release. We are integrating all new features, including this patch, into our internal develop branch. I want to release it to master tomorrow, Wednesday. Sorry for the inconvenience.
With the hotfix merged upstream I will go ahead and close this.
@dirk-thomas for traceability it would be nice to link to the upstream commit where the fix happened.
Commit that changed the timer values: eProsima/Fast-DDS@201a393
This ticket focuses on the performance of the QoS demo with default values, between a publisher and a subscriber using the same RMW implementation. The cross-vendor results are only mentioned for completeness / context.
To reproduce, run the simplest example from the QoS demo:
ros2 run image_tools cam2image -b
ros2 run image_tools showimage
Which means:
The following results show significant differences in the performance depending on which RMW implementation is chosen on the publisher and subscriber side (collected with the default branches on Ubuntu 16.04 on a Lenovo P50). Only the diagonal highlighted with "quality" colors is of interest for now:
When increasing the image size to
-x 640 -y 480
the problems become even more apparent:
The acceptance criteria to resolve this ticket are:
PS: please don't suggest variations of the QoS parameters in this thread but keep this ticket focused on this very specific case. Other cases can be considered separately.
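For anyone repeating the reproduction steps across RMW implementations, the commands above can be assembled programmatically. This is a sketch only: the helper names are hypothetical, it assumes the `-x`/`-y` size options are passed to cam2image, and it relies on the RMW_IMPLEMENTATION environment variable mentioned in the comments. QoS settings are left at their defaults.

```python
import os


def demo_commands(width=None, height=None):
    """Build the publisher and subscriber command lines from this issue."""
    publisher = ["ros2", "run", "image_tools", "cam2image", "-b"]
    if width is not None and height is not None:
        # Optional image size, e.g. 640x480 for the heavier test case.
        publisher += ["-x", str(width), "-y", str(height)]
    subscriber = ["ros2", "run", "image_tools", "showimage"]
    return publisher, subscriber


def env_for_rmw(rmw_name, base_env=None):
    """Copy the environment and select the RMW implementation to use."""
    env = dict(os.environ if base_env is None else base_env)
    env["RMW_IMPLEMENTATION"] = rmw_name
    return env
```

Each command list can then be started with `subprocess.Popen(command, env=env_for_rmw(name))`, using the same `name` for both processes to stay on the diagonal of the results table. The value of `name` depends on which implementations are installed; `rmw_fastrtps_cpp` is one example identifier.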