Profiling Ui - Round 1 #31
I also ran a benchmark on the texture_swap example to get an estimate of how fast it could be. On my machine, one textured rectangle should take approximately 0.000016 seconds. If we include the frame and extra stuff, let's say 0.00005, then rendering 21 buttons should take no more than 0.00105 seconds. About 1 millisecond in worst case. Need an estimate on how much Conrod uses per button. |
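The estimate above is plain arithmetic. A minimal sketch of it, using only the numbers quoted in this comment (the constant and function names are made up for illustration):

```rust
/// Measured cost of one textured rectangle in the texture_swap benchmark (seconds).
const RECT_SECS: f64 = 0.000016;
/// Padded per-button budget that allows for the frame and extra stuff (seconds).
const BUTTON_BUDGET_SECS: f64 = 0.00005;

/// Upper-bound estimate for rendering `buttons` buttons, in seconds.
fn frame_budget_secs(buttons: u32) -> f64 {
    buttons as f64 * BUTTON_BUDGET_SECS
}

fn main() {
    let budget = frame_budget_secs(21);
    println!("21 buttons: {:.5} s ({:.2} ms)", budget, budget * 1000.0);
    // How much slack the padded budget leaves over a single bare rectangle.
    println!("budget per button / one rectangle: {:.1}x", BUTTON_BUDGET_SECS / RECT_SECS);
}
```

Dividing a measured per-frame time by this budget gives the "times slower than it could be" figure for a given run.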
Notice: This was tested in debug mode, so the result is invalid. Running in bench mode for 2000 frames. 21 buttons: 20.701
This is 7.5 times slower than it could be. |
Hmmm does the profiler's call tree give you a percentage distribution of where the most time is being spent? The last time I checked, I think the drawing in elmesque is a big bottleneck for conrod - maybe it's worth checking out? |
@mitchmindtree I suspect the biggest bottleneck is not the drawing but the construction of the elements. Construction allocates, which is a slow operation to perform while rendering. I have an idea: since most of the interface consists of rectangles, perhaps we can find a solution where rectangle shapes are cheap? |
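To make the "cheap rectangles" idea concrete, here is one possible shape for it. This is only an illustration, not how Conrod or Elmesque works: `Rect` and `DrawList` are hypothetical names. Rectangles are plain `Copy` data pushed into a buffer that is cleared and reused each frame, so no per-shape heap allocation happens once the buffer has grown.

```rust
/// A plain-data rectangle: `Copy`, no heap allocation per shape.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: f64,
    y: f64,
    w: f64,
    h: f64,
    color: [f32; 4],
}

/// Hypothetical per-frame draw list. The buffer is cleared each frame,
/// so its heap allocation is reused rather than rebuilt.
struct DrawList {
    rects: Vec<Rect>,
}

impl DrawList {
    fn with_capacity(n: usize) -> Self {
        DrawList { rects: Vec::with_capacity(n) }
    }

    /// Start a new frame: drops the old rectangles but keeps the capacity.
    fn begin_frame(&mut self) {
        self.rects.clear();
    }

    fn push(&mut self, r: Rect) {
        self.rects.push(r);
    }
}

fn main() {
    let mut list = DrawList::with_capacity(32);
    for _frame in 0..3 {
        list.begin_frame();
        for i in 0..21 {
            list.push(Rect {
                x: 0.0,
                y: 30.0 * i as f64,
                w: 60.0,
                h: 30.0,
                color: [0.0, 0.0, 1.0, 1.0],
            });
        }
    }
    println!("{} rects, capacity {}", list.rects.len(), list.rects.capacity());
}
```

Whether this helps in practice depends on where the profiler says the time goes, so it would need measuring like everything else in this thread.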
Btw, I don't want to draw conclusions at this point. There could be something in the drawing that makes it slow. However, the profiler shows |
Ahh I see! I always worried that all the boxing elmesque uses in its recursive I've been thinking about this for a while too (making layout cheaper). All of the unnecessary boxing that occurs happens within the 1. Re-think how elmesque's recursive
|
This profiling was done in debug mode, so it is not a reliable benchmark. The previous estimates of the time spent rendering a button are not representative of the time spent in a released application. Conrod is a lot faster in release mode. Improved optimizations in the Rust compiler might also have affected the results. I forgot to include the version used in this test. In order to evaluate PistonDevelopers/conrod#626 properly we need to redo the estimates before Elmesque is removed from Conrod. This gives us a closer baseline for the level of performance improvement we can expect from switching to primitives. Ran again for 2000 frames, but now in release mode, making sure that the overhead from Cargo is removed.
Code changes:

let mut frames = (0..2000).into_iter();
for mut e in window.bench_mode(true) {
    if let Some(_) = e.render_args() {
        if frames.next().is_none() { break; }
    }
    ...
    if !capture_cursor {
        ui.handle_event(&e);
        e.draw_2d(|c, g| {
            use conrod::*;
            widget_ids!(REFRESH);
            Button::new()
                .color(color::blue())
                .top_left()
                .dimensions(60.0, 30.0)
                .label("refresh")
                .react(|| {})
                .set(REFRESH, &mut ui);
            for i in 0..20 {
                Button::new()
                    .color(color::blue())
                    .down(0.0)
                    .dimensions(60.0, 30.0)
                    .label("refresh")
                    .react(|| {})
                    .set(REFRESH + 1 + i, &mut ui);
            }
            ui.draw(c, g);
        });
    }
}

Ran 3 times using Conrod 0.22.2, deleting the slowest:
Ran 3 times, deleting the slowest using mitchmindtree/conrod@861726a (before Elmesque was removed from Conrod):
We see that PistonDevelopers/conrod#626 is in the same ballpark, but a little slower. Notice that Elmesque is not removed yet, and the PR is still work-in-progress, so it looks promising. |
One weakness of the texture_swap estimate is that performance is sensitive to the size of the textures. I expect it to have approximately the same characteristics across hardware, such that you could calculate the worst case for a texture of a given size. |
I generalized the spreadsheet for estimating O(N) stuff in Turbine. Here I measure buttons using Conrod 0.22.2 with I get about 69 microseconds per button. You can see the curve bends slightly upward, which is probably why the accuracy of the prediction is around 85%. The more buttons there are, the more time it spends per button. Notice that one button is ignored; it becomes part of the background overhead. |
Here are buttons with Conrod mitchmindtree/conrod@861726a (before Elmesque is removed) on As before, when I measured total time, this is a little slower. It also shows that Conrod spends more time per button as more buttons are added, in comparison to 0.22.2. An ideal O(N) algorithm would have an accuracy of 100%, but this shows 79%. This type of estimation could be useful not just for checking how fast it is, but also for seeing whether changes improve algorithmic complexity. |
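For anyone who wants to reproduce this kind of O(N) estimate without the spreadsheet, here is a rough sketch. It assumes an ordinary least-squares fit of `total = overhead + per_button * n`, and it uses R² as a stand-in for the "accuracy" figure; the spreadsheet's actual accuracy formula is not given in this thread, so the two numbers are not necessarily comparable. The sample data in `main` is synthetic and exactly linear, not a measurement.

```rust
/// Ordinary least-squares fit of `total = overhead + per_item * n`.
/// Each sample is `(n, total_time)`. Returns `(overhead, per_item, r_squared)`.
fn fit_linear(samples: &[(f64, f64)]) -> (f64, f64, f64) {
    let n = samples.len() as f64;
    let mean_x = samples.iter().map(|s| s.0).sum::<f64>() / n;
    let mean_y = samples.iter().map(|s| s.1).sum::<f64>() / n;
    let sxy: f64 = samples.iter().map(|s| (s.0 - mean_x) * (s.1 - mean_y)).sum();
    let sxx: f64 = samples.iter().map(|s| (s.0 - mean_x).powi(2)).sum();
    let slope = sxy / sxx;
    let intercept = mean_y - slope * mean_x;
    // R^2: share of the variance in total time explained by the straight line.
    let ss_res: f64 = samples
        .iter()
        .map(|s| (s.1 - (intercept + slope * s.0)).powi(2))
        .sum();
    let ss_tot: f64 = samples.iter().map(|s| (s.1 - mean_y).powi(2)).sum();
    (intercept, slope, 1.0 - ss_res / ss_tot)
}

fn main() {
    // Synthetic, exactly linear data (microseconds): 100 overhead + 69 per button.
    let samples = [(1.0, 169.0), (2.0, 238.0), (3.0, 307.0), (4.0, 376.0)];
    let (overhead, per_button, r2) = fit_linear(&samples);
    println!("overhead {:.1} us, per button {:.1} us, R^2 {:.3}", overhead, per_button, r2);
}
```

A curve that bends upward, as described above, would show up as a fit quality below 1 and as residuals that are positive at both ends of the range.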
Will take a look at this more closely soon, but just thought I'd mention On Mon, 16 Nov 2015 08:26 Sven Nilsen notifications@github.com wrote:
|
@mitchmindtree Yeah, I knew that. I'm doing this to test the method so we know what it says. I wrote "before Elmesque is removed" where it is relevant. |
Measuring buttons in debug mode using Conrod 0.22.2 with About 416 microseconds per button, which is 6 times slower than release mode. This shows something interesting: the algorithm becomes almost linear in debug mode. I think it is because the overhead inherent in the design drowns in the noise and only becomes significant when the compiler generates optimized machine code. Maybe an indicator that extra allocations don't matter compared to optimization, which is a bit surprising. |
Testing to see if we can make some performance improvements in Conrod.
Rendering 21 buttons. Profiling using Instruments on OSX.
With UI:
Without UI (the spikes are when turning UI on to start/stop profiling):