How to improve some performance #5
Yes, there are many things suboptimal about this video player. It really comes down to two problems:
So, to improve the performance, you would need to do the following:
That should net you close to the best performance possible.
Now that I know about them, I will try those optimizations. Thanks for your reply! It answers my questions.
@jazzfool, according to README.md, it looks like some of the performance issues have been fixed (commit e347a9b):
Can you confirm this? Is the minimal example using hardware decoding by default? Are there any other performance issues still to be aware of? Anyway, this repository is of great help, thanks!
From my testing, yes, the performance is a lot better since I last wrote. My earlier points still stand if you want to squeeze out more performance, but currently, with hardware decoding (which seems to be working by default now), it's very usable.
Thanks @jazzfool, I am also able to run it with hardware decoding now, but I still face some performance issues. I leave here the walkthrough and results of some tests. Whether hardware acceleration is used now depends entirely on the gstreamer system setup, its plugins, and the underlying video driver. By enabling gstreamer logs I could see that I was using
I was testing this on a PC with an Nvidia GTX 1060 GPU, Arch Linux, and Nvidia drivers 550.54.14. HW acceleration should be handled by the NVDEC/NVENC codec, handled by the However by inspecting the This was because I was missing the
After this gst-inspect was properly showing encoder and decoder features:
and the minimal test was then using the
However, despite using the hardware decoder, there is still a performance issue: I get about 80% CPU usage in both tests, with and without hardware decoding, but if I play the video directly with gstreamer:
it says it uses the nvh264dec decoder and takes about 20%. @jazzfool, would you expect such a difference? Is it due to what you were referring to in your points above?
The same video played by mpv takes about 10% CPU, and it also says it uses the nvdec hw decoder. On another PC with an Intel Celeron N3350 (dual-core CPU and Intel HD Graphics 500) I had to install
it says it's using the hw decoder vah264dec, but I still face high CPU usage (~80%) and video lagging. However, if I play the same video with mpv:
it still says it's using hardware decoding (vaapi, so it should be the same underlying codec), but I only have 20% CPU usage and smooth video playback. I also tried to force using the vulkan decoder by setting the env var
Thanks for the detailed tests! Yes, I would expect greater CPU usage, based on my earlier points. MPV and gst-launch almost certainly skip the CPU overhead by keeping everything on the GPU when using hw decoding. However, I do not expect to see a difference on the order of 20% vs 80%. I tested MPV and gst-launch vs the minimal example in a release build, and saw closer to 2-3% vs 5-6%. I expect there's something going on with gstreamer and the system configuration to cause such a big difference - perhaps something with how the gstreamer sink pipeline is set up. It's hard to reproduce this myself as I would want to capture some profiles and look at what gstreamer is doing in more detail.
I have noticed a significant performance gap between the OpenGL renderer provided by the gstreamer autovideosink plugin and our iced video player. The latter has higher latency and consumes more resources than the OpenGL renderer. Is it possible to close the gap?
I found a small bug which should improve performance slightly. However, regarding the performance overall, I did find the source of the issue as to why performance ends up being slower than e.g., gst-launch: Video frames are usually encoded in a YUV colour space, to help with spatial compression. The problem is that converting YUV to RGBA is not a simple operation, and in this case is being performed on the CPU (by the 'videoconvert' plugin). With GPU colour space conversion I anticipate that CPU usage would drop by roughly 15-20% (from my local tests). The rest of the CPU usage comes from Looking into the future, the biggest leap forward would be gfx-rs/wgpu#2330, but there's no sign of that feature any time soon.
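For illustration, the per-pixel arithmetic behind a YUV to RGBA conversion can be sketched roughly as below. This is a minimal sketch and not what videoconvert actually runs: it assumes BT.601 limited-range coefficients and full-resolution (4:4:4) planes, and the function names `yuv_to_rgb_bt601` and `yuv444_to_rgba` are made up for this example.

```rust
/// Convert one YUV sample to 8-bit RGB.
/// Assumption: BT.601 limited ("video") range, i.e. Y in 16..=235 and
/// U/V centred on 128. HD content often uses BT.709, which has
/// different coefficients.
fn yuv_to_rgb_bt601(y: u8, u: u8, v: u8) -> [u8; 3] {
    let y = 1.164 * (f32::from(y) - 16.0);
    let u = f32::from(u) - 128.0;
    let v = f32::from(v) - 128.0;

    let r = y + 1.596 * v;
    let g = y - 0.392 * u - 0.813 * v;
    let b = y + 2.017 * u;

    // Round, then clamp to the displayable 0..=255 range.
    let to_u8 = |c: f32| c.round().clamp(0.0, 255.0) as u8;
    [to_u8(r), to_u8(g), to_u8(b)]
}

/// Convert three equally sized Y, U and V planes into packed RGBA bytes.
/// Returns None if the planes differ in length.
fn yuv444_to_rgba(y: &[u8], u: &[u8], v: &[u8]) -> Option<Vec<u8>> {
    if y.len() != u.len() || y.len() != v.len() {
        return None;
    }
    let mut rgba = Vec::with_capacity(y.len() * 4);
    for i in 0..y.len() {
        let [r, g, b] = yuv_to_rgb_bt601(y[i], u[i], v[i]);
        rgba.extend_from_slice(&[r, g, b, 255]); // opaque alpha
    }
    Some(rgba)
}
```

For a 1920 x 1080 frame the loop body runs about two million times per frame, and the same arithmetic maps naturally onto a fragment or compute shader, where each pixel is handled in parallel.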
Thanks for your reply. I profiled my pipeline, and color conversion did make up a huge part of the running time (around 80% of total time on my M1 Pro MacBook). By the way, I am curious why you consider codec support in WGPU the biggest leap. Is your point that, once WGPU has codec support, we could discard the gstreamer pipeline and eliminate the unnecessary memory copies?
That's right. If WGPU implements the native video decoding extensions for each API then that would result in almost no overhead since the memory doesn't need to move anywhere. For now I may investigate compute shaders as an alternative for speeding up the YUV -> RGB conversion.
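To put a rough number on the memory that has to move, the arithmetic can be sketched as follows. The helper names are hypothetical, and the sketch assumes tightly packed frames with even dimensions (no row padding).

```rust
/// Bytes in one tightly packed RGBA frame (4 bytes per pixel).
fn rgba_frame_bytes(width: u64, height: u64) -> u64 {
    width * height * 4
}

/// Bytes in one tightly packed NV12 frame: a full-resolution Y plane
/// plus a half-resolution interleaved UV plane, i.e. 1.5 bytes per pixel.
/// Assumes even width and height.
fn nv12_frame_bytes(width: u64, height: u64) -> u64 {
    width * height * 3 / 2
}

/// Bytes moved per second when every frame is copied `copies` times.
fn copy_bytes_per_second(frame_bytes: u64, fps: u64, copies: u64) -> u64 {
    frame_bytes * fps * copies
}
```

For example, `rgba_frame_bytes(1920, 1080)` is 8,294,400 bytes while `nv12_frame_bytes(1920, 1080)` is 3,110,400 bytes. Assuming 30 frames per second and two copies per frame (both figures are assumptions for the sake of the example), `copy_bytes_per_second` gives about 498 MB/s for RGBA and about 187 MB/s for NV12.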
But to my knowledge, gstreamer's appsink cannot return GPU memory (D3D, OpenGL, CUDA), so it seems that we would need to rewrite almost the entire gstreamer pipeline in Rust if we want to eliminate all unnecessary memory copies. That does sound like a hell of a lot of work.
Referring to the external memory extensions, actually gstreamer does expose |
I have implemented hardware accelerated NV12 to RGB conversion that does not rely on the WGPU feature gate in 9d60f26. With that, CPU usage has been reduced by around 30-40%. From my testing, the CPU usage is now comparable with other video players. At this point the only further CPU-side optimization that could be made is zero-copy frames (currently it copies from GPU to CPU to GPU), but without changes in wgpu that is not currently possible to avoid. As such, I will be closing this issue.
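For readers unfamiliar with NV12, the addressing that such a conversion has to perform can be sketched on the CPU as below. This is not the code from 9d60f26 (a shader would sample two textures instead of indexing a byte slice); it is a minimal sketch assuming a tightly packed buffer (stride equal to width) with even dimensions, and `nv12_sample` is a hypothetical helper name.

```rust
/// Fetch the (Y, U, V) triple for pixel (x, y) from an NV12 buffer.
/// Assumed layout: `width * height` luma bytes, followed by interleaved
/// U,V byte pairs, one pair shared by each 2x2 block of pixels.
/// Returns None for out-of-range coordinates, odd dimensions, or a
/// buffer that is too short.
fn nv12_sample(
    buf: &[u8],
    width: usize,
    height: usize,
    x: usize,
    y: usize,
) -> Option<(u8, u8, u8)> {
    if x >= width || y >= height || width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    if buf.len() < width * height * 3 / 2 {
        return None;
    }
    let luma = buf[y * width + x];
    // The UV plane starts right after the Y plane. Each UV row holds
    // width / 2 pairs of 2 bytes, so its row stride is `width` bytes.
    let uv_plane_start = width * height;
    let uv_index = uv_plane_start + (y / 2) * width + (x / 2) * 2;
    Some((luma, buf[uv_index], buf[uv_index + 1]))
}
```

Because the chroma plane is a quarter of the luma resolution, uploading NV12 and converting on the GPU moves 1.5 bytes per pixel instead of the 4 bytes per pixel of an RGBA frame.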
Dear author, this repo is the only lead I have for learning how to build a video player in Iced with gstreamer. Thanks a lot for sharing!
It works, but playback can be laggy when the video has a higher resolution such as 1920 x 1080. So I wonder whether the problem lies in the appsink callback, because it writes video data to the frame property, or in iced having trouble refreshing its Image from the frame data.
Those tools seem to lack debugging facilities (I can't find a way to debug gstreamer in Rust), so I ask: is there any way to improve the performance of video playback based on your code?
I've updated gstreamer to 0.21 and iced to 0.10 in dependencies.