Expose an explicit set/get low-latency versus "smoothing" MSE API rather than relying on implementation-specific, implicit bytestream hints that the stream might be "live" #21
Comments
It sounds like a set/get low-latency API might solve this.
This issue requests app control over the latency model, and that's clearly a new feature request. It might be possible to detect a live stream and use lower-latency buffering, but it's not clear that would be the best thing to do for all live streams. An API that lets the app communicate intent is likely needed to resolve this adequately. On V.Next already.
The Media Task Force has agreed to designate this issue as V.Next.
Feature proposals/requests:
a) The low-latency mode should also support video streams (e.g. H.264) with a single initial keyframe followed by P-frames only.
b) The low-latency mode should work well when each new video frame is appended individually to the source buffer.
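Proposal (b), appending each encoded frame individually, can be sketched as a small app-side queue around a SourceBuffer-like object, since MSE's `appendBuffer()` must not be called while a previous append is still pending (`updating === true`). This is an illustrative sketch, not part of any proposed API:

```javascript
// Per-frame appending sketch. Assumes the buffer object exposes MSE's real
// `updating` flag, `appendBuffer()` method, and `updateend` event; the
// FrameAppender class itself is invented for illustration.
class FrameAppender {
  constructor(sourceBuffer) {
    this.sb = sourceBuffer;
    this.queue = [];
    // When the previous append finishes, try to append the next frame.
    this.sb.addEventListener('updateend', () => this.flush());
  }
  push(frame) {
    // frame: an ArrayBuffer holding one encoded video frame (plus container)
    this.queue.push(frame);
    this.flush();
  }
  flush() {
    // appendBuffer() throws InvalidStateError while `updating` is true,
    // so only one append may be in flight at a time.
    if (!this.sb.updating && this.queue.length > 0) {
      this.sb.appendBuffer(this.queue.shift());
    }
  }
}
```

In a browser this would wrap a real `SourceBuffer` obtained from `MediaSource.addSourceBuffer()`; here the queue just serializes the appends.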
What you want to do, and the type of video data you use (a single starting keyframe followed by P-frames), is currently fundamentally incompatible with the SourceBuffer architecture and spirit. MSE requires regularly spaced keyframes to work, in particular in order to be able to evict data from the SourceBuffer. An alternative would be for SourceBuffer.remove to take either a percentage or a byte offset; seeking would then have to be disallowed.
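As a rough illustration of the byte-based eviction idea, an app could keep its own per-frame byte bookkeeping and drop the oldest frames once a budget is exceeded. This is hypothetical app-side logic, not the real `SourceBuffer.remove(start, end)`, which is specified in seconds and must respect keyframe boundaries:

```javascript
// Byte-budget eviction sketch (illustrative only). `frames` is an array of
// objects like { bytes: <frame size in bytes> }, oldest first; frames are
// dropped from the front until the total fits within `maxBytes`.
function evictToBudget(frames, maxBytes) {
  let total = frames.reduce((sum, f) => sum + f.bytes, 0);
  while (frames.length > 0 && total > maxBytes) {
    total -= frames.shift().bytes;
  }
  return frames;
}
```

With only one keyframe in the whole stream, this kind of eviction is exactly what breaks seeking: the evicted P-frames can never be re-decoded, which is why the comment above notes seeking would have to be disallowed.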
Yes, I see the point that this is, for now, fundamentally incompatible with the current MSE philosophy. But what I have in mind is: low-latency MSE is a very interesting feature for many applications, and as this issue shows, we are not the first ones interested in it ;-) And single-keyframe video streams are one important aspect of good low latency, I think. So extending the MSE architecture to make that possible would be useful and worth it.
Have there been any updates on this, or a real low-latency live mode for MSE vNext?
Not tangible, though I have discussed some approaches face-to-face with @jyavenard earlier this year. |
@greentorus / #21 (comment): It sounds like you're requesting a different feature (though with live low latency as the goal): seeking and SourceBuffer.remove (and background video suspension, and video track de/re-selection) would need to be constrained so they don't reconfigure the decoder, because the implementation would be unable to pre-roll from an ancient (and likely no longer buffered) keyframe to satisfy those scenarios. Have you considered using the MediaStream/WebRTC API instead? I propose we keep this issue (renamed and refocused) to be more like what #133 wants (an explicit MSE API to set/get the implementation's low versus "smoothing" latency model). Please file a separate issue if the "single keyframe plus lots of P frames" scenario is not a better fit for the MediaStream/WebRTC API.
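A minimal sketch of how an app might feature-detect such a set/get latency attribute, assuming a hypothetical `latencyHint` property on `MediaSource` (the name is invented for illustration; no such attribute is specified at the time of this thread):

```javascript
// Feature-detect a hypothetical MediaSource "latencyHint" attribute.
// "latencyHint" is an assumed, illustrative name, not a real MSE API.
function supportsLatencyHint() {
  return typeof MediaSource !== 'undefined' &&
    'latencyHint' in MediaSource.prototype;
}

// An app could then opt in and fall back gracefully, e.g.:
//   if (supportsLatencyHint()) { mediaSource.latencyHint = 'low'; }
//   else { /* keep conservative "smoothing" buffering */ }
```

The point of the set/get shape is that the app communicates intent explicitly instead of the implementation guessing from bytestream hints that the stream might be live.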
Yes, we are also considering the MediaStream/WebRTC API. However, compared to MSE, MediaStream/WebRTC involves a lot of unnecessary high-level complexity and protocol restrictions just to display a live video stream. Also, as a minor secondary reason, the MSE video pipeline seems better optimized for higher resolutions in many browser implementations. For example, the MSE implementation in Firefox on Windows appears to use hardware decoding via Windows Media Foundation, while its MediaStream implementation appears to use software-only decoding.
We don't care what the solution is, as long as it provides low latency, so an explicit latency model sounds good. However, real low latency seems impossible without "single keyframe plus lots of P-frames". For example, suppose the user has a 20 Mbps network connection. This supports a 2160p 60 fps video stream, typically with 1 keyframe per second. Depending on the scenario, a keyframe often consumes up to 1/2 of that total bandwidth or even more (in this case around 10 Mb), while the P-frames are very small (around 150 kb). This is no problem with high-latency buffering, but it means that transferring a keyframe takes 1/2 second, so the minimum possible latency is also 1/2 second. When only using P-frames, the minimum possible latency is 1/2 of 1/60 = 1/120 second.

Note that decreasing the number of keyframes per second decreases bandwidth but doesn't decrease latency, which stays at 1/2 second. The only exception seems to be not sending any keyframes after a single initial one; then, after some initialization hiccup, the minimum latency is 1/120 second. This problem is why we started experimenting with "single keyframe plus lots of P-frames". How could this problem be avoided in the "low vs. 'smoothing' latency model" proposal while still having regular keyframes?
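The arithmetic behind those figures is just frame size divided by link rate; a tiny helper makes the keyframe-versus-P-frame gap concrete (the numbers are the ones from the comment above):

```javascript
// Seconds needed to transfer one encoded frame over a link, given the frame
// size in bits and the link rate in bits per second.
function transferSeconds(frameBits, linkBitsPerSecond) {
  return frameBits / linkBitsPerSecond;
}

// ~10 Mb keyframe over a 20 Mbps link: 0.5 s, which floors end-to-end latency.
const keyframeTime = transferSeconds(10e6, 20e6);   // 0.5

// ~150 kb P-frame over the same link: 0.0075 s, well under the ~16.7 ms
// frame interval of a 60 fps stream.
const pFrameTime = transferSeconds(150e3, 20e6);    // 0.0075
```

This is why thinning keyframes out doesn't help latency: the moment a keyframe arrives at all, its transfer time alone dominates the latency floor.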
That's a good point! This sounds to me like a transport-layer issue rather than an issue with the proposed low-latency model. From my understanding of the feature need and the discussion, the low-latency model would conceptually disable the MSE receiver's jitter buffer, so that frames are rendered as they arrive. The handling of any artifacts, and of data loss whose consequence is a loss of playback "smoothness", is by definition pushed outside MSE, perhaps to the application/system layer.
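The "conceptually disable the jitter buffer" idea can be illustrated with a toy model: a "smoothing" receiver holds frames until a target buffer depth is reached, while a low-latency receiver releases each frame as soon as it arrives. The mode names and depth are invented for illustration; real implementations manage this internally:

```javascript
// Toy jitter-buffer model (illustrative only, not an MSE API).
// mode: 'low-latency' renders frames immediately; 'smoothing' waits until
// `targetDepth` frames are buffered before releasing them in a batch.
function makeReceiver(mode, targetDepth = 3) {
  const buffer = [];
  return {
    receive(frame) {
      buffer.push(frame);
      if (mode === 'low-latency') {
        return buffer.splice(0);       // render as frames come in
      }
      return buffer.length >= targetDepth
        ? buffer.splice(0)             // depth reached: release the batch
        : [];                          // keep absorbing network jitter
    }
  };
}
```

The trade-off is exactly the one discussed above: the smoothing receiver absorbs jitter at the cost of latency, and the low-latency receiver hands any resulting smoothness artifacts to the application layer.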
Migrated from w3c bugzilla tracker. For history prior to migration, please see:
https://www.w3.org/Bugs/Public/show_bug.cgi?id=28379
It was previously assigned to Adrian Bateman. The editors will sync soon to determine who should take this bug.