Simplifying and optimizing VideoFrame readback #157

chcunningham · 2021-03-26T22:16:07Z

(Full credit to @sandersdan for the following analysis and design. I'm just turning his doc into a bug.)

The VideoFrame.planes API allows reading back individual planes, but it has drawbacks:

The implementation may map and unmap the underlying frame for each plane, which can be slow when all planes are read.
It’s overly complicated to read all planes (partners have filed bugs showing its easy to get wrong)
Compared to its complexity, it doesn’t actually offer much control.

This issue proposes a simpler API for copying all planes at once.

Concepts

Region

A region is a {top, left, width, height} dictionary. VideoFrame should be updated to use this structure:

Add codedRegion, which is always {top: 0, left: 0, width: codedWidth, height: codedHeight}.
- We could deprecate codedWidth and codedHeight, but this only seems to complicate constructing VideoFrames.
Add visibleRegion, replacing the crop parameters.

The default region is visibleRegion.

elementSize

The number of bytes in a sample. Each Plane has an elementSize attribute.

Stride

The number of bytes from the start of one row to the start of the next row, must be at least elementSize * region.width.

The default stride is exactly elementSize * region.width.

Layout

A layout is a list of {offset, stride} dictionaries, one for each plane. offset is the location in a buffer that a plane starts.

The default offset is tightly packed (each plane immediately follows the previous), and either all offsets should be provided or none should be.

API

// Returns the size in bytes of a buffer that can hold a copy of |region|.
VideoFrame.allocationSize({region});
Plane.allocationSize({region, stride});

// Copies all planes into an output buffer |dst|.
VideoFrame.readInto(dst, {region, layout});  // returns |layout|
Plane.readInto(dst, {region, stride});       // returns |stride|

// Returns a new frame containing a copy of a region. Holding the new frame for an
// extended period of time will not stall decoding.
VideoFrame.copy({region});

Changes to the planes API

Plane
- Add codedRegion, visibleRegion.
- Add elementSize.
- Rename columns to codedWidth and rows to codedHeight.
- Deprecate length. Apps should call allocationSize().
- Deprecate stride. It’s not relevant anymore.
PlaneInit
- Deprecate rows.
- Add offset.

Examples

Copy visibleRegion to a new buffer (packed)

let buf = new ArrayBuffer(frame.allocationSize());
let layout = frame.readInto(buf);

// For 1080p I420, |layout| = [{offset: 0, stride: 1920},
//                             {offset: 2073600, stride: 960},
//                             {offset: 2592000, stride: 960}]

Copy codedRegion to a WASM frame pool

let region = frame.codedRegion;
let layout = wasmApp.allocateYUV(region.width, region.height);
frame.readInto(wasmApp.memory, {region, layout});

The text was updated successfully, but these errors were encountered:

dalecurtis · 2021-04-28T20:48:17Z

One of the key aspects of the current Planes.readInto() is that it's synchronous. Since frames are already being delivered asynchronously via decoding callbacks and the MediaStreamTrackProcessor, it would be unfortunate to add another hop in order to access information that may be readily available without an async hop.

I think this API needs to have a readIntoSync() method and potentially a readable attribute that indicates when readInto can be completed synchronously. E.g., the frame is a simple cpu memory backed frame or otherwise synchronously mappable. A texture backed frame would not be available for synchronous readback.

Without a readable attribute readIntoSync() would need to return null or otherwise fail, so I defer to the common spec idioms for whether we'd want an attribute versus just returning null.

sandersdan · 2021-04-28T23:27:38Z

Clarification: since the proposal @chcunningham posted, we determined that the default implementation of readInto() should be async so that we can do async texture readback in the implementation. Synchronous texture readback is strongly discouraged by Chrome's GPU experts.

That leaves a gap where some frames can be read synchronously but there is no way to express that without adding new methods. It's not yet clear how important the difference is, but the proposal to close the gap is as described by @dalecurtis above; add a fallible readIntoSync() and a readable attribute to signal when readIntoSync() is expected to work.

I'll also throw in my own bike shed, I have a preference to call the attribute readableSync as just readable is rather generic.

dalecurtis · 2021-04-28T23:31:56Z

Additional bike shed colors:

canReadSync
mappable

chcunningham · 2021-05-04T20:15:17Z

I opened some design discussion comments in the PR, but I'm resolving those in favor of focusing the discussion here where it's more visible/readable. Three points for discussion:

We need asynchronous readback (as Dan notes above). However, we may also want synchronous readback where feasible (with a naming bikeshed above for the method and capability signaling).
1. We are working to collect performance data on sync vs async to see if it really matters since one async method is much cleaner if there is no real performance loss.
Both sync and async methods under consideration in point 1 live on VideoFrame. Do we still have a need for a readInto() method (currently sync) on planes? Planes.readInto() affords the power to read just one plane when desired. A bit niche, and can be solved via other mechanisms using videoFrame.readInto().
1. example of another mechanism:
```
let options = {layout: [{},                // y plane, default layout
                        {discard: true},   // u plane, discarded
                        {discard: true}]}; // v plane, discarded
frame.readInto(buff, options);
```
If the outcome of point 2 is to remove planes.readInto(), should we also remove videoFrame.planes entirely? The remaining Planes attributes were mainly provided to assist in calling planes.readInto().

chcunningham · 2021-05-12T06:13:44Z

Triage note: marked 'breaking' as, as we are proposing a whole new readback API and discussing removal of old paths.

chcunningham · 2021-05-13T22:12:18Z

Another consideration: naming bikeshed method to get the pixel data

currently frame.readInto(dest)
encoded*Chunk interfaces now have chunk.copyTo(dest). Decided against copy*In*to(), as described in this thread. So maybe this should be simply frame.readTo(dest)
Open question whether read prefix is useful vs just calling this copy.. folks seem to lean keeping read as part of history (video frame "read back") and hint that this isn't always a simple copy from cpu memory (e.g. may need copy and unswizzle from gpu / texture etc). Personally, I like 'read' prefix ok.

In conclusion, I propose we call it videoFrame.readTo(dest).

dalecurtis · 2021-05-13T22:43:44Z

I think the difference is subtle enough that if every other interface will have copyTo() instead, then calling this one copyTo() sounds better to me in terms of consistency.

chcunningham · 2021-05-13T22:54:40Z

I'm also fine with that.

padenot · 2021-05-19T13:53:25Z

3\. If the outcome of point 2 is to remove planes.readInto(), should we also remove videoFrame.planes entirely? The remaining Planes attributes were mainly provided to assist in calling planes.readInto().

Access to each plane individually is also needed for WebGPU/WebGL interrop, although we don't really know the preferred API shape.

copyTo is fine to me if we want uniformity, yes.

chcunningham · 2021-06-11T22:35:35Z

Fixed by #208

chcunningham changed the title ~~Simplifying and optimizing VideoFrame~~ Simplifying and optimizing VideoFrame read-back Mar 29, 2021

chcunningham changed the title ~~Simplifying and optimizing VideoFrame read-back~~ Simplifying and optimizing VideoFrame readback Mar 29, 2021

chcunningham mentioned this issue Mar 29, 2021

VideoFrame.createImageBitmap returns an ImageBitmap with coded resolution rather than display resolution #119

Closed

chcunningham added this to the 2021 Q1 milestone Mar 29, 2021

sandersdan mentioned this issue Apr 26, 2021

webrtc and webcodec hardware VideoFrame no planes #192

Closed

chcunningham mentioned this issue Apr 30, 2021

Simplify VideoFrame Readback #208

Merged

chcunningham mentioned this issue May 5, 2021

Rename AudioFrame->AudioData, drop AudioBuffer, add ref counting semantics. #162

Merged

chcunningham added the breaking Interface changes that would break current usage (producing errors or undesired behavior). label May 12, 2021

sandersdan mentioned this issue May 19, 2021

Remove crop* from VideoDecoderConfig #250

Merged

chcunningham closed this as completed Jun 11, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simplifying and optimizing VideoFrame readback #157

Simplifying and optimizing VideoFrame readback #157

chcunningham commented Mar 26, 2021

dalecurtis commented Apr 28, 2021 •

edited

Loading

sandersdan commented Apr 28, 2021 •

edited

Loading

dalecurtis commented Apr 28, 2021

chcunningham commented May 4, 2021 •

edited

Loading

chcunningham commented May 12, 2021

chcunningham commented May 13, 2021

dalecurtis commented May 13, 2021

chcunningham commented May 13, 2021

padenot commented May 19, 2021

chcunningham commented Jun 11, 2021

Simplifying and optimizing VideoFrame readback #157

Simplifying and optimizing VideoFrame readback #157

Comments

chcunningham commented Mar 26, 2021

Concepts

Region

elementSize

Stride

Layout

API

Changes to the planes API

Examples

Copy visibleRegion to a new buffer (packed)

Copy codedRegion to a WASM frame pool

dalecurtis commented Apr 28, 2021 • edited Loading

sandersdan commented Apr 28, 2021 • edited Loading

dalecurtis commented Apr 28, 2021

chcunningham commented May 4, 2021 • edited Loading

chcunningham commented May 12, 2021

chcunningham commented May 13, 2021

dalecurtis commented May 13, 2021

chcunningham commented May 13, 2021

padenot commented May 19, 2021

chcunningham commented Jun 11, 2021

dalecurtis commented Apr 28, 2021 •

edited

Loading

sandersdan commented Apr 28, 2021 •

edited

Loading

chcunningham commented May 4, 2021 •

edited

Loading