Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify VideoFrame Readback #208

Merged
merged 13 commits into from
Jun 11, 2021
Merged

Simplify VideoFrame Readback #208

merged 13 commits into from
Jun 11, 2021

Conversation

chcunningham
Copy link
Collaborator

@chcunningham chcunningham commented Apr 30, 2021

Define new concepts of Regions and Layout as described in #157.

  • Regions make it easier to articulate what part of the frame is desired for reading out (e.g. only the "visible" (non padded) region).
  • Layouts make it easier to describe the position and padding of a plane within a buffer. Useful during Plane construction as well as when reading out Planes to a BufferSource.

Preview | Diff

@chcunningham chcunningham changed the base branch from main to videoframe_clone_close April 30, 2021 08:23
@chcunningham
Copy link
Collaborator Author

@padenot @aboba This makes the interface changes but does not yet include spec steps. Uploaded as-is for discussion during (or prior) to tomorrow's meeting.

@chcunningham
Copy link
Collaborator Author

Editors call:
@padenot generally looks good, going to loop in graphics folks shortly
@bernard, no objection

@padenot
Copy link
Collaborator

padenot commented May 3, 2021

@padenot generally looks good, going to loop in graphics folks shortly

(I've contacted them).

Copy link
Collaborator Author

@chcunningham chcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some additional comments for design discussion.

index.src.html Outdated Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
Copy link
Collaborator Author

@chcunningham chcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added some additional comments for design discussion.

On second thought, lets move the design discussion to #157 for better visibility/readability. Please see this recent comment.

@padenot generally looks good, going to loop in graphics folks shortly

(I've contacted them).

Thanks. Standing by. Please give this issue priority. Of all our v1 launch blockers, this one has the most open questions.

index.src.html Outdated Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
@padenot
Copy link
Collaborator

padenot commented May 10, 2021

Thanks. Standing by. Please give this issue priority. Of all our v1 launch blockers, this one has the most open questions.

I'm relaying the comments I've received, paraphrasing to add a bit more context:

The changes look like a good idea overall. A few thoughts:

Most graphics APIs use Region to mean a complex shape (like a set of rectangles, or polygons). Can we just use the more-common Rect for top/left/width/height? Could we just use the existing DOMRect instead of defining a new dictionary type?

PlaneInit now takes a BufferSource and an offset, what happens if someone constructs a VideoFrame with a set of planes that each refer to different buffers? The idea of having a per-plane offset suggests that we expect all planes to be within a single buffer, so I think PlaneInit should be a single BufferSource plus a Sequence of PlaneLayouts.

Alternatively we should truly allow for separate buffers per-plane, which in theory should be possible (and describable by our internal C++ descriptors in Gecko's implementation), but doesn't seem to happen in practice (from the existing decoders in Gecko).

As is being discussed, adding an option to do an async readback seems like a really good idea.

@chcunningham
Copy link
Collaborator Author

Thanks @padenot

Most graphics APIs use Region to mean a complex shape (like a set of rectangles, or polygons). Can we just use the more-common Rect for top/left/width/height? Could we just use the existing DOMRect instead of defining a new dictionary type?

The main thing we disliked about DOMRect was that it's values are floating point. There are media APIs that handle the crop as floating point values, but this only makes sense for rendering, not for copying raw bytes.

We definitely aren't attached to the name "Region". VideoFrameRect or similar would be fine. We also slightly prefer the top/left naming to DOMRect's x/y, but that's not too important.

PlaneInit now takes a BufferSource and an offset, what happens if someone constructs a VideoFrame with a set of planes that each refer to different buffers? The idea of having a per-plane offset suggests that we expect all planes to be within a single buffer, so I think PlaneInit should be a single BufferSource plus a Sequence of PlaneLayouts.

Alternatively we should truly allow for separate buffers per-plane, which in theory should be possible (and describable by our internal C++ descriptors in Gecko's implementation), but doesn't seem to happen in practice (from the existing decoders in Gecko).

The current shape is flexible: each plane can use the same buffer or a different buffer. The offset refers to whatever they give (should default to zero).

But, we agree that using the same buffer will be more common. We're open to taking that path if you prefer. If so, we propose the following single-buffer constructor
VideoFrame(BufferSource buffer, VideoFramePlaneInit init)

And we would add a new sequence<PlaneLayout> layout; member to VideoFramePlaneInit to describe the offset/stride of each plane in the single buffer.

As is being discussed, adding an option to do an async readback seems like a really good idea.

Awesome.

@chcunningham chcunningham changed the base branch from videoframe_clone_close to main June 2, 2021 03:49
@chcunningham
Copy link
Collaborator Author

@padenot I still need to define some spec steps to tie everything together, but this should now at least be at a point where the definitions are all there and crop stuff is fully removed. I'll finish it up Thursday afternoon.

... and VideoFrame codedRect and visibleRect attributes.

Still TODO: define steps for copyTo and allocationSize. Probably a *big*
TODO.
@chcunningham
Copy link
Collaborator Author

@padenot

The latest commit does everything except impl steps for vf.allocationSize() and vf.copyTo(). Listing those steps is so far looking like a large amount of work. Still pondering the approach.

Re: #165, I think we at least want each VF pixel format to specify

  • num planes
  • sample size of plane[i]
  • bytes per sample of plane[i]

Maybe more than that, but those were the dfns I found myself wanting so far. Chat tomorrow.

@chcunningham
Copy link
Collaborator Author

@padenot @aboba this is now completely specified and ready for review.

@sandersdan you might double check the new VF algos.

@padenot, above I wrote:

Re: #165, I think we at least want each VF pixel format to specify

  • num planes
  • sample size of plane[i]
  • bytes per sample of plane[i]

Having now written the algorithms, I would add to that list

  • "plane"
  • stride

Copy link
Collaborator

@padenot padenot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, a few questions though.

Makefile Outdated Show resolved Hide resolved
3. Increment |row| by `1`.

4. Let |resourceCodedWidth| be the coded width of |resource|.
5. Let |resourceCodedHeight| be the coded height of |resource|.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link those two.

|resource|.
8. Let |resourceCropTop| be the top offset of the crop origin of
7. Let |resourceVisibleTop| be the top offset for the visible rectangle of
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Link as well.

2. Assign {{VideoFrame/[[coded width]]}} and
{{VideoFrame/[[coded height]]}} to {{VideoFrameRect/width}} and
{{VideoFrameRect/height}} respectively.
2. Return |rect|.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note that (0,0) is top left and that this matches canvas?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one is pretty clear since we called our attributes "top" and "left", but I'm happy to add a note if you still like.

index.src.html Show resolved Hide resolved
3. Otherwise (|resource| does not use a recognized {{PixelFormat}}):
1. Assign `""` to {{VideoFrame/format}}.
2. Assign `null` to {{VideoFrame/planes}}.
2. If |resource| uses a recognized {{PixelFormat}}, assign the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll link this after my PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect!


1. Let |plane| be the Plane identified by |planeIndex| as defined by
{{VideoFrame/[[format]]}}.
2. Let |sampleWidth| be the horizontal pixel size of each subsample
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll link this as well.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

index.src.html Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
@chcunningham
Copy link
Collaborator Author

Editors call: good to merge % comments

@chcunningham
Copy link
Collaborator Author

VideoFrame(BufferSource buffer, sequence layout, VFPlaneInit init)

but this drops ability to add alpha without a copy. options to extend (if needed):

VideoFrame((BufferSource or sequence) buffer, sequence layout, VFPlaneInit init)

or We could add it to VFPlaneInit

Copy link
Collaborator Author

@chcunningham chcunningham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing! I left the link comments unresolved so it would be easy to find them later. A few minor comments also still open w/ reply questions, but I'll send a quick follow up patch for those if needed.

Makefile Outdated Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
index.src.html Outdated Show resolved Hide resolved
index.src.html Show resolved Hide resolved
3. Otherwise (|resource| does not use a recognized {{PixelFormat}}):
1. Assign `""` to {{VideoFrame/format}}.
2. Assign `null` to {{VideoFrame/planes}}.
2. If |resource| uses a recognized {{PixelFormat}}, assign the
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect!


1. Let |plane| be the Plane identified by |planeIndex| as defined by
{{VideoFrame/[[format]]}}.
2. Let |sampleWidth| be the horizontal pixel size of each subsample
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack

6. If |displayAspectWidth| and |displayAspectHeight| are provided,
1. Let |frame| be a new {{VideoFrame}}, constructed as follows:
1. Assign `false` to {{VideoFrame/[[detached]]}}.
2. Let |resource| be the [=media resource=] described by |output|.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say it as "resource ownership is shared by all the frames that reference the resource. it may additionally be shared by a codec that is using/emitting the resource"

But I worry I maybe misunderstand your intent? We document the sharing in the Memory Model section. We also have this note in decode():

NOTE: Authors should call close() on output VideoFrames immediately when frames are no longer needed. The underlying media resources are owned by the VideoDecoder and failing to release them (or waiting for garbage collection) may cause decoding to stall.

LMK if more is needed, will send a quick follow up.

index.src.html Show resolved Hide resolved
2. Assign {{VideoFrame/[[coded width]]}} and
{{VideoFrame/[[coded height]]}} to {{VideoFrameRect/width}} and
{{VideoFrameRect/height}} respectively.
2. Return |rect|.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this one is pretty clear since we called our attributes "top" and "left", but I'm happy to add a note if you still like.

@chcunningham chcunningham merged commit a96a6d7 into main Jun 11, 2021
@chcunningham chcunningham deleted the videoframe_regions branch June 11, 2021 22:34
github-actions bot added a commit that referenced this pull request Jun 11, 2021
SHA: a96a6d7
Reason: push, by @chcunningham

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit that referenced this pull request Jun 11, 2021
SHA: a96a6d7
Reason: push, by @chcunningham

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit that referenced this pull request Jun 11, 2021
SHA: a96a6d7
Reason: push, by @chcunningham

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to tguilbert-google/webcodecs that referenced this pull request Jun 15, 2021
SHA: a96a6d7
Reason: push, by @tguilbert-google

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to tguilbert-google/webcodecs that referenced this pull request Jun 15, 2021
SHA: a96a6d7
Reason: push, by @tguilbert-google

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
github-actions bot added a commit to tguilbert-google/webcodecs that referenced this pull request Jun 15, 2021
SHA: a96a6d7
Reason: push, by @tguilbert-google

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants