Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

libobs: Add Broadcast Performance Metrics (BPM) #11004

Merged
merged 5 commits into from
Sep 5, 2024

Conversation

lexano-ivs
Copy link
Contributor

@lexano-ivs lexano-ivs commented Jul 18, 2024

Description

This PR implements initial support for Broadcast Performance Metrics (BPM). BPM currently sends the following metrics inband with the video bitstream using SEI (for AVC/HEVC) and OBU (for AV1) with each IDR-frame, which is typically at 2 second intervals:

  • UTC-based wall-clock timestamps using RFC3339 format
  • Session frame counters: rendered, lagged, dropped, output
  • Encoded rendition frame counters: input, skipped, output

The metrics are intended to be used by live-streaming services such as Twitch and Amazon IVS, for example in plugins such as multitrack/eRTMP. BPM generation and delivery is enabled for the output by registering the bpm_injection() callback with obs_output_add_packet_callback(). This packet callback mechanism is introduced with this PR after the initial implementation (within libobs) was removed in favour of a callback API at the packet processing level.

Detailed BPM documentation is available in the Multitrack Video Integration Guide.

Motivation and Context

To enable ongoing improvement of automatic stream configuration, to deliver the best possible stream settings, broadcast performance metrics (BPM) must be measured and sent. The metrics are collected and sent in-band via either SEI (for AVC/HEVC) or OBU (AV1) messages. Two classes of data are collected:

  • Timestamps are collected to measure end-to-end latency between the broadcaster and the viewer. They are useful for:

    • Providing the broadcaster or audience with an estimate of end-to-end latency
    • Analyzing timestamp jitter that may indicate system stress or poor first-mile network connectivity
    • Referencing real-world event time for aligning and aggregating time-series counter data
  • Frame counters are collected to measure the performance of the broadcast software and video encoders at the frame level. They are useful for:

    • Providing broadcasters with a performance dashboard that includes additional signals, to help
      them improve their streaming setup
    • Providing a proactive signal that may correlate with environmental changes like newly released GPU drivers or OS versions/patches
    • Providing feedback to enable video services to safely iterate and release improvements to GetClientConfiguration, including support for new hardware vendors, new GPU models, new codecs, new driver features, additional video-encoder setting tuning, and new user-controlled presets (e.g., “Dual PC Setup” vs. “Gaming+Streaming Setup”).

The timestamp sent from the broadcaster is based on a global common reference clock, typically an NTP-synchronized clock using the UTC+0 timezone. RFC3339 is commonly used for this scenario of "Internet time." This provides an absolute reference, making temporal difference calculations trivial. The timestamps have millisecond resolution which is necessary to provide resolution suitable for measuring end-to-end latency, and to provide enough resolution for frame-level timestamps.

The events being timestamped for video frames are:

  1. CTS (Composition timestamp): When the composited frame was rendered by OBS
  2. FER (Frame encode request timestamp): When the frame was requested to be encoded
  3. FERC (Frame encode request complete timestamp): When the frame encode request was completed
  4. PIR (Packet interleave request): When the compressed frame (packet) was sent to the output plugin.

The time difference between PIR and CTS represents the majority of the client latency.

How Has This Been Tested?

BPM was tested in the Twitch Enhanced Broadcasting beta, streaming multi-rendition and single-track sessions into the Twitch ingest service, parsing the metrics, and sending the results to the control plane back-end. Additionally, it was tested with AVC, HEVC, and AV1, with and without closed captions injections. BPM and closed captions use a very similar mechanism of injecting data using SEI/OBU hence the need to test both alone and simultaneously. Multiple bitstreams were also captured and verified with a stream analyzer for correctness.

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code has been run through clang-format.
  • I have read the contributing document.
  • My code is not on the master branch.
  • The code has been tested.
  • All commit messages are properly formatted and commits squashed where appropriate.
  • I have included updates to all appropriate documentation.

@alinsavix
Copy link
Contributor

I'm curious, does/will this information get passed through to the client side (of twitch, at least)? Because that would be really useful.

(For bonus points, make it so the timestamps represent the time the associated frame was created, rather than when it was output from the encoder, which would not only make them even more useful, but also measure encoder delay as part of the end-to-end measurement. At least, I don't think that's what it's doing right now)

libobs/obs-output.h Outdated Show resolved Hide resolved
libobs/obs-output.c Outdated Show resolved Hide resolved
@koolscooby
Copy link

I'm curious, does/will this information get passed through to the client side (of twitch, at least)? Because that would be really useful.

Short answer: yes. Long answer, a few points!

  • Yes, the embedded SEIs do get passed through transmuxing to the delivered video segments on Twitch.
  • However, we haven't decided exactly if/how we will intentionally expose it to either broadcasters or viewers; we are considering ideas like putting it in Twitch Inspector or in "Video Stats for Nerds" in the player. First, we have to validate that the data pipeline works at scale and that we trust the data to expose it to end users.
  • In the interim, this data is absolutely crucial for us to continue iterating on Automatic Stream Configuration settings (e.g. deploying HEVC or 1440p/4k ladders.) Without it, we can't safely iterate and change client settings (because we don't want to break users' performance.)

@norihiro
Copy link
Contributor

The title of the commit fea442e should start from obs-webrtc: instead of WHIP: .

@WizardCM WizardCM added the New Feature New feature or plugin label Jul 20, 2024
@tt2468
Copy link
Member

tt2468 commented Jul 20, 2024

A few things:

  • I would highly prefer that we include a millisecond-accurate signed 64 bit integer UNIX epoch timestamp, instead of an RFC3339 string.
  • Would it be possible to split out BPM from libobs, and instead add functionality to add private SEI payloads for encoders? Something along the lines of obs_encoder_add_private_sei(obs_encoder_t *, void *buf, size_t size). The location of the BPM code could probably be in the UI, running off of a QTimer to trigger.

@lexano-ivs lexano-ivs force-pushed the lexano/bpm branch 2 times, most recently from 3fd5043 to 1c9f5bc Compare July 22, 2024 21:18
@lexano-ivs
Copy link
Contributor Author

The title of the commit fea442e should start from obs-webrtc: instead of WHIP: .

Thank you @norihiro. I've updated the commit with the correct name.

@lexano-ivs
Copy link
Contributor Author

A few things:

* I would highly prefer that we include a millisecond-accurate signed 64 bit integer UNIX epoch timestamp, instead of an RFC3339 string.

In the integration guide, on page 28, we defined a type system for timestamp data. timestamp_type==2 is the duration_since_epoch_ts, as well as a delta_ts format to signal differences. At the moment we're using type 1, which is the RFC3339 format, mainly because it was the best way for us to get started in the beta testing because the ingest servers were already speaking RFC3339. We can add type 2 and type 3 formats in the future, and any parser would have to respect them. Adding type 2 and/or type 3 (or any other types we come up with) is on the backlog, just simply not scoped/planned right at the moment.

* Would it be possible to split out BPM from libobs, and instead add functionality to add private SEI payloads for encoders? Something along the lines of `obs_encoder_add_private_sei(obs_encoder_t *, void *buf, size_t size)`. The location of the BPM code could probably be in the UI, running off of a QTimer to trigger.

I need to think about this a bit more. Initial thought revolves around trigger. The current implementation is triggered from the send_interleaved() function because it was the best spot I could find to identify IDR frames to align with. We need the data aligned to IDR frames, so moving the trigger to an async QTimer seems like it might introduce jitter between the IDR time and the QTimer firing, with the assumption the we want tight (preferably synchronous) alignment.

Copy link
Member

@tt2468 tt2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to suggest that we implement a packet callback for outputs. This callback would be triggered for every video packet, closely before being passed to the output to be transmitted. The BPM SEI rendering code can be moved to the common folder in the base of this repository, similar to happy eyeballs and media-playback.

Here some suggested APIs:

  • Typedefs
    • struct array_output_data *(*obs_output_packet_callback_t)(obs_output_t *, struct encoder_packet *pkt, struct encoder_packet_timing *timing, void *)
  • Functions
    • `obs_output_add_packet_callback(obs_output_t *output, obs_packet_callback_t callback, void *data);
    • obs_output_add_packet_remove(obs_output_t *output, obs_packet_callback_t callback, void *data);

It is important to note that this callback is implementer per-output, so the primary buffer in the packet is going to likely need to be duplicated before libobs inserts the SEI payload, to prevent SEI data from being sent to other outputs subscribed to the same encoders.

libobs/obs-internal.h Outdated Show resolved Hide resolved
libobs/obs-internal.h Outdated Show resolved Hide resolved
libobs/obs-internal.h Outdated Show resolved Hide resolved
@lexano-ivs
Copy link
Contributor Author

I would like to suggest that we implement a packet callback for outputs. This callback would be triggered for every video packet, closely before being passed to the output to be transmitted. The BPM SEI rendering code can be moved to the common folder in the base of this repository, similar to happy eyeballs and media-playback.

Here some suggested APIs:

* Typedefs
  
  * `struct array_output_data *(*obs_output_packet_callback_t)(obs_output_t *, struct encoder_packet *pkt, struct encoder_packet_timing *timing, void *)`

* Functions
  
  * `obs_output_add_packet_callback(obs_output_t *output, obs_packet_callback_t callback, void *data);
  * `obs_output_add_packet_remove(obs_output_t *output, obs_packet_callback_t callback, void *data);`

It is important to note that this callback is implementer per-output, so the primary buffer in the packet is going to likely need to be duplicated before libobs inserts the SEI payload, to prevent SEI data from being sent to other outputs subscribed to the same encoders.

Thanks for the suggestions. The "BPM SEI rendering code" can be thought of as 2 parts: 1. The calculations of the frame counters and timing deltas (once we implement delta time); and 2. The rendering of this data into SEI payload syntax. Are you envisioning moving both parts 1 and 2 into the callback, or just part 2? I'm leaning towards both parts should be handled in the callback function, but wanted to understand what you were thinking as well.

Also, if we employ this approach, the closed caption SEI could be supported with the same callback technique. I'd like to keep the migration of the current closed caption code to the potentially new callback mechanism as a separate workstream, and focus purely on the BPM support for this PR. Is this OK with you?

I'd also like to understand if this packet callback should be available for use by any plugin, or would this be only for "internal" plugins (for lack of a better term) compiled natively with OBS? I believe it's the former (any plugin could use it), but would like to know for sure.

@lexano-ivs
Copy link
Contributor Author

With the latest updates, frame timing is now propagated from the encoder array to each output array. This allows the frame timing to work in with multiple output services in tandem. The initial implementation would drain the encoder frame timing array after the first consumption of the data.

I'm moving the PR to "Ready for review", and will work on renaming bpm_frame_time to encoder_packet_timing (or something along these lines).

@lexano-ivs lexano-ivs marked this pull request as ready for review August 9, 2024 14:43
@tt2468
Copy link
Member

tt2468 commented Aug 10, 2024

Thanks for the suggestions. The "BPM SEI rendering code" can be thought of as 2 parts: 1. The calculations of the frame counters and timing deltas (once we implement delta time); and 2. The rendering of this data into SEI payload syntax. Are you envisioning moving both parts 1 and 2 into the callback, or just part 2? I'm leaning towards both parts should be handled in the callback function, but wanted to understand what you were thinking as well.

Uhh I think both parts would be handled in the callback? I don't know if the frame counters and timing deltas are before or after the encoder_packet_timing struct finalization.

Also, if we employ this approach, the closed caption SEI could be supported with the same callback technique. I'd like to keep the migration of the current closed caption code to the potentially new callback mechanism as a separate workstream, and focus purely on the BPM support for this PR. Is this OK with you?

It is definitely not necessary to do any immediate migration of the captions code to this callback, just as long as it's kept in mind.

I'd also like to understand if this packet callback should be available for use by any plugin, or would this be only for "internal" plugins (for lack of a better term) compiled natively with OBS? I believe it's the former (any plugin could use it), but would like to know for sure.

As with other libobs APIs of similar design, this would be available to any code which has a copy of the encoder's pointer. That is usually the same code that created the encoder, but there are places where a third party plugin could fetch an encoder it is not managing, like through the frontend API. Realistically, if you register a callback of your own on this API, you would be expected to disconnect the callback as a part of your own cleanup procedure, or it would be disconnected by libobs if the encoder is destroyed.

@lexano-ivs
Copy link
Contributor Author

@tt2468 @RytoEX I've pushed a series of commits earlier today that should satisfy the requested changes:

  1. Renamed bpm_frame_time to encoder_packet_time
  2. encoder_packet_time handling remains within libobs and flows from encoding functions (stored in the encoder_t packet_times array) to the output_t packet_times arrays.
  3. BPM code has been completely removed from libobs, and moved to deps/bpm
  4. The enable_bpm flag has been removed, along with the public API.
  5. A packet callback mechanism has been created. Registered packet callbacks are invoked from a loop in send_interleaved().
  6. BPM is the first user of the new packet callback mechanism. bpm_injection() is introduced at the callback function as well as bpm_destroy() to deallocate. bpm.h is the public interface to BPM and simply contains the 2 function signatures.
  7. BPM is enabled in multitrack video by using the new packet callback add/remove functions.
  8. I've disabled BPM in WebRTC for the moment as it needs more testing with the new mechanism.

We can squash the new commits into old ones, and/or rework the branch as needed. I'd like to get an initial review first as I don't want to rewrite the branch history until the new code seems OK. I'll definitely keep a local copy of the original branch though for safekeeping before squashing anything.

@lexano-ivs lexano-ivs requested a review from tt2468 August 21, 2024 17:55
Copy link
Member

@tt2468 tt2468 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a reminder that we'll want to also update the docs in this PR with the new methods. You may also want to include a note that any reallocation of the packet's buffer MUST be done with a util that allocates with bmem, otherwise a memory leak may occur.

We will want to verify that the packet struct is copied appropriately and that no memory leaks are occuring. I assume you've already done that, though.

deps/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Collaborator

@tytan652 tytan652 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused

plugins/obs-outputs/CMakeLists.txt Outdated Show resolved Hide resolved
plugins/obs-outputs/CMakeLists.txt Outdated Show resolved Hide resolved
@lexano-ivs lexano-ivs force-pushed the lexano/bpm branch 2 times, most recently from e8ed4c7 to 7c70634 Compare August 23, 2024 18:00
@lexano-ivs
Copy link
Contributor Author

@tt2468 I've rebased against master as of this morning, fixed conflicts, moved from deps/bpm to shared/bpm, and added a commit with documentation for the add/remove packet callback functions. This should be close to complete now, and if it looks OK, I can squash commits to make it prettier.

Regarding memory leaks, I have been running with debug mode and checking for the num of memory leaks, which is 0, so I'm confident there's no issue there.

I don't understand why the "gersemi" validation is complaining about the format of the CMakeLists.txt files and could use some guidance on what might be wrong.

@kkartaltepe
Copy link
Collaborator

I don't understand why the "gersemi" validation is complaining about the format of the CMakeLists.txt files and could use some guidance on what might be wrong.

Gersemi si the new cmake formatter being used in CI, you should change from using cmake-format to https://github.com/BlankSpruce/gersemi to keep your files formatted.

shared/bpm/bpm.h Outdated Show resolved Hide resolved
lexano-ivs and others added 4 commits August 27, 2024 10:53
Introduce support for the `encoder_packet_time` struct
to capture timing information for each frame, starting
from the composition of each frame, through the encoder,
to the queueing of the frame data to each output_t.

Timestamps for each of the following events are based on
`os_gettime_ns()`:

CTS: Composition time stamp (in the encoder render threads)
FER: Frame encode request
FERC: Frame encoder request completely
PIR: Packet interleave request (`send_interleaved()`)

Frame times are forwarded through encoder callbacks in the
context that runs on the relevant encoder thread, ensuring
no race conditions with accessing per encoder array happen.
All per-output processing happens on data that is owned by
the output.

Co-authored-by: Ruwen Hahn <haruwenz@twitch.tv>
Packet callbacks are invoked in `send_interleaved()` and
are useful for any plugin to extend functionality at the
packet processing level without needing to modify code in
libobs. Closed caption support is one candidate that is
suitable for migration to a packet callback.

The packet callback also supports the new encoder packet
timing feature. This means a registered callback will have
access to both the compressed encoder packet and the associated
encoder packet timing information.
Introduce support for delivering BPM (Broadcast
Performance Metrics) over SEI (for AVC/H.264 and
HEVC/H.265) and OBU (for AV1) unregistered messages.
Metrics being sent are the session frame counters,
per-rendition frame counters, and RFC3339-based
timestamping information to support end-to-end
latency measurement.

SEI/OBU messages are generated and sent with each IDR
frame, and the frame counters are diff-based, meaning
the counts reflect the diff between IDRs, not the running
totals.

BPM documentation is available at [1].

BPM relies on the recently introduced encoder packet timing
support and the packet callback mechanism.

BPM injection is enabled for an output by registering
the `bpm_inject()` callback via `obs_output_add_packet_callback()`
function. The callback must be unregistered using
`obs_output_remove_packet_callback()` and `bpm_destroy()`
must be used by the caller to release the BPM structures.

It is important to measure the number of frames successfully
encoded by the obs_encoder_t instances, particularly for
renditions where the encoded frame rate differs from the
canvas frame rate. The encoded_frames counter and
`obs_encoder_get_encoded_frames()` API is introduced
to measure and report this in the encoded rendition
metrics message.

[1] https://d50yg09cghihd.cloudfront.net/other/20240718-MultitrackVideoIntegrationGuide.pdf
@lexano-ivs
Copy link
Contributor Author

@tt2468 I've squashed the commits now. Much cleaner. Please review and let me know next steps, if any.

@lexano-ivs lexano-ivs requested a review from tt2468 August 27, 2024 16:08
Copy link
Member

@RytoEX RytoEX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parts I reviewed seem fine. Builds and runs on Windows, and confirmed that the functionality is working. This has gone through multiple rounds of review and design revision, and it's as good as it's likely to get before merging.

@RytoEX RytoEX added this to the OBS Studio 31 milestone Sep 5, 2024
@RytoEX RytoEX merged commit ea9e033 into obsproject:master Sep 5, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
New Feature New feature or plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants