Gapless audio playback on multi-period DASH source #4899

Closed
ghexoplayerquestion opened this issue Oct 2, 2018 · 4 comments

@ghexoplayerquestion

Issue description

I am playing a custom-created MPEG-DASH manifest that includes multiple periods. Each period contains a single FMP4 segment that was created with ffmpeg using the following command line:
ffmpeg -i - -f segment -segment_atclocktime 1 -strftime 1 -c:a libfdk_aac -b:a 32k -segment_format mp4 -segment_format_options movflags=empty_moov+default_base_moof+frag_keyframe ~/test/%FT%H-%M-%S%z.mp4
Each period contains a single segment, and uses the period duration along with the presentationTimeOffset to trim the first and last sample off of the period. The segment durations, and thus the period durations, vary, which is why each segment is in its own period.

When playing back, there is an audible gap between each period. Because the period start-times are configured without a gap, I would expect the audio playback to be gapless.

Reproduction steps

A reproduction app is available at:
https://github.com/ghexoplayerquestion/Repro

with the relevant code in the Android activity at:
https://github.com/ghexoplayerquestion/Repro/blob/master/app/src/main/java/com/example/ghexoplayerquestion/repro/MainActivity.java

During playback, the following is output on the debug console:

I/ExoPlayerImpl: Init 42b7916 [ExoPlayerLib/2.9.0] [generic_x86, Android SDK built for x86, Google, 28]
I/Choreographer: Skipped 48 frames!  The application may be doing too much work on its main thread.
I/OpenGLRenderer: Davey! duration=827ms; Flags=0, IntendedVsync=69241345581930, Vsync=69242145581898, OldestInputEvent=9223372036854775807, NewestInputEvent=0, HandleInputStart=69242153074100, AnimationStart=69242153958800, PerformTraversalsStart=69242156035700, DrawStart=69242157118400, SyncQueued=69242158809400, SyncStart=69242159321100, IssueDrawCommandsStart=69242159614500, SwapBuffers=69242164402600, FrameCompleted=69242173244900, DequeueBufferDuration=1560000, QueueBufferDuration=3441000, 
W/VideoCapabilities: Unrecognized profile 4 for video/hevc
I/VideoCapabilities: Unsupported profile 4 for video/mp4v-es
D/NetworkSecurityConfig: No Network Security Config specified, using platform default
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
I/OMXClient: IOmx service obtained
I/ACodec: codec does not support config priority (err -2147483648)
W/AudioTrack: getTimestamp() location moved from kernel to server
D/AudioTrack: stop() called with 587776 frames delivered

Link to test content

The DASH manifest that reproduces this issue is:

<MPD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" profiles="urn:mpeg:dash:profile:isoff-main:2011" mediaPresentationDuration="PT0M11.508982S" minBufferTime="PT6S" xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period start="PT0S" duration="PT2.017435S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/1b4966c2-26f2-403a-8808-67b4d5488c8c.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
  <Period start="PT2.017435S" duration="PT1.809909S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/f0d7b23b-878c-4be4-9832-6cf8539c6331.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
  <Period start="PT3.827344S" duration="PT1.842343S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/a17539ae-8064-42a0-bd10-7d04cd2b2886.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
  <Period start="PT5.669687S" duration="PT2.102876S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/dbb6292a-b0fd-4036-86fc-eb8f2f2bff6c.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
  <Period start="PT7.772563S" duration="PT1.952418S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/21190726-7468-4c5d-893c-fe30f642788d.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
  <Period start="PT9.724981S" duration="PT1.784001S">
    <AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" contentType="audio">
      <Representation audioSamplingRate="44100" id="1" bandwidth="32000">
        <BaseURL>https://rhhhloggermediastore.blob.core.windows.net/rh-logger-share/c662bb42-97ac-4eb4-b699-52a97932fd38.mp4</BaseURL>
        <SegmentBase timescale="1000000" presentationTimeOffset="21333" />
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
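
As a sanity check on the claim that the period start times are configured without a gap, the manifest timing can be verified with a few lines of Python. This is only a sketch: the `PERIODS` list is copied by hand from the manifest above, and `to_micros` and `period_gaps` are hypothetical helper names that handle just the minute/second duration forms appearing in this manifest, not the full xs:duration syntax.

```python
import re
from decimal import Decimal

# (start, duration) pairs copied from the Period elements of the manifest above.
PERIODS = [
    ("PT0S", "PT2.017435S"),
    ("PT2.017435S", "PT1.809909S"),
    ("PT3.827344S", "PT1.842343S"),
    ("PT5.669687S", "PT2.102876S"),
    ("PT7.772563S", "PT1.952418S"),
    ("PT9.724981S", "PT1.784001S"),
]
MEDIA_PRESENTATION_DURATION = "PT0M11.508982S"

# Only the "PT[nM]n[.n]S" form is supported; that is all this manifest uses.
_DURATION = re.compile(r"PT(?:(\d+)M)?(\d+(?:\.\d+)?)S")


def to_micros(value):
    """Convert a simple duration string (minutes and seconds only) to microseconds."""
    match = _DURATION.fullmatch(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value}")
    minutes = int(match.group(1) or 0)
    seconds = Decimal(match.group(2))  # Decimal avoids float rounding errors
    return int((minutes * 60 + seconds) * 1_000_000)


def period_gaps(periods):
    """Return next_start - (start + duration), in microseconds, for each adjacent pair."""
    gaps = []
    for (start, duration), (next_start, _) in zip(periods, periods[1:]):
        gaps.append(to_micros(next_start) - (to_micros(start) + to_micros(duration)))
    return gaps


gaps = period_gaps(PERIODS)
last_end = to_micros(PERIODS[-1][0]) + to_micros(PERIODS[-1][1])
print("gaps between periods (us):", gaps)
print("last period ends at mediaPresentationDuration:",
      last_end == to_micros(MEDIA_PRESENTATION_DURATION))
```

A zero gap here only shows that the manifest timeline is contiguous; it says nothing about whether the samples inside each segment line up with the period boundaries.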

Version of ExoPlayer being used

ExoPlayer version 2.9.0

Device(s) and version(s) of Android being used

Reproduces on Android emulator:
Nexus 5X, 5.2 1080x1920 xxhdpi
Android API 28 x86

A full bug report captured from the device

The bug report is attached.
bugreport.zip

@ojw28 ojw28 self-assigned this Oct 3, 2018
@ojw28
Contributor

ojw28 commented Oct 3, 2018

Thanks for the interesting manifest. Things we found:

  1. There's a bug in the buffered position reported by the player (during playback you may observe the buffering position show that all 5 periods are buffered, then incorrectly snap back to the period boundaries during playback). This is unrelated to what you're actually asking about, but we'll fix it :).
  2. We're not clipping the samples that end up with negative timestamps in this case, or the ones that end up extending beyond the period duration. We'll fix this too and it makes playback a bit better, but I can still hear some slight discontinuities.

A few observations about the manifest itself:

  1. It doesn't really matter (and we'll still handle this case), but the DASH specification discourages what you're doing: "Media Segments should not contain any presentation time that is smaller than the value of the @presentationTimeOffset". I believe the outcome of recent DASH-IF discussions was that this should be allowed, but specifically in the case where each representation is a segment consisting of multiple sub-segments (and a segment index that's accessible from the manifest).
  2. Trying to clip with sample accuracy via the manifest seems very error-prone. As an example of why, your sample uses different timescales in the manifest (1000000) and media (1000). In the media the second sample has timestamp 21000, but the clipping specified by the manifest clips at 21333. We end up clipping two samples as a result, rather than one.
  3. The durations of the periods in the manifest don't appear to correspond to clipping one sample from the end of each period.
  4. Even when I manually edit the manifest to clip exactly one sample from the start and end of each period, there is still a slight audible discontinuity. It's unclear whether the content has been prepared in a way that allows for true gapless playback.
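
The timescale mismatch described in observation 2 can be reproduced with simple integer arithmetic. The sketch below is not ExoPlayer's clipping code; `clipped_sample_count` is a hypothetical helper that treats a sample as clipped when its timestamp, after subtracting presentationTimeOffset, is negative. Only the first two timestamps (0 and 21 ms) come from the thread; the later timestamps and the microsecond-resolution comparison are assumptions added for illustration.

```python
def clipped_sample_count(sample_ticks, media_timescale, pto, manifest_timescale):
    """Count samples whose period-relative timestamp is negative.

    sample_ticks: sample timestamps in media timescale units.
    pto: presentationTimeOffset in manifest timescale units.
    """
    pto_us = pto * 1_000_000 // manifest_timescale
    count = 0
    for ticks in sample_ticks:
        timestamp_us = ticks * 1_000_000 // media_timescale
        if timestamp_us - pto_us < 0:
            count += 1
    return count


# Media timescale 1000: the second sample sits at 21 ms = 21000 us, which is
# still before the 21333 us offset. Timestamps after the second are assumed.
coarse = clipped_sample_count([0, 21, 42, 64], 1000, 21333, 1_000_000)

# Hypothetical media with microsecond timestamps: the second sample starts
# exactly at the offset, so only the first sample is clipped.
fine = clipped_sample_count([0, 21333, 42666, 64000], 1_000_000, 21333, 1_000_000)

print("clipped with media timescale 1000:", coarse)
print("clipped with media timescale 1000000:", fine)
```

The 333 microsecond rounding difference is enough to push the second sample onto the wrong side of the clip point, which is why sample-accurate trimming through the manifest is fragile.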

At a higher level, it's unclear what you're trying to achieve. There's no need to start a new period to accommodate different segment durations. Periods are typically for use when the content actually changes (e.g. transition from one song to another, or from content to an ad). I also don't understand why you're having to clip from the start and end of each segment. Why are they there in the first place? It's pretty common to have variable segment length AAC in DASH and I've never seen this being necessary before, so there's likely a shortcoming with how you're preparing the media.

So TLDR - We'll fix the issues identified at the top. It'll make things a bit better, but there will still be an audible discontinuity. I think the real fix is to prepare the content in a better way.

@ojw28 ojw28 added the bug label Oct 3, 2018
@ghexoplayerquestion
Author

Thank you ojw28 for looking at this and the detailed response. I will continue to look at the encoding to handle this better.

I was trimming the beginning and end of the segments to account for the encoder start-up delay (https://www2.iis.fraunhofer.de/AAC/gapless.html) but it looks like I was not calculating the trim values correctly. I'll also look at using the MPEG Edit List to trim the segments and then including the segments in a single Period with a SegmentList.

In addition (not shown in this repro), I am also using the presentationTimeOffset of the first segment and Period duration of the last segment / MPD mediaPresentationDuration to allow the manifest to trim the beginning and end of the entire presentation to bounds that fall inside of segments. Is that affected by the issue (2) you brought up, and if so is there a better way to handle this?

@ghexoplayerquestion
Author

Re: my last paragraph - I see that if I get my media presentation down to a single period, I can use ClippingMediaSource. I wasn't using ClippingMediaSource because I was using the multiple periods to trim the segments.

@ojw28
Contributor

ojw28 commented Oct 3, 2018

I still don't really understand what you're doing. Pretty much any DASH stream will segment AAC audio into multiple segments, and they won't ever put each segment in its own period or do anything special to deal with start-up delay and padding.

ojw28 pushed a commit that referenced this issue Oct 15, 2018
This makes the following changes to improve consistency among the PlaybackInfo
values:
 1. Update buffered position and total buffered duration after loading period
    is set as both values are affected by a loading period update.
 2. Add copyWithPosition to allow updating the position without resetting the
    loading period.
 3. Forward the total buffered duration to playing position updates
    as it may have changed with the new playing position.

Issue:#4899

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=215712328
ojw28 added a commit that referenced this issue Oct 15, 2018
- Always clip to period duration for the last chunk. We previously
  did this only when the last chunk explicitly exceeded the period
  end time. We now also do it when the chunk claims to end at the
  period boundary, but still contains samples that exceed it.
- If pendingResetPositionUs == chunk.startTimeUs == 0 but the
  chunk still contains samples with negative timestamps, we now
  clip them by setting the decode only flag. Previously we only
  clipped such samples if the first chunk explicitly preceded the
  start of the period.

Issue: #4899

-------------
Created by MOE: https://github.com/google/moe
MOE_MIGRATED_REVID=215763467
ojw28 pushed a commit that referenced this issue Oct 20, 2018
ojw28 added a commit that referenced this issue Oct 20, 2018
@ojw28 ojw28 closed this as completed Oct 20, 2018
@google google locked and limited conversation to collaborators May 16, 2019