Use mka as the fallback container for any other audio formats (including Enhanced AC-3) #98

Closed
CXwudi opened this issue Sep 15, 2024 · 4 comments · Fixed by #102

CXwudi commented Sep 15, 2024

When downloading videos from https://vocadb.net/L/15285, we encountered an audio format that we had not seen before.

【初音ミク】こころのキラリ【shishy】.zip

Here is the MediaInfo:

General
Unique ID                      : 163624331291903806959873331136622681307 (0x7B18E66195F88671BBFEA6E5040ADCDB)
Complete name                  : D:\coding-workspace\Vocaloid Coding POC\Project VD Run Env\2024年V家新曲-downloaded\【初音ミク】こころのキラリ【shishy】[661223]-pv.mkv
Format                         : Matroska
Format version                 : Version 4
File size                      : 16.9 MiB
Duration                       : 3 min 43 s
Overall bit rate               : 636 kb/s
Frame rate                     : 29.970 FPS
Writing application            : Lavf61.1.100
Writing library                : Lavf61.1.100
ErrorDetectionType             : Per level 1

Video
ID                             : 1
Format                         : VP9
Format profile                 : 0
Codec ID                       : V_VP9
Duration                       : 3 min 43 s
Bit rate                       : 240 kb/s
Width                          : 1 920 pixels
Height                         : 1 080 pixels
Display aspect ratio           : 16:9
Frame rate mode                : Constant
Frame rate                     : 29.970 (30000/1001) FPS
Color space                    : YUV
Chroma subsampling             : 4:2:0
Bit depth                      : 8 bits
Bits/(Pixel*Frame)             : 0.004
Stream size                    : 6.37 MiB (38%)
Language                       : English
Default                        : Yes
Forced                         : No
Color range                    : Limited
Color primaries                : BT.709
Transfer characteristics       : BT.709
Matrix coefficients            : BT.709

Audio
ID                             : 2
Format                         : E-AC-3
Format/Info                    : Enhanced AC-3
Commercial name                : Dolby Digital Plus
Codec ID                       : A_EAC3
Duration                       : 3 min 43 s
Bit rate mode                  : Constant
Bit rate                       : 384 kb/s
Channel(s)                     : 6 channels
Channel layout                 : L R C LFE Ls Rs
Sampling rate                  : 48.0 kHz
Frame rate                     : 31.250 FPS (1536 SPF)
Bit depth                      : 32 bits
Compression mode               : Lossy
Stream size                    : 10.2 MiB (60%)
Title                          : ISO Media file produced by Google Inc.
Language                       : English
Service kind                   : Complete Main
Default                        : Yes
Forced                         : No
VENDOR_ID                      : [0][0][0][0]
Dialog Normalization           : -9 dB
compr                          : 0.53 dB
dialnorm_Average               : -9 dB
dialnorm_Minimum               : -9 dB
dialnorm_Maximum               : -9 dB
CXwudi changed the title from "Extraction and Tagging for" to "Extraction and Tagging for Enhanced AC-3" Sep 15, 2024
CXwudi self-assigned this Sep 15, 2024

CXwudi commented Sep 16, 2024

PoC finished, here is the implementation route:

Extraction: ffmpeg -i .\【初音ミク】こころのキラリ【shishy】[661223]-pv.mkv -vn -acodec copy .\【初音ミク】こころのキラリ【shishy】.mka

Tagging: mkvpropedit.exe '.\【初音ミク】こころのキラリ【shishy】.mka' --tags all:tag.xml

The format of tag.xml is specified in:

https://www.matroska.org/technical/elements.html (see Tagging section)
https://www.matroska.org/technical/tagging.html
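
The extraction and tagging steps could be driven from Python roughly as follows. This is a minimal sketch, not the project's actual code: the function names `build_extract_cmd`, `build_tag_cmd` and `extract_and_tag` are hypothetical, and it assumes `ffmpeg` and `mkvpropedit` are available on PATH. Only the flags already shown in the two commands are used.

```python
import subprocess
from pathlib import Path


def build_extract_cmd(video: Path, audio_out: Path) -> list[str]:
    """Copy the audio stream into a Matroska audio (.mka) file without re-encoding."""
    return ["ffmpeg", "-i", str(video), "-vn", "-acodec", "copy", str(audio_out)]


def build_tag_cmd(audio: Path, tag_xml: Path) -> list[str]:
    """Apply the tags from tag_xml to the file, using the 'all' selector."""
    return ["mkvpropedit", str(audio), "--tags", f"all:{tag_xml}"]


def extract_and_tag(video: Path, tag_xml: Path) -> Path:
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    audio_out = video.with_suffix(".mka")
    subprocess.run(build_extract_cmd(video, audio_out), check=True)
    subprocess.run(build_tag_cmd(audio_out, tag_xml), check=True)
    return audio_out
```

Passing the arguments as a list rather than a shell string sidesteps quoting problems with file names that contain brackets or non-ASCII characters, such as the one in this issue.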

Here is a sample XML file from GPT-4o:

<?xml version="1.0" encoding="UTF-8"?>
<Tags>
  <!-- Tag for the whole file -->
  <Tag>
    <Targets>
      <TargetTypeValue>50</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>ENCODER</Name>
      <String>Lavf61.1.100</String>
    </Simple>
    <Simple>
      <Name>CUSTOM_TAG</Name>
      <String>Wudi</String>
    </Simple>
  </Tag>

  <!-- Tag for the artist and date recorded -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>ARTIST</Name>
      <String>some artist</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>DATE_RECORDED</Name>
      <String>2024</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
    <Simple>
      <Name>CUSTOM_TAG_2</Name>
      <String>Wudi 2</String>
    </Simple>
  </Tag>

  <!-- Tag for the title -->
  <Tag>
    <Targets>
      <TargetTypeValue>30</TargetTypeValue>
    </Targets>
    <Simple>
      <Name>TITLE</Name>
      <String>some title</String>
      <TagLanguage>und</TagLanguage>
      <DefaultLanguage>1</DefaultLanguage>
    </Simple>
  </Tag>

</Tags>
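
A tag file like the sample above could also be generated programmatically instead of written by hand. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the helper name `build_tags_xml` and its parameters are hypothetical, and it only emits the subset of elements seen in the sample (`Tags`, `Tag`, `Targets`, `TargetTypeValue`, `Simple`, `Name`, `String`).

```python
import xml.etree.ElementTree as ET


def build_tags_xml(track_tags: dict[str, str], target_type_value: int = 30) -> str:
    """Return a tag XML document with a single <Tag> holding the given name/value pairs."""
    tags = ET.Element("Tags")
    tag = ET.SubElement(tags, "Tag")
    targets = ET.SubElement(tag, "Targets")
    ET.SubElement(targets, "TargetTypeValue").text = str(target_type_value)
    for name, value in track_tags.items():
        simple = ET.SubElement(tag, "Simple")
        ET.SubElement(simple, "Name").text = name
        # ElementTree escapes characters such as & and < in the text for us
        ET.SubElement(simple, "String").text = value
    body = ET.tostring(tags, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body
```

The returned string can then be written to `tag.xml` with UTF-8 encoding before calling mkvpropedit.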

To add a cover image, an extra command is needed: mkvpropedit.exe '.\【初音ミク】こころのキラリ【shishy】.mka' --attachment-name "cover.webp" --attachment-mime-type "image/webp" --attachment-description "cover image" --add-attachment .\【初音ミク】こころのキラリ【shishy】[661223]-thumbnail.webp

Note that we need to detect the MIME type of the thumbnail; we can reuse the mediainfo we already have for this.
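
To illustrate the MIME type detection step, here is a minimal sketch that builds the attachment command. Note that it swaps in a simple magic-byte check instead of mediainfo, purely to keep the example self-contained; the names `sniff_image_mime` and `build_cover_cmd` are hypothetical, and only three image formats are recognized.

```python
from pathlib import Path
from typing import Optional

# Leading magic bytes of two common thumbnail formats
_SIGNATURES = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
]


def sniff_image_mime(header: bytes) -> Optional[str]:
    """Guess the MIME type from the first 12 bytes of an image file."""
    # WebP is a RIFF container whose form type at offset 8 is "WEBP"
    if len(header) >= 12 and header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    for magic, mime in _SIGNATURES:
        if header.startswith(magic):
            return mime
    return None


def build_cover_cmd(audio: Path, thumbnail: Path) -> list[str]:
    """Build the mkvpropedit command that attaches the thumbnail as a cover image."""
    with thumbnail.open("rb") as f:
        mime = sniff_image_mime(f.read(12))
    if mime is None:
        raise ValueError(f"unrecognized image format: {thumbnail}")
    return [
        "mkvpropedit", str(audio),
        "--attachment-name", "cover" + thumbnail.suffix,
        "--attachment-mime-type", mime,
        "--attachment-description", "cover image",
        "--add-attachment", str(thumbnail),
    ]
```

Checking the file content rather than the extension guards against thumbnails that were saved with a misleading suffix.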


CXwudi commented Sep 16, 2024

It looks like MKA can be a versatile container for any audio format. Hence, we can use MKA as the fallback for any otherwise unrecognized format.
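
The fallback idea can be sketched as a simple lookup with a default. The mapping below is purely illustrative: which formats the project already handles with a dedicated container is not stated in this issue, so the entries in `KNOWN_AUDIO_EXTENSIONS` and the function name `choose_audio_extension` are assumptions.

```python
# Hypothetical mapping from a MediaInfo "Format" value to a dedicated extension.
# The real set of recognized formats is project-specific.
KNOWN_AUDIO_EXTENSIONS = {
    "AAC": ".m4a",
    "Opus": ".opus",
}

# Matroska audio is used for anything not listed above
FALLBACK_EXTENSION = ".mka"


def choose_audio_extension(audio_format: str) -> str:
    """Return the dedicated extension for a known format, or .mka otherwise."""
    return KNOWN_AUDIO_EXTENSIONS.get(audio_format, FALLBACK_EXTENSION)
```

With this shape, a format such as E-AC-3 needs no special handling: it simply falls through to `.mka`.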

CXwudi changed the title from "Extraction and Tagging for Enhanced AC-3" to "Extraction and Tagging for Mka as the fallback for any audio format (including Enhanced AC-3)" Sep 16, 2024

CXwudi commented Sep 16, 2024

Matroska is not supported in mutagen (see quodlibet/mutagen#3), so there is no need to look for a workaround in Python.


CXwudi commented Sep 16, 2024

As a side note, E-AC-3 files can use the APEv2 tag format, which mutagen supports. However, it doesn't support cover images, and the format itself is not widely recognized, so this option is discarded.

CXwudi changed the title from "Extraction and Tagging for Mka as the fallback for any audio format (including Enhanced AC-3)" to "Use mka as the fallback container for any other audio formats (including Enhanced AC-3)" Sep 16, 2024
CXwudi linked a pull request Dec 16, 2024 that will close this issue