Matroska tags #3

Open
lazka opened this issue Jul 4, 2014 · 14 comments

Comments

@lazka
Member

lazka commented Jul 4, 2014

Originally reported by: Christoph Reiter (Bitbucket: lazka, GitHub: lazka)


From steven.strobe.cc@gmail.com on June 16, 2009 08:28:39

This one seems... unlikely. From https://code.google.com/p/quodlibet/issues/detail?id=167 :

Ex Falso currently cannot edit mka tags. The ability to do so would be a
useful addition.

Original issue: http://code.google.com/p/mutagen/issues/detail?id=3


@lazka
Member Author

lazka commented Jul 4, 2014

Original comment by Christoph Reiter (Bitbucket: lazka, GitHub: lazka):


From Micah.Wa...@gmail.com on January 18, 2013 00:31:15

I agree.  Matroska files are far more versatile and widely supported than flac or ogg in some applications (like managing both video and audio libraries with a variety of codecs, and expecting them to play on a commercial device).  Because of its potential influence on the world's use of open/free technology, I would place supporting Matroska meta-tags for mutagen and Ex-Falso above providing full mp4 container support.

@lazka
Member Author

lazka commented Sep 25, 2014

Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):


I've begun work on this.

My plan is as follows:

  1. Create a robust EBML parser, then tune it for performance.
  2. Create a separate Matroska-specific parser, able to read the tags stored within the Matroska EBML container.
  3. Create a dict-like metadata object, using native strings (utf8) as keys, and allowing bytes and unicode to be set as values. Byte data will be interpreted as the Matroska "binary" type, while unicode data will be converted to utf8 and stored.
  4. Write tests as I go along, and fill in any gaps at the end.
  5. Possibly implement support for WebM, since it is derived from Matroska.
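
For step 1, the core primitive of any EBML parser is the variable-length integer used for both element IDs and element sizes. The following is a minimal sketch of that primitive, not the code from the "matroska" branch; the function name `read_vint` and the `keep_marker` parameter are made up for illustration.

```python
import io


def read_vint(stream, keep_marker=False):
    """Read one EBML variable-length integer from a binary stream.

    Returns (value, length_in_bytes). The number of leading zero bits in the
    first byte, plus one, gives the total length. Element IDs are
    conventionally written with the length-marker bit kept
    (keep_marker=True); element sizes have it masked off.
    """
    first = stream.read(1)
    if not first:
        raise EOFError("end of stream")
    b0 = first[0]
    if b0 == 0:
        raise ValueError("vints longer than 8 bytes are not supported")
    length = 9 - b0.bit_length()  # leading zero bits + 1
    rest = stream.read(length - 1)
    if len(rest) != length - 1:
        raise EOFError("truncated vint")
    value = int.from_bytes(first + rest, "big")
    if not keep_marker:
        # Drop the marker bit; 7 data bits remain per byte.
        value &= (1 << (7 * length)) - 1
    return value, length


# The four bytes that open every EBML file decode to the EBML header ID.
print(read_vint(io.BytesIO(b"\x1a\x45\xdf\xa3"), keep_marker=True))  # (440786851, 4)
```

A size whose data bits are all ones is reserved to mean "unknown size", so a caller has to check for that value before using the size as a byte count.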

Useful documents:

@lazka
Member Author

lazka commented Sep 25, 2014

Original comment by Ben Ockmore (Bitbucket: LordSputnik, GitHub: LordSputnik):


I've created a branch called "matroska" for steps 1-4, so that code can be reviewed and shared without polluting the default branch.

@lazka
Member Author

lazka commented Nov 25, 2014

Original comment by Freso Fenderson (Bitbucket: Freso, GitHub: Freso):


Remember to also do docs/api/matroska.rst or something like that.

@lazka
Member Author

lazka commented May 9, 2016

Here is some code from the exaile project: https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py

@Moilleadoir

Still on the list somewhere?

@lazka
Member Author

lazka commented Jan 13, 2018

yes

@phw
Collaborator

phw commented Aug 21, 2018

Might be interesting: https://github.com/QBobWatson/python-ebml . It's GPLv3, though.

@Freso

Freso commented Feb 18, 2019

I'm starting to need WebM manipulation. Is there any way I'd be able to speed this along?

@lud4ik

lud4ik commented Oct 16, 2019

https://github.com/exaile/exaile/blob/master/xl/metadata/_matroska.py
doesn't work

72057594037927935 1
Traceback (most recent call last):
  File "test.py", line 164, in parse
    key, type_ = self.tags[id]
KeyError: 524531317

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 287, in <module>
    parse('/home/lud4ik/work/chats/audio/1348379-1570779757.webm')
  File "test.py", line 282, in parse
    return Ebml(location, MatroskaTags).parse()
  File "test.py", line 186, in parse
    value = self.parse(tell, tell + size)
  File "test.py", line 166, in parse
    self.seek(size, 1)
  File "test.py", line 57, in seek
    self.file.seek(offset, mode)
OSError: [Errno 22] Invalid argument
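
The two numbers in that output are worth decoding. The check below is a small sketch that assumes the first printed value is the decoded element size and that the KeyError value is the element ID; `is_unknown_size` is a hypothetical helper, not part of the exaile code.

```python
CLUSTER_ID = 0x1F43B675  # Matroska Cluster element ID


def is_unknown_size(value, length):
    """True if a decoded EBML size has all of its data bits set.

    That bit pattern is reserved in EBML to mean "size unknown", which is
    common in live-streamed WebM. `length` is the encoded width in bytes.
    """
    return value == (1 << (7 * length)) - 1


element_id = 524531317    # from the KeyError
size = 72057594037927935  # first number printed before the traceback

print(hex(element_id))            # 0x1f43b675
print(element_id == CLUSTER_ID)   # True
print(is_unknown_size(size, 8))   # True
```

If those assumptions hold, the parser reached a Cluster with an unknown size and tried to skip it with a relative seek of 2**56 - 1 bytes, which is probably what the operating system rejected. A parser that wants to cope with such files has to treat an unknown size as "runs to the end of the parent element" instead of seeking past it.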

@lud4ik

lud4ik commented Oct 16, 2019

@lud4ik

lud4ik commented Nov 12, 2019

What is the proper stopping condition if I want to parse only the header with the metadata, not the blocks of actual data (audio)? The "Cluster" element contains the data, so must I read everything before it and stop when I find it?
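
One way to approach this is to walk the direct children of the Segment and stop at the first Cluster ID. The sketch below assumes a seekable file object positioned at the start of the file; all function names are made up for illustration and the code comes from no existing library.

```python
import io

SEGMENT_ID = 0x18538067
CLUSTER_ID = 0x1F43B675
TAGS_ID = 0x1254C367


def _read_vint(f, keep_marker=False):
    """Decode one EBML variable-length integer; returns (value, length)."""
    first = f.read(1)
    if not first or first[0] == 0:
        raise ValueError("bad or missing vint")
    length = 9 - first[0].bit_length()
    raw = first + f.read(length - 1)
    if len(raw) != length:
        raise ValueError("truncated vint")
    value = int.from_bytes(raw, "big")
    if not keep_marker:
        value &= (1 << (7 * length)) - 1
    return value, length


def iter_children_before_cluster(f):
    """Yield (element_id, payload_offset, size) for each direct child of the
    Segment, stopping at the first Cluster or at end of file."""
    _read_vint(f, keep_marker=True)        # EBML header ID
    header_size, _ = _read_vint(f)
    f.seek(header_size, io.SEEK_CUR)       # skip the EBML header payload
    segment_id, _ = _read_vint(f, keep_marker=True)
    if segment_id != SEGMENT_ID:
        raise ValueError("expected a Segment element")
    _read_vint(f)                          # Segment size; may be "unknown", so it is not used
    while True:
        if not f.read(1):                  # end of file
            return
        f.seek(-1, io.SEEK_CUR)
        element_id, _ = _read_vint(f, keep_marker=True)
        if element_id == CLUSTER_ID:
            return                         # audio/video data starts here
        size, size_len = _read_vint(f)
        if size == (1 << (7 * size_len)) - 1:
            return                         # unknown size: cannot skip safely
        payload_offset = f.tell()
        yield element_id, payload_offset, size
        f.seek(payload_offset + size)      # absolute seek, so the caller may move the file position
```

A caller would look for `TAGS_ID` among the yielded IDs and parse that payload. One caveat: Matroska allows Tags to be written after the Clusters, in which case the SeekHead element (when present) records where they are, so stopping at the first Cluster only finds tags that were written up front.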

@ffe4

ffe4 commented May 27, 2021

The last commit to the matroska branch was in 2014. Anyone know the state of the implementation by @LordSputnik, and whether there were major challenges, or if it is even still compatible with how mutagen works today?

In the related ticket for Picard (link) there has been some discussion about whether to tag on the container level or the stream level. As I understand the docs, in the case of mp4 there can only be container-level tags, and only the first track is considered. Would container-level tagging also be sufficient for Matroska?
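
For context on that question: in Matroska each Tag element carries a Targets element that says what the tag applies to. A Tag whose Targets names no track applies to the whole Segment, while a TagTrackUID restricts it to one track. The sketch below models only that distinction; the `Tag` class and both helper functions are hypothetical, and the other target kinds (editions, chapters, attachments) are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """Simplified model of one Matroska Tag element (illustrative only)."""
    simple_tags: dict                  # TagName -> TagString
    target_type_value: int = 50        # default TargetTypeValue in the spec
    track_uids: list = field(default_factory=list)  # TagTrackUID values


def container_level(tags):
    """Tags that name no specific track and so describe the whole Segment.

    A TagTrackUID of 0 means "all tracks", so it is treated as container level.
    """
    return [t for t in tags if not any(uid != 0 for uid in t.track_uids)]


def for_track(tags, uid):
    """Tags that explicitly target the track with the given UID."""
    return [t for t in tags if uid in t.track_uids]


tags = [
    Tag({"TITLE": "Some Album"}),
    Tag({"TITLE": "Some Song"}, target_type_value=30, track_uids=[123]),
]
print(container_level(tags)[0].simple_tags)  # {'TITLE': 'Some Album'}
print(for_track(tags, 123)[0].simple_tags)   # {'TITLE': 'Some Song'}
```

A container-only API would expose just what `container_level` returns. Whether that is enough depends on whether files that carry per-track tags need to round-trip without losing them.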

@LordSputnik
Contributor

This was quite some time ago, but from what I remember the parsing was trickier than I expected. Sorry I can't be more helpful!

I don't think there would be much lost if somebody were to pick this up and start from scratch.
