New DublinCore-based metadata implementation #1266

gregchapman-dev · 2022-04-07T22:30:10Z

A first cut so you can see what I'm up to, and I can get some idea from you folks if I'm headed in a good direction.
Fixes #1262.

gregchapman-dev · 2022-04-07T22:31:28Z

I have deleted this very obsolete description of my approach.

gregchapman-dev · 2022-04-10T00:58:29Z

This PR now has proposed modifications to xmlToM21.py and m21ToXml.py to use the new Metadata APIs. Anything unrecognized goes into the miscellaneous tag, with the key containing both namespace and key, so if a writer understands one of those namespaces, it can interpret that data with no loss.

gregchapman-dev · 2022-04-11T20:10:48Z

Another deleted, updated-but-still-obsolete description of my approach.

gregchapman-dev · 2022-04-18T23:40:01Z

No more class hierarchy, it was not working for me the way I wanted. class Metadata has all the old APIs, which are re-implemented, but fully backward compatible, and all the new APIs, which should be used instead of the old ones. The MusicXML parser and writer can run using either (chosen at run-time). I'll remove the old code when I have other good backward compatibility tests written.

gregchapman-dev · 2022-04-18T23:41:36Z

Next step: get the current set of tests (all backward compatibility-based) succeeding. Then write a bunch of new tests for the new APIs, and some new tests for the old APIs as well.

coveralls · 2022-04-19T22:52:09Z

Coverage increased (+0.007%) to 93.09% when pulling 3f1d5ea on gregchapman-dev:gregc/metadataDublinCore into 66cba42 on cuthbertLab:master.

gregchapman-dev · 2022-04-19T22:54:22Z

This is ready for review (I am working on writing a bunch of new tests to increase coverage significantly).

gregchapman-dev · 2022-04-22T22:01:04Z

This is now ready for review. I have attached a text file containing a description of the nature of the changes here
description.txt
.

music21/metadata/__init__.py

mscuthbert

Hi! I just did a once over and it looks really powerful and a good step in the right direction. My main concern is that I'd like the standard API to look more like Python and less like Java with getItem/setItem etc. Is there a way forward with that?

Some smaller changes asking for more diverse examples and attention to docs generation also.

This isn't a full review -- just what I could do on a long plane flight.

music21/metadata/__init__.py

music21/metadata/testMetadata.py

gregchapman-dev · 2022-05-14T20:30:45Z

That latest big push was just a rebase to latest master, and a bunch of "t." additions to my new code to match the new "import typing as t" thing.

mscuthbert · 2022-05-25T08:02:56Z

thanks for the amazing work on this -- I'm liking what I'm seeing and I'm getting more and more confident this will be a major feature in v.8

music21/metadata/__init__.py

mscuthbert · 2022-05-25T20:44:58Z

Looking a little more, I thought we were leaning more towards .get('composer') returning a single item (like .getFirst does) and .getAll() returning a tuple.

I'm still digging through the roots, but why can't we use __setattr__ and __getattr__ to allow what .get() and .set() do, and suggest using those only for items that aren't valid Python attributes such as name-spaced attributes. (and raise an AttributeError if trying to set something custom, with an error on set: AttributeError: use .setCustom() to add a custom attribute or .set() to add a namespace?

mscuthbert · 2022-05-25T20:47:09Z

The two most Pythonic ways to add or remove an attribute are md.librettist = ... or md['librettist'] = .... Maybe we can use both, the .librettist form for getting a string and the ['librettist'] form for getting/setting a tuple of objects?

gregchapman-dev · 2022-05-25T23:04:46Z

I switched away from get -> t.Any and getAll -> t.Tuple today (yesterday?), changing them to getFirst -> t.Any and get -> t.Tuple, in order to make it clear that clients should generally be prepared to deal with multiple values for any given item. getFirst should only be called if you are in a situation where only one can be used (e.g. exporting to musicxml where the “first” title goes in as the “one true title", and the rest go in miscellaneous a bit later; take a look at my version of m21ToXml.py to see how I do this). set() and add() have been modified very recently as well, to take either a single item or a list of items (setAll and especially addAll sounded weird to me). I’m attracted to the idea of using setattr/getattr and __getitem__ (or whatever it is) to make it possible to do things like: item = md.librettist items= md[‘librettist'] items = md[‘marcrel:LBT’] # that’s ‘librettist’ as well, in ’namespace:name’ form md.librettist = item md[‘librettist’] = items md[‘marcrel:LBT’] = items But it’s a little weird differentiating plural vs singular getters that way. Dict-style keying doesn’t feel obviously more plural. And how do you do an add? I could, on the other hand, have customized plural forms for each unique name (but not for the namespace:names, which could always return a tuple, I guess). I would just add a pluralUniqueName field to each PropertyDescription. And the customized plurals can get weird. ‘librettists' is pretty straightforward, but what’s the plural of ‘rights’? ‘rightses’? Maybe ‘rightsList’ or ‘multipleRights’, or something, as a fallback for difficult terms. You will notice that musicxml actually supports multiple rightses, and I have changed the musicxml converter to take advantage of that. And we still have to have an add() call, so the properties don’t feel great as a complete solution. As an aside, there are also compatibility issues with adding all the new metadata terms as properties. md.librettist (being an existing property, not a new one) currently returns str. And it would need to change to return Text (to match most of the new properties), which breaks backward compatibility. Is that OK? If so, I can remove a lot of weird code that tries very hard to keep all those old properties behaving exactly as before. Further thoughts? I’ve been mulling this issue for a while, and ended up just leaving the old properties as is, and not trying to add new ones (or plural new ones). It just got too complicated, so I’m currently stuck with “you really need to be using get/set/add, and occasionally getFirst”. I was even considering deprecating the old singular properties, since singular isn’t really what anyone should usually be doing. (But md.librettist sure does sound like it returns only one librettist.) For the record, I like the feel of get/set/add; there is other code in music21 that does this (e.g. for XML elements), and dict itself has a get API. Also for the record, I am motivated by a desire to make it obvious in the API that asking for a single librettist (or alternativeTitle, or whatever) is not generally a good idea. Greg

…

On May 25, 2022, at 1:47 PM, Michael Scott Asato Cuthbert ***@***.***> wrote: The two most Pythonic ways to add or remove an attribute are md.librettist = ... or md['librettist'] = .... Maybe we can use both, the .librettist form for getting a string and the ['librettist'] form for getting/setting a tuple of objects? — Reply to this email directly, view it on GitHub <#1266 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AR6X47F44WMYLBDSSE3D7GDVL2GVRANCNFSM5S2Y34GA>. You are receiving this because you authored the thread.

mscuthbert · 2022-05-26T09:03:39Z

I’m attracted to the idea of using setattr/getattr and getitem (or whatever it is) to make it possible to do things like: item = md.librettist items= md[‘librettist'] items = md[‘marcrel:LBT’] # that’s ‘librettist’ as well, in ’namespace:name’ form md.librettist = item md[‘librettist’] = items md[‘marcrel:LBT’] = items But it’s a little weird differentiating plural vs singular getters that way. Dict-style keying doesn’t feel obviously more plural.

It's not about plural vs. singular, it's about returning a single Python primitive (generally a string) with __getattr__, and returning the underlying representation (generally a tuple of music21.metadata.primatives.Text/Date, etc. objects) with __getitem__.

We could also do a little Django-style magic and let print(md.marcel__LBT) work, where __ substitutes for :. But that might be overkill.

What I would very much like is that most people can continue using metadata, even expanded metadata, as something like:

In [3]: corpus.parse('mozart').metadata.composer
Out[3]: 'W.A. Mozart'

but they might get:

In [3]: corpus.parse('mozart').metadata['composer']
Out[3]: (<music21.metadata.primitives.Text 'W.A. Mozart'>,)

in case they want to work with something deeper.

If there is more than one composer, I would be okay with something like:

In [3]: corpus.parse('mozart').metadata.composer
Out[3]: 'MULTIPLE'

to signify to look further?

gregchapman-dev · 2022-05-26T16:25:04Z

This is appealing. Let me think a bit on it.

gregchapman-dev · 2022-05-26T20:48:31Z

It's not about plural vs. singular, it's about returning a single Python primitive (generally a string) with getattr, and returning the underlying representation (generally a tuple of music21.metadata.primatives.Text/Date, etc. objects) with getitem.

I did not understand this at first, and I do now. I like this approach a lot.

We could also do a little Django-style magic and let print(md.marcel__LBT) work, where __ substitutes for :. But that might be overkill.

Yeah, I agree that's overkill.

In [3]: corpus.parse('mozart').metadata.composer
Out[3]: 'W.A. Mozart'

Yes.

In [3]: corpus.parse('mozart').metadata['composer']
Out[3]: (<music21.metadata.primitives.Text 'W.A. Mozart'>,)

Yes.

In [3]: corpus.parse('mozart').metadata.composer
Out[3]: 'MULTIPLE'
to signify to look further?

Yes.

I think we still need further discussion on similar shortcuts for set(). I think add() is the only way to add.

gregchapman-dev · 2022-05-26T21:18:04Z

Oh! Are you proposing that __getattr__ and __getitem__ would replace get and getFirst? I would actually be OK with that.

gregchapman-dev · 2022-05-26T23:47:18Z

Latest push is new implementation of __getattr__ and __getitem__.

gregchapman-dev · 2022-05-29T20:44:04Z

As of the latest push, get/getFirst/set are replaced by __getattr__, __getitem__, __setattr__, and __setitem__. This also means that you can get/set single items as strings with md.uniqueName (plus the already existing workId and workId abbreviation attributes. add() remains, as the only way to add more items of a particular name. Let me know what you think.

gregchapman-dev · 2022-05-30T18:12:26Z

Oh, and all the parsers have been updated to use add() as much as makes sense for that file format, and the writers are all dealing with multiples as much as makes sense for that file format.

mscuthbert · 2022-05-30T20:36:05Z

Oh! Are you proposing that __getattr__ and __getitem__ would replace get and getFirst? I would actually be OK with that.

I actually meant that __getattr__ and __getitem__ would replace getFirst and get (=getAll) so the dot-form gets the most simple, ready to go version, and the bracket-form gets a representation that someone needs to know about music21 metadata to work with. I'll look at the code soon to see if that's what was implemented.

gregchapman-dev · 2022-08-04T20:20:52Z

OK, that big push was the rebase to latest master. On to uniqueName keys in md._contents.

gregchapman-dev · 2022-08-04T21:13:44Z

The test failure is unrelated to my changes:
FAIL: _rawQuery (configure.SelectFromList)
Doctest: configure.SelectFromList._rawQuery

mscuthbert · 2022-08-04T21:20:41Z

ack -- the push of all previous commits has wiped out Github's memory of what has already been reviewed, what has changed since last review, etc. -- can you point me to the place you've made the most changes since yesterday so I can look at the diffs manually. I was already planning on doing a Squash Merge when it's ready, but definitely will now. :-)

mscuthbert

Figured out that all changes were in the last commit. Whew!

Looking good. I'm still trying to grok when custom attributes are included and when they're excluded in various routines; I think I've found a few places where they're not included. It'd be great to have custom attributes set in more of the tests to confirm that they're being searched/returned by various .all() etc. routines.

Besides the namespace name differences (TLAs), I'm confused about the difference between .all() and .getAllNamedValues() -- when should a user use the first and when the latter? If the namespace:TLA is the main difference, then the method name doesn't do a good job of distinguishing them (and couldn't that be an attribute on .all()?) THANKS!

mscuthbert · 2022-08-04T21:24:11Z

music21/metadata/primitives.py

@@ -1488,7 +1488,7 @@ def testDateSelection(self):


 ValueType = t.Union[DateSingle, DateRelative, DateBetween, DateSelection,
-                    Text, Contributor, Copyright]
+                    Text, Contributor, Copyright, int]


where is "int" included in this? It doesn't seem to make sense as part of this definition. In the few places where an int can be returned would it be better to type that as t.Union[ValueType, int] -- otherwise you won't be able to do x: ValueType = function(); print(x._data) because ints don't have a ._data attribute.

ValueType is "the types of the values that might be found in self._contents". I didn't realize it also needed to mean "types with _data". I found ValueType really useful (instead of using t.Any), and I'm not sure using t.Union[ValueType, int] everywhere will actually solve your problem (if function returns t.Union[ValueType, int], you still can't print(x._data). Perhaps we shouldn't use int in _contents at all, and have a new type with an int x._data field?

Oh, and int is in there because of fileNumber.

mscuthbert · 2022-08-04T21:26:15Z

music21/metadata/properties.py

+UNIQUE_NAME_TO_MUSIC21_WORK_ID: t.Dict[str, str] = {
+    x.uniqueName if x.uniqueName
+    else x.name:
+        x.oldMusic21WorkId if x.oldMusic21WorkId
+        else ''
+        for x in STANDARD_PROPERTY_DESCRIPTIONS
+        if x.oldMusic21WorkId
+}


please rewrite as a standard loop. I can't figure out what if goes to what. Only singly-nested generator functions appear (or should appear) in music21 code and this is triply nested.

Aw man, I was so proud of writing that crazy thing. I can't read it either, though, so yeah I'll redo those.

it was indeed remarkably impressive. the part that made me say, "nah, as amazing as this is, it's gotta go" was seeing else x.name: with a line break after it, and I was like, "wait a second, else can't take an expression after it, he's thinking of elif..." before I realized that the : was the separator of the key:value of the dictionary.

mscuthbert · 2022-08-04T21:27:33Z

music21/musicxml/xmlToM21.py

+        # Metadata.all to here, since it is clearly MusicXML-specific, and I don't
+        # want to corrupt the actual metadata in other code paths/converters. Perhaps


mscuthbert · 2022-08-04T21:28:06Z

music21/romanText/writeRoman.py

+                             f'Analyst: {self.analyst}',
+                             f'Proofreader: {self.proofreader}',


mscuthbert · 2022-08-04T21:32:32Z

music21/metadata/__init__.py

+                    if not self._isStandardUniqueName(uniqueName):
+                        # custom metadata, don't search it
+                        continue


will this be found earlier in the self. _getPluralAttribute(field) line? We would like to be able to search customMetadata if possible.

No, custom metadata is currently not searchable. If you want to search custom metadata there's a bunch of work needed in the search algorithm, because in some (most? all?) cases it will only report the first matching piece of metadata. So a custom piece of metadata that has "title" somewhere in its custom name will mess your expected results up badly. I'm not sure I want to bite that off just now. It would be a pretty big behavior change.

Ah. That can be done later, I suppose, since it's not a feature in the current system apparently (I thought it was, but just checked). I'll put that as a followup PR. Your code is so clear that I'm pretty sure I know how to do it pretty easily.

mscuthbert · 2022-08-04T21:42:45Z

music21/metadata/__init__.py

@@ -1809,23 +1768,6 @@ def _getPluralAttribute(self, attributeName: str) -> t.Tuple[str, ...]:
                properties.MUSIC21_ABBREVIATION_TO_NAMESPACE_NAME[attributeName]
            )

-        # The following are in searchAttributes, and getattr will find them because


this is a comment about the above lines (from prior commits). Couldn't this, faster and more legibly be:

if attributeName in self._contents: ...do something with self._contents[attributeName] try: attributeName = ...standardize attribute name ... do something with self._contents[attributeName] except (KeyError, AttributeError) as ae: raise AttributeError(f'invalid attributeName: {attributeName}')

that way we get customElement searching in here as well, and short circuit the expensive standardization process when the correctly spelled name exists and the metadata exists.

BTW -- What is the return result for a correctly spelled standard attributeName but without information/entry in md._contents? ('',), None, or AttributeError?

It's None for singular attributes, and an empty tuple for plural ones.

It's interesting to expose custom names as attributes, although it would only work for those custom names that are valid Python attribute names. For example, custom names might be things like 'myNamespace:blah', and that would never end up in an attribute name.

Also, I think the side-effect of adding (some) custom element searching is not something we want just yet (from a different discussion).

AHH!!!! This is for attributes not for getItem -- never mind. You're right, customElements shouldn't get a .customName interface.

music21/metadata/__init__.py

mscuthbert · 2022-08-04T21:46:06Z

music21/metadata/__init__.py

+        >>> rmd.all(skipContributors=True)
+        (('ambitus',
+            AmbitusShort(semitones=48, diatonic='P1', pitchLowest='C2', pitchHighest='C6')),
+         ('copyright', '© 2014, Creative Commons License (CC-BY)'),
+         ('dateCreated', '1689/--/-- or earlier'),
+         ('fileFormat', 'musicxml'),
+         ('filePath', '...corpus/corelli/opus3no1/1grave.xml'),
+         ('keySignatureFirst', -1),
+         ('keySignatures', [-1]),
+         ('localeOfComposition', 'Rome'),
+         ('movementName',
+         'Sonata da Chiesa, No. I (opus 3, no. 1)'),
+         ('noteCount', 259),
+         ('numberOfParts', 3),
+         ('pitchHighest', 'C6'),
+         ('pitchLowest', 'C2'),


all this doesn't need to be repeated just to show one thing has changed. use ... to skip identical lines.

music21/metadata/__init__.py

…t namespaceName (or custom name).

gregchapman-dev · 2022-08-04T23:22:59Z

I'm going to pause to let you finish reviewing everything including my (just pushed) change to make self._contents keyed by uniqueName instead of namespaceName. That simplified things quite a bit.

mscuthbert · 2022-08-04T23:28:01Z

I'm going to pause to let you finish reviewing everything including my (just pushed) change to make self._contents keyed by uniqueName instead of namespaceName. That simplified things quite a bit.

Done -- it's AWESOME!

Is the method to translate a uniqueName to namespaceName (and vice-versa) public or private? I suspect that people will want to do this, depending on their needs. (and if we eventually get in an attribute to .write(namespaceMetadata=True), we'll want it to be public)

mscuthbert · 2022-08-04T23:28:54Z

failing test is my fault -- will fix on master and then rerun.

gregchapman-dev · 2022-08-04T23:35:15Z

Metadata.uniqueNameToNamespaceName and Metadata.namespaceNameToUniqueName are both public.

mscuthbert · 2022-08-04T23:42:16Z

Greg -- after the force push, Github seems to be treating pr/1266 and gregchapman-dev:gregc/metadataDublinCore as two different branches, so the tests are failing right now against master because of a boneheaded direct push I did yesterday. When you make your next edits, can you make sure that you can merge master into this first so that we can see if the tests all run. Thanks!

…le fix.

gregchapman-dev · 2022-08-04T23:54:11Z

Until (very very) recently, all returned uniqueNames and getAllNamedValues returned namespaceNames. That is no longer the case, they both return uniqueNames. There may be other slight differences (I was trying to keep all somewhat backward-compatible). I'll take a look to see if I can just have one (probably called all) and make it do one thing.

gregchapman-dev · 2022-08-05T00:15:07Z

I take that back about all. There is still a big remaining difference. all returns all the values as str (except for the RichMetadata stuff, which can be t.Any). getAllNamedValues returns the values as the full ValueType (i.e. Text, Contributor, etc).

So I think we just need a new name for getAllNamedValues. Any suggestions?

mscuthbert · 2022-08-05T00:38:53Z

I take that back about all. There is still a big remaining difference. all returns all the values as str (except for the RichMetadata stuff, which can be t.Any). getAllNamedValues returns the values as the full ValueType (i.e. Text, Contributor, etc).

So I think we just need a new name for getAllNamedValues. Any suggestions?

allPrimitives() ?

though would it be hard to make returnPrimitives=True be a keyword option to all()? Otherwise we'll eventually need 4 methods if we also have namespaceNames=True as options.

If you go this route, we'll probably need an @overload operator on the typing (see, for instance, Chord.closedPosition() which gives a different return value based on inPlace=True or False) -- but for now the typing can be t.Dict[str, t.Union[str, ValueType]]

gregchapman-dev · 2022-08-05T00:46:17Z

Typing is hard here, because RichMetadata overrides this, and returns some randomly typed stuff. And you can't have the RichMetadata version typed differently than the Metadata version.

gregchapman-dev · 2022-08-05T00:48:03Z

I like the @overload thing though; I didn't know about that. I'll see what I can do.

mscuthbert · 2022-08-05T00:51:49Z

Typing is hard here, because RichMetadata overrides this, and returns some randomly typed stuff. And you can't have the RichMetadata version typed differently than the Metadata version.

Yeah, I think that I'm going to make RichMetadata also return primitives of some sort in a future PR, hopefully before v8, so that everything in metadata is some sort of primitive.

I'll probably also make them all inherit from a base class like MetadataPrimitive or something like that. When music21 first started, additional class hierarchies really slowed down Python -- deciding to put in GeneralNote and NotRest between Music21Object and Note was really agonizing because of the speed difference. Now it's basically negligible:

In [1]: class A:
   ...:     pass

In [2]: class B(A):
   ...:     pass

In [3]: %timeit A()
75.9 ns ± 2.9 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [4]: %timeit B()
78.3 ns ± 2.59 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

less negligible if there's an init and super() call though, but still in the ns range:

In [1]: class A:
   ...:     def __init__(self):
   ...:         pass
 
In [2]: class B(A):
   ...:     def __init__(self):
   ...:         super().__init__()
 
In [3]: %timeit A()
144 ns ± 2.96 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)

In [4]: %timeit B()
415 ns ± 10.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

1. Delete some duplicative test output. 2. Use explicit loops to initialize dictionaries in properties.py. 3. Remove getAllNamedValues() and getAllContributorNamedValues(), replacing them with new options on all(): skipNonContributors, returnPrimitives, and returnSorted. 4. Tweak titleSummary return code in bestTitle. 5. (not from review) Remove dead code left over from a bad merge, in hopes it will improve coverage enough.

gregchapman-dev · 2022-08-05T20:01:15Z

I think this last push addresses everything.

gregchapman-dev · 2022-08-05T20:06:56Z

If you want to change skipNonContributors to onlyContributors, I'll do it. I couldn't pick between the two, so I flipped a coin. skipNonContributors makes sense next to skipContributors when you're reading the docstring, but when you call all() for contributors, saying skipNonContributors=True makes you scratch your head every time.

gregchapman-dev · 2022-08-05T20:11:41Z

Oh, and I left the difficult type-hinting on all() to you. I couldn't figure out how to do it. For now the values returned are t.Any instead of t.Union[str, whatever RichMetadata returns] (if returnPrimitives is False) or t.Union[ValueType, whatever RichMetadata returns] (if returnPrimitives is True).

gregchapman-dev · 2022-08-05T20:31:46Z

A thought about namespace names in MusicXML <miscellaneous>: Perhaps in the MusicXML writer I should add a single <miscellaneous-field> for each namespace that we used in the section, as follows:

<miscellaneous-field name=dcterms>https://purl.org/dc/terms/<\miscellaneous-field>
<miscellaneous-field name=marcrel>https://id.loc.gov/vocabulary/relators/<\miscellaneous-field>
<miscellaneous-field name=humdrum>https://www.humdrum.org/reference-records/#<\miscellaneous-field>

That way we are documenting what the namespaces mean, which makes the TLAs less horrible. You can look them up. And it's both human-helpful and other-people's-software-helpful.

@ahankinson what do you think?

ahankinson · 2022-08-05T21:21:55Z

The "correct" XML way to do it is to use the xmlns attribute:

<root-element xmlns:marcrel="https://id.loc.gov/vocabulary/relators/">
    ...
    <miscellaneous-field>... </miscellaneous-field>
</root-element>

Any child element of an element with an xmlns declaration will inherit the declaration.

There are only two issues with this:

I don't know if all elements within MusicXML accepts the xmlns:... attribute. You probably want to hand-write one in, and then verify it with the MusicXML schema to see whether it passes. Typically you might declare all used namespaces in the document on the document root element.
XML Namespaces are technically only for elements and attributes; they don't really apply for attribute values. So you don't get the automatic namespace expansion facilities in XML processors. But it probably doesn't hurt anything either.

Here are a few references:

https://docs.microsoft.com/en-us/previous-versions/windows/desktop/ms754539(v=vs.85)

https://www.w3.org/TR/xml-names/

gregchapman-dev · 2022-08-05T21:51:03Z

@ahankinson I believe what you are saying is that I would put three xmlns declarations in the <miscellaneous> tag (which is the root element for all the <miscellaneous-field> tags). You're saying that this is the standard way to allow elements and attributes (but not attribute values) to use those three prefixes, and various XML tools will know about that. It will be obvious to humans (but the tools will not really know) what it means when we use those prefixes in attribute values (e.g. name=marcrel:XXX). And software that parsed the MusicXML could know about it, too, but by interpreting that standard info in a slightly non-standard way (applying the xmlns info to the attribute values).

If that's what you meant, I think I should make the MusicXML writer do it. (But I'll try it by hand first, to make sure the MusicXML schema is OK with it.)

@mscuthbert do you agree?

mscuthbert · 2022-08-05T21:57:43Z

If you want to change skipNonContributors to onlyContributors, I'll do it. I couldn't pick between the two, so I flipped a coin. skipNonContributors makes sense next to skipContributors when you're reading the docstring, but when you call all() for contributors, saying skipNonContributors=True makes you scratch your head every time.

yeah, I think I'll flip it to 'onlyContributors' but after this PR is merged.

mscuthbert · 2022-08-05T22:01:27Z

@ahankinson I believe what you are saying is that I would put three xmlns declarations in the <miscellaneous> tag (which is the root element for all the <miscellaneous-field> tags). You're saying that this is the standard way to allow elements and attributes (but not attribute values) to use those three prefixes, and various XML tools will know about that. It will be obvious to humans (but the tools will not really know) what it means when we use those prefixes in attribute values (e.g. name=marcrel:XXX). And software that parsed the MusicXML could know about it, too, but by interpreting that standard info in a slightly non-standard way (applying the xmlns info to the attribute values).

If that's what you meant, I think I should make the MusicXML writer do it. (But I'll try it by hand first, to make sure the MusicXML schema is OK with it.)

@mscuthbert do you agree?

When I've tried to put other namespaces into MusicXML files in the past, I ran into major problems with external readers and with Python parsing as well. But it was a long time ago, maybe things have gotten better. In any case, that can be a following project/PR.

I'm just running some speed tests and if they pass, will merge.

mscuthbert · 2022-08-05T22:09:41Z

Speed tests:

>>> %timeit corpus.manager._metadataBundles.clear(); corpus.manager.cacheMetadataBundleFromDisk(corpus.corpora.CoreC
   ...: orpus())
1.33 s ± 18.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# master:  1.89s; previous push: 1.62s ✅

>>> md = corpus.parse('luca').metadata
>>> %timeit md.search('credo')
16.1 µs ± 297 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
# master: 48µs; previous push: 307µs ✅

>>> %timeit corpus.corpora.CoreCorpus().search('458')
571 ms ± 6.06 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# master: 953ms; prev-push: 7600ms  ✅

Actual performance might be even better, since I ran the prior tests plugged in but I'm on battery now.

Looks good to merge.

Thank you for the huge work!

mscuthbert · 2022-08-05T22:12:45Z

I'll open up an issue where we can continue to have conversations about some of the important things that Andrew brings up, and look at smoothing out some rough edges before we need to commit to all aspects of the public API

gregchapman-dev marked this pull request as draft April 7, 2022 22:30

gregchapman-dev mentioned this pull request Apr 7, 2022

Metadata rework #1262

Closed

gregchapman-dev marked this pull request as ready for review April 19, 2022 22:54

gregchapman-dev force-pushed the gregc/metadataDublinCore branch from 9d50355 to f660326 Compare April 20, 2022 20:14

mscuthbert reviewed May 12, 2022

View reviewed changes

music21/metadata/__init__.py Outdated Show resolved Hide resolved

mscuthbert requested changes May 13, 2022

View reviewed changes

gregchapman-dev force-pushed the gregc/metadataDublinCore branch from 894ec21 to 881a587 Compare May 14, 2022 20:29

gregchapman-dev force-pushed the gregc/metadataDublinCore branch from a8b2132 to 46177eb Compare May 23, 2022 00:10

mscuthbert reviewed May 25, 2022

View reviewed changes

music21/metadata/__init__.py Show resolved Hide resolved

mscuthbert requested changes Aug 4, 2022

View reviewed changes

Metadata's _contents are now keyed by uniqueName (or custom name), no…

ed4bd2a

…t namespaceName (or custom name).

Merge branch 'master' into gregc/metadataDublinCore to pick up a sing…

76bda31

…le fix.

mscuthbert merged commit 4d48584 into cuthbertLab:master Aug 5, 2022

ahankinson mentioned this pull request Aug 8, 2022

Rough edges/add-ons to Metadata rewrite #1355

Open

		# Metadata.all to here, since it is clearly MusicXML-specific, and I don't
		# want to corrupt the actual metadata in other code paths/converters. Perhaps

		f'Analyst: {self.analyst}',
		f'Proofreader: {self.proofreader}',

New DublinCore-based metadata implementation #1266

New DublinCore-based metadata implementation #1266

Conversation

gregchapman-dev commented Apr 7, 2022 • edited Loading

gregchapman-dev commented Apr 7, 2022 • edited Loading

gregchapman-dev commented Apr 10, 2022

gregchapman-dev commented Apr 11, 2022 • edited Loading

gregchapman-dev commented Apr 18, 2022

gregchapman-dev commented Apr 18, 2022

coveralls commented Apr 19, 2022 • edited Loading

gregchapman-dev commented Apr 19, 2022

gregchapman-dev commented Apr 22, 2022

mscuthbert left a comment

Choose a reason for hiding this comment

gregchapman-dev commented May 14, 2022

mscuthbert commented May 25, 2022

mscuthbert commented May 25, 2022

mscuthbert commented May 25, 2022

gregchapman-dev commented May 25, 2022 via email

mscuthbert commented May 26, 2022

gregchapman-dev commented May 26, 2022

gregchapman-dev commented May 26, 2022

gregchapman-dev commented May 26, 2022

gregchapman-dev commented May 26, 2022

gregchapman-dev commented May 29, 2022 • edited Loading

gregchapman-dev commented May 30, 2022

mscuthbert commented May 30, 2022 • edited Loading

gregchapman-dev commented Aug 4, 2022

gregchapman-dev commented Aug 4, 2022

mscuthbert commented Aug 4, 2022

mscuthbert left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mscuthbert Aug 5, 2022 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

gregchapman-dev commented Aug 4, 2022

mscuthbert commented Aug 4, 2022

mscuthbert commented Aug 4, 2022

gregchapman-dev commented Aug 4, 2022

mscuthbert commented Aug 4, 2022

gregchapman-dev commented Aug 4, 2022

gregchapman-dev commented Aug 5, 2022

mscuthbert commented Aug 5, 2022 • edited Loading

gregchapman-dev commented Aug 5, 2022

gregchapman-dev commented Aug 5, 2022

mscuthbert commented Aug 5, 2022 • edited Loading

gregchapman-dev commented Aug 5, 2022

gregchapman-dev commented Aug 5, 2022 • edited Loading

gregchapman-dev commented Aug 5, 2022 • edited Loading

gregchapman-dev commented Aug 5, 2022 • edited Loading

ahankinson commented Aug 5, 2022

gregchapman-dev commented Aug 5, 2022

mscuthbert commented Aug 5, 2022

mscuthbert commented Aug 5, 2022

mscuthbert commented Aug 5, 2022

mscuthbert commented Aug 5, 2022

gregchapman-dev commented Apr 7, 2022 •

edited

Loading

gregchapman-dev commented Apr 7, 2022 •

edited

Loading

gregchapman-dev commented Apr 11, 2022 •

edited

Loading

coveralls commented Apr 19, 2022 •

edited

Loading

gregchapman-dev commented May 29, 2022 •

edited

Loading

mscuthbert commented May 30, 2022 •

edited

Loading

mscuthbert Aug 5, 2022 •

edited

Loading

mscuthbert commented Aug 5, 2022 •

edited

Loading

mscuthbert commented Aug 5, 2022 •

edited

Loading

gregchapman-dev commented Aug 5, 2022 •

edited

Loading

gregchapman-dev commented Aug 5, 2022 •

edited

Loading

gregchapman-dev commented Aug 5, 2022 •

edited

Loading