[WIP] [fix #702] Validate content of <dc:language> #807
Hej Daniel, thanks for your contribution! I'll try to review it on the weekend!
@kalaspuffar Daniel, is it this statement that makes the pull request a work in progress at the moment?
If so: @murata0204 and @mattgarrish, as you were involved in the discussion of this topic in the original issue, could you perhaps answer Daniel's question?
It seems to me that the proposed OPF_085 would raise false positives, in some rare but legit cases.
@@ -282,6 +282,8 @@ public Message getMessage(MessageId id)
map.put(MessageId.OPF_082, Severity.ERROR);
map.put(MessageId.OPF_083, Severity.ERROR);
map.put(MessageId.OPF_084, Severity.ERROR);
map.put(MessageId.OPF_085, Severity.WARNING);
I don't think EpubCheck can issue a WARNING for language tag variants, as there are legitimate uses for the variant tag. See for instance W3C’s explainer on the language tags.
At best this can be a HINT, but then I'm not quite sure it's worth implementing if it's just informative?
{
Locale l = Locale.forLanguageTag(metadata.getValue());
if (l == null || l.getLanguage().length() > 3) {
report.message(MessageId.OPF_086, EPUBLocation.create(path), metadata.getValue());
OK.
A more thorough validation would be to use IANA’s language tag registry, but this is probably overkill?
(Branch force-pushed from 8a39f0d to 2803029.)
Hi @rdeltour, I've changed the warning for variants to info. You mention that this check should already be done in the schema. I didn't see any such check, and my implementation only uses Locale.forLanguageTag from Java 1.7 to validate that the value has a valid format. If a variant is defined, it will inform you about that too. Maybe this could be done more easily, or more correctly, by a validation rule in the schema?
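For reference, here is a small sketch of how `Locale.forLanguageTag` treats a few inputs. It is not code from this PR; the class name `ForLanguageTagDemo` and the helper `describe` are made up for illustration. It suggests that the method does not signal ill-formed input by returning null: parsing stops at the first ill-formed subtag, so a value such as `en_US` yields an empty language rather than a failure.

```java
import java.util.Locale;

// Hypothetical demo class, not part of EpubCheck.
class ForLanguageTagDemo {

    // Summarises the language and variant that forLanguageTag extracts.
    static String describe(String tag) {
        Locale l = Locale.forLanguageTag(tag);
        return "language=" + l.getLanguage() + " variant=" + l.getVariant();
    }

    public static void main(String[] args) {
        // Well-formed tag with a variant subtag.
        System.out.println(describe("de-1996"));
        // Well-formed by the grammar (2 to 8 letters), but longer than 3 letters.
        System.out.println(describe("english"));
        // Ill-formed (underscore): nothing is parsed, so the language is empty.
        System.out.println(describe("en_US"));
    }
}
```

With this behaviour, a `l == null` test would not catch ill-formed values, and an empty language also passes a `length() > 3` test.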
Closing this PR for now. A more complete check (based on the IANA registry) may be implemented in the future, depending on the outcome of the spec discussion at w3c/epub-specs#1509
This implementation is an attempt to fix #702.
The current fix generates an error if the language prefix (the primary language subtag) is longer than 3 characters.
The specification allows registered language subtags of 5 to 8 letters and reserves 4-letter codes for future use, but to my knowledge none of these are currently in use.
If this test passes, it then checks whether the language tag has a variant specified and, if so, generates a warning informing the user of this discrepancy.
I hope this fix will start a discussion towards a complete solution.
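The two-step check described above could be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: the class `LanguageTagCheck`, the `Result` enum, and the decision to treat an empty language as ill-formed are all hypothetical. It swaps `Locale.forLanguageTag` for `Locale.Builder.setLanguageTag`, which throws `IllformedLocaleException` on ill-formed input, so syntax errors are reported rather than silently dropped.

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

// Hypothetical sketch, not part of EpubCheck.
class LanguageTagCheck {

    enum Result { OK, ILL_FORMED, LONG_PRIMARY_SUBTAG, HAS_VARIANT }

    static Result check(String tag) {
        Locale locale;
        try {
            // Unlike forLanguageTag, the builder rejects ill-formed tags.
            locale = new Locale.Builder().setLanguageTag(tag.trim()).build();
        } catch (IllformedLocaleException e) {
            return Result.ILL_FORMED;
        }
        String language = locale.getLanguage();
        // Assumption: a value without a primary language (e.g. blank input)
        // is reported the same way as an ill-formed one.
        if (language.isEmpty()) {
            return Result.ILL_FORMED;
        }
        // Step 1: primary language subtag longer than 3 letters.
        if (language.length() > 3) {
            return Result.LONG_PRIMARY_SUBTAG;
        }
        // Step 2: a variant subtag is present (informational only).
        if (!locale.getVariant().isEmpty()) {
            return Result.HAS_VARIANT;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        for (String tag : new String[] {"en-US", "de-1996", "english", "en_US"}) {
            System.out.println(tag + " -> " + check(tag));
        }
    }
}
```

In EpubCheck the results would presumably map to message IDs such as OPF_085 and OPF_086 rather than an enum. Like the original, this only checks syntax; it does not consult the IANA registry.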