What does it mean to "support" a foreign resource? #1464
An experience I wanted to share. I prepared my EPUB3 content with "cryptocurrency-blockchain" (JSON). The book is read when logged in with the "reading key" prepared for the reader. ReadiumDesktop and Thorium viewed it successfully. And in Thorium the "bookmarks" feature worked fine. |
Reading system developers: does your RS use manifest fallbacks? Do you check mime types to decide when to use the fallback? |
No. /hides in shame |
The issue was discussed in a meeting on 2021-05-27
2. What does it mean to "support" a foreign resource? (issue epub-specs#1464)
See GitHub issue #1464.
Dave Cramer: When writing some spec tests, one of the foundational aspects is a core media type
Shinya Takami (高見真也): In Japan, in some cases manifest fallback is implemented, but may be domain specific
Masakazu Kitahara: [via shiestyle] Voyagers RS does not support this in RSes, but in some places we do use the feature in Japan
Brady Duga: we have two pipelines at Google, 1 for publishers, and 1 for people sideloading
Wendy Reid: Don't think it is supported by Kobo
Brady Duga: dauwhe can you make your sample epubs available?
Dave Cramer: yes, i'll let you know where
Dan Lazin: I have some tests that are not checked in, because I don't know what the proper behavior is supposed to be
Dave Cramer: The tests are great, as it points to where we should be investigating
Wendy Reid: Since we now have several tests without clear passes
Dan Lazin: To pursue, in my test sheet, I have some blank cells so we can add notes there
Brady Duga: this sounds like a problem, but if nobody has implemented manifest fallbacks, then maybe we just say that you can only use core media types in the spine, period
Dave Cramer: Part of this is how epub has changed over time
Ben Schroeter: Where did we leave the first conversation?
Dave Cramer: We have the goal of communicating the status, but where to do that is still an open question
Wendy Reid: Yes!
Dave Cramer: Brady can eat dinner
Wendy Reid: Will reconvene at the upcoming whole hour |
The issue was discussed in a meeting on 2021-05-28
5. What does it mean to "support" a foreign resource?
See GitHub issue #1464.
Continuation of the discussion on the first vF2F meeting
Dave Cramer: came about from testing.
Ivan Herman: So what? 😀
Dave Cramer: write something in DocBook but it may not render in all systems, so you put in the HTML as a fallback. But that didn't seem to happen.
Ivan Herman: is it a problem if a RS displays the data directly? Having a fallback would be a nice idea but maybe we have to accept it. Is it ok to put a DocBook in the spine? Whether I link to it or put it in the spine are two separate issues.
Dave Cramer: RS offering to download some mimetype seems really bad.
Matt Garrish: Why do we care what renders in spine items? Offering guidance.
Wendy Reid: A DMG file is a massive security issue; that's an issue that RS would handle on their side. If a RS detects an executable within an EPUB it should do something. But we don't want to spec that executables be included in the spine; that may open up a can of worms.
Ivan Herman: It must be part of the security section in the RS document: if you are a RS, be careful about downloading a binary file. EPUBCheck could react to this. But the manifest fallback is ignored anyways… so it may not make sense there. EPUBCheck may not complain.
Brady Duga: Yes, core media types are important and maybe the RS are already supporting them and not worrying about fallbacks.
Matt Garrish: I have no issue getting rid of fallbacks. Spine XHTML / SVG, but we could make it a formal allowed spine item. But there could be neat stuff and our fallbacks don't work. We are just calling it a failed feature of EPUB. We don't really want to have them all in the spine, just the two of them.
Wendy Reid: agreed.
Brady Duga: like ChemML, and you may want to use those or have a fallback with a static page, but everyone just uses HTML & CSS, so I think we weren't originally sure but we didn't want to limit people.
Matt Garrish: Manifest fallbacks with replacements for HTML: there are some really bad manifest fallbacks and HTML has improved to solve that, we have the picture element etc. So it sounds like it's all irrelevant.
Ivan Herman: spine restricted to XHTML & SVG, and EPUBCheck would shout at me, so that's ok. Not having a fallback there, EPUBCheck will shout at me. So for files which I refer to from HTML, maybe something like MusicML, they may have their own special renderer, but they should be able to use that without, … so what would be the fallback for MusicML? So I link to HTML for valid reasons and I render it via some extra scripts; I cannot rely on fallbacks since I can't rely on something else doing this for me.
Brady Duga: I was going to do some more research on the Play Books side.
Wendy Reid: ingested versions vs. sideloaded.
Matt Garrish: wonders, by restricting to XHTML/SVG are we deprecating? Can't remove manifest fallbacks.
Brady Duga: feels like a really big change, I am hesitant to do it. Need to understand the nuances.
Dan Lazin: we talked about things to mark as at risk, add this?
Tzviya Siegman: trying to think about 1000's of backlist titles, we don't update those. If I need to do an update it may be a more significant update, and what a retailer but this may be a breaking change.
Wendy Reid: this doesn't break, just deprecate it moving forward. Legacy content using this is not really working anyways
Ivan Herman: If we say deprecated what does EPUBCheck do today?
Matt Garrish: Issue a warning
Ivan Herman: that could scare off a publisher right?
Matt Garrish: Yes
Ivan Herman: what can we use?
Matt Garrish: "Strongly Encourage"
Ivan Herman: I understand what Brady says. In the meantime, Matt & I can come up with a PR for what that means in the spec.
Wendy Reid: a warning is not a bad thing… if a publisher is scared of that, what are publishers doing? I will look at the warnings or errors; it may be inconvenient but they are looking at them. If they are seeing a warning, it's not a big deal, at least for me.
Tzviya Siegman: warnings and errors: some reading systems won't accept EPUBs even if they only have warnings. Unfortunately they are at the same level.
Charles LaPierre: we had this discussion before. We really need to go to those RS and shame them or inform them to allow warnings, because otherwise we shouldn't use warnings since it's the same thing as an error.
Matt Garrish: we shouldn't resolve anything now, I will work with Ivan to come up with new language on this.
Brady Duga: Dave has some sample books.
Wendy Reid: we can do a little more testing. |
To pick up from the F2F, the specification change here is pretty minimal - we deprecate manifest fallbacks. Done. In terms of impact, however, I see three areas of concern:
That's as far as I've gotten trying to wrap my head around the impacts of a change. Doable, but it will entail a measure of pain for some publishers. |
And I suppose one other (at least theoretical) use for manifest fallbacks was to provide options among content documents. I specifically remember this being introduced as a kind of "no script" option, so if a reading system that didn't support scripting found a content document marked as scripted it could look in the fallback chain for a non-scripted alternative. Similarly, it could look for a document without mathml markup, etc. if it didn't support the technology. I haven't heard of manifest properties being used in this way (pretty cumbersome and duplicative), or of reading systems supporting them for fallback lookup, so I'd be surprised if deprecating would have any effect here. |
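As an illustration of the lookup described in this comment, here is a minimal Python sketch of a reading system walking a manifest fallback chain until it finds a media type it supports. The manifest, the file names, and the `resolve_fallback` helper are all hypothetical; this is not how any particular reading system is known to implement it.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# Hypothetical manifest: a JSON item that falls back to an XHTML document.
MANIFEST = """
<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="data" href="data.json" media-type="application/json" fallback="data-html"/>
  <item id="data-html" href="data.xhtml" media-type="application/xhtml+xml"/>
</manifest>
"""


def resolve_fallback(manifest_xml, item_id, supported):
    """Return the href of the first item in the chain with a supported
    media type, or None if the chain ends without one."""
    root = ET.fromstring(manifest_xml.strip())
    items = {item.get("id"): item for item in root.findall("opf:item", OPF_NS)}
    seen = set()  # guards against circular fallback chains
    current = items.get(item_id)
    while current is not None and current.get("id") not in seen:
        seen.add(current.get("id"))
        if current.get("media-type") in supported:
            return current.get("href")
        # Follow the fallback attribute; a missing attribute ends the walk.
        current = items.get(current.get("fallback"))
    return None
```

A reading system that handles JSON itself would stop at `data.json`, while one limited to XHTML and SVG would end up at `data.xhtml`.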
:-)
At the moment, we have ways of addressing unsupported features: deprecation and legacy. The former means the generation of warning, the latter does not. Labelling fallbacks as 'legacy' is not really a way to go because those refer to EPUB 2.* features. What about introducing a third category, say, "discouraged"? Discouraged features are very close to deprecated ones, except for the last sentence in A.1:
Instead, it could say something like
We can then discuss this with the epubcheck people: by default no warnings are issued for discouraged features, but there may be a separate flag that does result in warnings. Would that be a way to go? |
This is what I'm finding problematic. The example in #1911 has a link to a JPEG image. Currently we require the content author to create an SVG version of that image that will never be used, even if the reading system supported manifest fallbacks. Requiring the fallback here serves no purpose. At the very least, I don't think this makes much sense for core media types that are replaced elements. |
Allowing any core media type in the spine without fallbacks opens the door further to image-only EPUBs. Quasi-restrictions like having to be replaced content are easy to manipulate. You could list all your That's already a failing of fallbacks, of course, as you can make every non-content document fall back to the same HTML page that says "sorry, you're out of luck" to satisfy the requirement. (Or just use fallback HTML/SVG wrappers without alt tags or descriptions.) Plus it's not like fallbacks satisfy WCAG, either. If you use them and do make accessible alternatives, you can't claim your content is accessible because there's no way for users to choose whether they want the image or its alternative. And that assumes there is even widespread support for fallbacks, which there isn't, so that also makes their use an immediate failure. Fallbacks are kind of useless however you look at them, so I have no issue with them disappearing. The question seems to be how far we want to go in the direction of allowing anything in the spine. We can keep pushing the door open a little bit at a time, or we can go all-in and allow any CMTs. For accessibility, we ultimately have to rely on education, legislation and other means to push publishers to produce accessibly, anyway. Ebooks are mature enough now that novels published as a series of image-only pages isn't terribly realistic anymore. Publishers seem generally aware of the problems of image-based content for accessibility, reading on mobile, etc. |
What if we loosen the requirement that hyperlinked resources must be in the spine? For instance, we could say:
That would keep the requirement that Content Documents are in the spine if they're reachable from top-level documents, and yet allow things like Would that be reasonable? |
But that leads back to the problem of reading systems not being able to locate where the user is in the spine - what comes next, what do you go back to, how do you handle bookmark or annotation attempts, etc. |
In a scholarly book it should be reasonable to have a hyperlink to a data file, javascript code, or a chemical formula described in an XML file (with these contents being part of the EPUB). What should we do about those? I think that the fallback mechanism is o.k. for them, although I am not sure it helps anything to have these resources also appear in the spine (albeit with |
FWIW, Play Books does look at the fallback chain and will ignore images directly in the spine unless they are SVG. We can of course loosen that, but then we have to start deciding how to style all these random images in the spine. Why are we considering this change in this version of the spec? Are we concerned that there just don't exist two implementations that actually support fallbacks? |
How serious are we about deferring to the HTML spec to describe how reading systems handle various types of content? The spec covers:
So it makes sense that XML files, text files, or PDFs don't trigger fallbacks because a rendering engine knows what to do with them. But that's not true for DMGs, .exe files, etc. |
@dauwhe I think the first two items and the third are different.
I think it would be possible to spec this better, but I also think that expanding on EPUB and RS in this direction would make a lot of sense. (Thanks for pointing out that the HTML spec has sections for these; this makes our job much easier!) |
Interesting: w3c/epubcheck#1298 |
The issue was discussed in a meeting on 2022-03-11
1. What does it mean to "support" a foreign resource? (issue epub-specs#1464)
See GitHub issue epub-specs#1464.
Dave Cramer: given that CR is approaching we want to work through our remaining issues.
Dave Cramer: similar thing might happen if you try to put XML in spine, where some RS will give you the browser style tree view.
Dave Cramer: we have something of a conflict between spec and reality, so what do we do? (if anything).
Brady Duga: fallbacks are at the resource level, but you still have to use the resource correctly.
Dave Cramer: say you are doing a book about javascript and you want to hyperlink to a JSON file.
Brady Duga: little weird to hyperlink to JSON in spine when author could style JSON text in their style to make it look correct visually.
Dave Cramer: we had once envisioned special purpose RS that could display docbook format, for example, but where the same epub could still be displayed reasonably in non-speciality RS.
Ivan Herman: re. question about whether fallback should be used - strictly speaking, your test follows what the spec allows.
Matt Garrish: Not sure if opening up the spine is a good idea now.
Ivan Herman: We can make it deprecated, or say it is under-implemented.
Dave Cramer: Can we avoid throwing away the concept, but make it clear RSes will never show the fallbacks?
Rick Johnson: Is this a case for some standard language to say there are not two implementations and it is dangerous to use?
Ivan Herman: Yes, under-implemented is our current term.
Zheng Xu (徐征): From the implementor side, how to support foreign resources is passed off to the web engine?
Matt Garrish: If we put a big scary box in the manifest section it will help, but how do we tell people?
Brady Duga: we walk the fallback chain for something that should be displayed like html or svg.
Dave Cramer: It has been common to wrap things in
Ivan Herman: What does the html spec say about images like that?
Tzviya Siegman: There is also linking with iframes that has been done in epub. It is clumsy, though. Agree with Ivan, we should allow what html allows.
Dave Cramer: [Reading processing model for non-html content].
GeorgeK: This was presented as an a11y issue, but don't see any real a11y benefits here.
Ivan Herman: The big issue here is a .dmg or .exe file being linked.
Brady Duga: I don't think that security statement makes sense.
Dave Cramer: There is a distinction between downloading as a RS, vs as an end user clicking on a navigation link that allows me to download.
Matt Garrish: Are we saying we should allow anything in the spine?
Dave Cramer: I don't think we are talking about opening the spine.
Ivan Herman: I agree.
Dave Cramer: But we still have the existing epubs that have fallbacks.
Ivan Herman: We may have to separate two situations. Let's put the spine aside for now.
Matt Garrish: These aren't different cases - if it is linked to it must be in the spine.
Ivan Herman: What if it is outside the epub?
Matt Garrish: In that case it opens in a new browser context.
Dave Cramer: Say you click on that image that isn't in the spine, then we have all the nav issues (how do you go forward, bookmark, etc).
Matt Garrish: We have looked at pop out content, but have never gotten that far with it.
Ivan Herman: Why does it have to be in the spine?
Matt Garrish: To avoid the nav issues.
Brady Duga: this is related to
Ivan Herman: it's in the spine but
Brady Duga: it has to be in the spine to link to it.
Tzviya Siegman: We have had this conversation after about every revision of the spec.
Charles LaPierre: One thing a browser has is a back button.
Zheng Xu (徐征): We have a back button, but then there is still the issue with bookmarks.
Dave Cramer: History and html (ie back buttons) is really complex.
Zheng Xu (徐征): for this issue, is the question how we can write a test?
Matt Garrish: Maybe there is some discussion to resurrect about the target of an
Dave Cramer: I don't really know what epub without fallbacks looks like.
Ivan Herman: we must solve this before CR....
Dave Cramer: RSes understand JSON.
Brady Duga: from a CR perspective a test for this is JSON in the spine, saying that if you see this it means the RS supports JSON.
Dave Cramer: And that is basically the test I made.
Brady Duga: we are trying to define support, when the reading system should define it.
Dave Cramer: Agree.
Ivan Herman: We don't want to say RSes should support types html says they should.
Brady Duga: it's up to the RS to decide if they're gonna use a fallback.
Dave Cramer: And no RSes would display the fallback.
Ivan Herman: Isn't it correct that all RSes that are newer and rely on browser cores would display json correctly?
Brady Duga: fallbacks are not just because RSs can't display something.
Dave Cramer: Want time to discuss CR. |
I try to see some specific ways forward from the current situation, also based on our discussion on the call.
Fallback optional?
In line with reality (following the "paving the cow-path" approach of some specs) would it make sense to reduce the obligation of fallback chains from MUST to SHOULD? At first glance, the necessary spec changes would be:
We could be more stringent and leave (2) as a MUST, but I am not sure whether that will fly in terms of implementations.
Align with the HTML spec
As has been said, the HTML spec actually has a definite way to display text files and/or XML files. Our current approach of requiring fallbacks for those therefore means a further restriction on HTML. To alleviate this we could:
Note that, interestingly, JSON files do not fall under any of these, nor does HTML seem to say anything about them, because the media type for json is
We may want to spawn two separate issues here, and close the current one. I think that @bduga has given an answer to the original question of the issue regarding the tests... |
Another question to ponder: what does it mean to fall back to a supported media type? Consider you embed some shiny new image format into an HTML document in an
You can satisfy both cases using manifest fallbacks, but reading systems apparently are supposed to follow the order created by the author in the absence of a properties attribute with more information (and there are no properties for this situation). So say you satisfy the spine requirement first and in the manifest make the image fall back to an XHTML content document. Then to satisfy the
Foreign image -- falls back to --> xhtml -- falls back to --> jpeg
Does this mean a reading system is supposed to use the xhtml in the
Are reading systems supposed to be aware of what formats make sense in what elements? Are authors supposed to be aware that they have to craft the fallback chain to compensate for dumb reading system replacements and fall back to the image before the xhtml?
This obviously isn't a big problem in reality, but just another case of how manifest fallbacks are a flaky idea for HTML. My understanding is that manifest fallbacks were introduced to work around the lack of fallbacks for
Maybe even label that use a "legacy" feature only meant to support compatibility with EPUB 2? |
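The ordering problem raised in this comment can be made concrete with a small Python sketch. The chain mirrors the "foreign image, then xhtml, then jpeg" order above; the file names, the made-up `image/x-hypothetical` type, and the per-context type sets in `USABLE_IN` are assumptions for illustration only, not something the spec defines.

```python
# Hypothetical chain in author order: foreign image -> XHTML -> JPEG.
CHAIN = [
    ("figure.newimg", "image/x-hypothetical"),
    ("figure.xhtml", "application/xhtml+xml"),
    ("figure.jpg", "image/jpeg"),
]

# Assumed media types that make sense in each usage context.
USABLE_IN = {
    "spine": {"application/xhtml+xml", "image/svg+xml"},
    "img": {"image/jpeg", "image/png", "image/gif", "image/svg+xml"},
}


def first_supported(chain, supported):
    """Naive walk: take the first item, in author order, with a supported type."""
    for href, media_type in chain:
        if media_type in supported:
            return href
    return None


def first_usable(chain, context):
    """Context-aware walk: only accept types that fit the usage context."""
    return first_supported(chain, USABLE_IN[context])
```

With a reading system that supports both XHTML and JPEG, the naive walk picks `figure.xhtml` even when the resource was referenced from an image element, whereas the context-aware walk picks `figure.jpg` there and `figure.xhtml` for the spine. Nothing in the manifest tells the reading system which of the two behaviours the author expected.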
I would definitely not have sleepless nights if we did that... |
@mattgarrish do you want to propose a PR for #1464 (comment) or should we leave it as for now (the CR might force us to label fallbacks as under-implemented anyway...)? Regardless, I think this issue may be now closed. Cc @dauwhe |
I wonder if it might help to split the uses and talk about them separately. Right now, you get a mix of requirements that apply when the fallback chain is for the spine and requirements that apply for elements lacking intrinsic fallbacks. It might give some room to caution about deeper problems like I mentioned in that comment, too. I don't think it would hurt to caution against the HTML use, too, but we unfortunately can't actually call it legacy when I think about it more. That would mean epub 3 reading systems would have to stop supporting it and it would mean any content that has relied on it would become invalid (if it's a legacy feature, then it wouldn't count as providing a CMT anymore as far as EPUB 3 is concerned). |
But closing this issue as it's not directly related. |
One related question ... if manifest fallbacks are discouraged, and given the html object tag, could you not create the equivalent of the manifest fallback by using an xhtml file in the spine, and in that file use an object tag to load the foreign resource (in my test case an embedded pdf file) and then a second child object tag to load the fallback xhtml resource. This approach seems to work properly in a number of e-readers that do or do not support the foreign resource, it uses html object tag fallback following the whatwg latest spec, etc., requiring no epub3 specific support. The only issue current epubcheck has with it is that the fallback object data url is ignored by epubcheck, so the fallback html file can never be in the spine with linear = no since epubcheck thinks there is no link to it. Maybe just remove manifest fallbacks completely from the spec (deprecate them) and tell users to use the pure html object tag fallback mechanism instead, with the added benefit of the spine always being pure xhtml files. |
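For readers who want to see the shape of the nested object approach described in this comment, the following Python sketch generates such a wrapper document. The helper name `build_object_wrapper`, the file names, and the fallback message text are hypothetical; the string it returns is the markup of interest, and it has not been checked against epubcheck.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialize XHTML elements without a prefix (default namespace).
ET.register_namespace("", XHTML_NS)


def q(name):
    """Qualify an element name with the XHTML namespace."""
    return f"{{{XHTML_NS}}}{name}"


def build_object_wrapper(foreign_href, foreign_type, fallback_href):
    """Build an XHTML spine document whose outer object loads the foreign
    resource and whose nested object loads the XHTML fallback."""
    html = ET.Element(q("html"))
    head = ET.SubElement(html, q("head"))
    ET.SubElement(head, q("title")).text = "Embedded resource"
    body = ET.SubElement(html, q("body"))
    outer = ET.SubElement(
        body, q("object"), {"data": foreign_href, "type": foreign_type}
    )
    inner = ET.SubElement(
        outer, q("object"),
        {"data": fallback_href, "type": "application/xhtml+xml"},
    )
    # Last-resort text if neither object can be rendered.
    inner.text = "This content could not be displayed."
    return ET.tostring(html, encoding="unicode")
```

Calling `build_object_wrapper("report.pdf", "application/pdf", "report-fallback.xhtml")` yields a document in which a rendering engine that cannot display the PDF would, per HTML's object fallback behaviour, try the nested XHTML object next.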
The option of removing fallbacks completely from the spec came up several times. However, per the charter of our group, we were forbidden to remove standard features (inherited from EPUB 3.2). More exactly, it was a requirement that any valid EPUB 3.2 publications should remain valid in terms of EPUB 3.3 (the goal, obviously, that the new version should not "disrupt" any deployed EPUB publications). So, as a measure of caution, we kept fallbacks (which also got some implementations). The deprecation, possibly adding some text referring to HTML object tags, etc, is something that could be on the plate of future work in the maintenance Working Group that we are planning. |
Thank you for possibly considering it in the future. The nested object tag approach will currently allow foreign resource fallback to xhtml even in browsers/e-readers that do not support manifest fallbacks for spine items. Useful given our testing shows very few e-readers support opf manifest fallbacks at all for items in the spine. |
This may have to be looked at, as it may require some slight modification on the spec text (I have not checked). At present, the spec is soon going to the final round of becoming a Recommendation (W3C jargon for standard). Maybe it is worthwhile to raise an erratum once the Rec is published to look at this (probably minor) issue in the spec and in epubcheck |
The current epubcheck has similar issues with the Nav being in the spine with linear = "no" though all epub3 ereaders properly process the Nav and provide their own interface for accessing it. This is an issue as ebooks try to hide the Nav when providing a pure html based TOC to the reader but the Nav itself can and does link to itself as a landmark meaning it must be in the spine (again according to epubcheck). I am not sure what the 3.3 spec says about either of these recent changes to epubcheck. If you want I would be happy to open specific epubcheck issues for both cases with sample code if these are truly issues that you want or need to have tracked. Just let me know. |
This goes back to a change made in EPUB 3.1 that requires that all non-linear content in the spine be linked to so that users can reach it (in case non-linear content is suppressed by the reading system). I expressed reservations about this in w3c/epubcheck#1451 (comment) when epubcheck was updated, and it was raised again in w3c/epubcheck#1488, because it requires using the landmarks nav to satisfy the linking requirement. It's workable for the cover and toc because we have semantics for them, but not for any generic non-linear documents. I'm fine opening a new issue about this, but I don't think it's something we're going to solve before going to PR. I'd take this up in the maintenance group after we're done. |
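The linking requirement described here can be modelled roughly as follows. This Python sketch is a simplified reading of the rule as stated in this comment (every non-linear spine item must be the target of some hyperlink), not epubcheck's actual implementation; the function name and the input shapes are assumptions.

```python
def unlinked_nonlinear(spine, links):
    """Return the non-linear spine items that no document links to.

    spine: list of (href, is_linear) pairs in reading order.
    links: dict mapping a source document href to the set of hrefs it
           hyperlinks to (including landmarks nav entries).
    """
    linked = set().union(*links.values())
    return [href for href, is_linear in spine if not is_linear and href not in linked]
```

For example, with a non-linear `cover.xhtml` reachable from the landmarks nav and a non-linear `notes.xhtml` that nothing points to, only `notes.xhtml` would be reported, which matches the situation where generic non-linear documents have no obvious place to be linked from.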
Interesting ... but an xhtml file with an object tag with a data url pointing to another xhtml file (one listed in the spine with linear = no since it is acting as a fallback) should certainly be classified as a "link" to that file for the purposes of epubcheck meeting the 3.3 spec, shouldn't it? |
No, because once you put it in the spine it becomes part of the content and not strictly a fallback. This takes us out of fallbacks and into the thornier issue about whether non-linear content is meant to be rendered as part of the default reading order or not. Some reading systems will render all non-linear content where it is placed in the spine, some will not render it. The requirement to add links grew out of that impasse: if some reading systems aren't going to render the content, then there must be some way to reach it. You don't normally want to reach a fallback, however; only if the reading system doesn't support the foreign resource you wanted to render. Otherwise, you end up with a situation where the reader can encounter that fallback twice: once because the reading system couldn't support the foreign resource and a second time because it's in the spine. That's why manifest fallbacks weren't listed in the spine but were chained together using an attribute. To the case you're suggesting, though, I assume this is what you want to do:
In that case, there's no need to put the fallback in the spine. It's just another embedded resource. The only time it's required to be a non-linear item in the spine is if you hyperlink to it:
There are cases like cover pages, though, where there aren't easy workarounds like this. If these are in the spine, and nothing links to them, then unless you can find an out-of-band means of linking to that page, like the landmarks nav, then you're stuck with an epubcheck error. |
I'm not optimistic about this. The only use of manifest fallbacks I've heard of is to allow images in the spine exactly so they don't have to be wrapped inside of an html file -- going back to other long discussions about what is allowed in the spine without fallback. Deprecating them for a solution that requires wrapping the images in HTML might not go over well (unless we find out that use has died). |
Agreed. But nothing in the manifest fallbacks in the current spec limits it to just images ... so it applies as well to any foreign resource used in the spine, doesn't it? And given the general lack of e-reader manifest fallback support for spine items, why not encourage an html spec compliant fallback approach since it actually works and requires no special epub3 only structures or support. That way both epub users and epub developers have a way to include any foreign content in an epub with real working fallback in current e-readers with no changes needed. And fwiw, I do think the Nav document should be allowed to have linear=no as the last item of the spine without an additional link to the Nav being provided because access to the Nav is guaranteed by all e-readers, and many Nav documents local link to themselves via their own landmark section thereby requiring them to be in the spine in the first place, making it all a bit circular. Thanks for listening and considering. And for pointing out my long held definition of linear = no is not universal by any means (or even well understood by me!) |
Right, but I don't think they've been used much (at all?) outside of images. I'm not even sure if images are used in the spine much. There was a push to enable manga/comics as EPUBs a number of years back that led to some long discussions about what is allowed in the spine. I only point that out because it might make deprecating manifest fallbacks complicated. If they have been used for images, then we'd probably get pushback to deprecating them. If they haven't, then using intrinsic html fallback methods, whatever they happen to be (object, picture, etc.), is always better.
You're not alone. It's probably the most confusing feature in all of epub... 😕 |
So, I wrote a test for manifest fallbacks. I made a JSON content document (media-type="application/json") with an XHTML fallback. Apple Books, Thorium, and Calibre display the JSON directly. ADE 4.5 crashes on opening the file.
I suspect this is because browsers/web views will try to open JSON and render it as text.
Then I made an EPUB with an XML content doc (media-type="application/dtc+xml"). Apple Books said the book was corrupt, although there were no EPUBCheck errors. When I opened it in Thorium, I was presented with a dialog box allowing me to download the XML file. Calibre rendered the XML as text.