-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spec clarification: Buffer views without buffers, buffers without URIs and GLB #1684
Comments
This test is based on under-defined assumptions, so it should not be seen as a reference.
That doesn't require changing the core spec. The question is whether required extensions can make pre-existing mandatory properties optional. /cc @bghgary @donmccurdy |
One possible way of resolving this cleanly then could be:
One issue with this is that the validator won't be able to validate files without knowing the semantics of the required extension - if I try to use this right now, validator (rightfully) complains that "Property 'buffer' must be defined". If we changed the core spec to specify that buffer in bufferView optional this could be resolved. |
I'm not sure about this. If we don't change the schema, then any implementation that uses the schema to validate will fail even if the extension is required. But changing the schema is a breaking change that shouldn't be allowed. Maybe point the unused buffer to the same buffer as the extension? |
That would be my thought, too. For the "unconditionally compressed data" workflow:
That's a (small) amount of redundancy, and obviously requires recognizing the extension, but for this workflow the extension is required anyway. |
The issue with this is buffer lengths - uncompressed data is larger than compressed data, so in some cases you can end up in situations where the entire buffer is smaller than each individual uncompressed chunk. As a result, the file will fail validation because byteLength in a bufferView is larger than the byteLength of the referenced buffer. |
There's no uncompressed data in this scenario, though – there is no need to specify the byteLength the data had before compression, use the byteLength of the compressed data instead. Validation will fail, unless/until the validator recognizes the extension, but that can be true for any required extension. |
You do need to specify the uncompressed data length - this is necessary to create the GPU buffers with the right size (byteLength in the extension example above refers to the compressed representation of data). But separately, if it's okay for validation to fail, this would suggest that the required extension makes a previously stated validation requirement not crucial. If this is possible, I don't see the difference between this and dropping the buffer reference? (e.g. if it's okay for byteLength in the bufferView to be non-sensical wrt the referenced buffer, why is it not okay to omit the buffer reference in the first place - both are violations of the spec?) |
Hm I think I misstated this. The presence of a required extension doesn't permit any-and-all validation to fail, it just means the validator may not be useful, and ideally we'd avoid that.
It wasn't my intention say the byteLength should be nonsensical, I'd missed that you wanted to preserve a reference to the size of the uncompressed data so that GPU buffers could be pre-allocated. For the earlier question, whether required extensions can make pre-existing mandatory properties optional, I don't know. It might be OK, but I share @bghgary's concern, and would prefer to avoid this if we can find a way around it. To suggest another alternative, keeping the
In that case:
If we're concerned about the redundancy, most of the properties of the extension could be made optional overrides of the base bufferView, so that this would be functionally equivalent to the version above:
|
The core issue with bufferView.bufferLength is that it's also important to keep accessors valid - accessors (rightfully!) validate that they request a correct (accessible) range from the buffer view they reference. So I don't see a clean way of avoiding specifying bufferLength on the bufferView. |
Right, here's what I'm trying to achieve:
Obviously if pt1 is completely off the table I'm fine with breaking validator, and possibly adding support for this extension to validator later, etc. But this creates confusion - e.g. if validator produces lots of errors, people will report bugs (they already have!), etc. So the optimal case is that validation is clean short of "this required extension isn't recognized by the validator". |
To focus the discussion I have reworded the original post by adding the "Alternatives: omitting the buffer reference?" section instead of "Alternatives" (more clearly spelling out what we could do), and adding "Alternatives: faking bufferLength?" to carefully explain why we can not reuse the same buffer or lie about the bufferLength. |
Ok, yeah it's tricky to make this clean. 😅 Actually feels a lot like the Draco extension discussions that ended with |
Unless I'm misunderstanding, Draco approach doesn't work well because it requires per-accessor specification. It is extremely redundant in terms of JSON size, and prevents efficient decoding directly into GPU memory. |
Well, let's see if this fits your goals. Suppose:
Example: {
meshes: [
{
primitives: [ { attributes: { POSITION: 0, NORMAL: 1 } }
extensions: { MESHOPT_compression: { bufferView: 0 } }
}
],
accessors: [
{ bufferView: undefined, byteOffset: 0, ... },
{ bufferView: undefined, byteOffset: 1024, ... },
],
bufferViews: [
buffer: 0,
byteLength: 2646,
...
extensions: {
MESHOPT_compression: {
byteLengthUncompressed: 136336,
mode: 0,
count: 34084
}
}
],
...
} There's a bit more indirection here — you can't load the accessor without knowing the mesh from which it's referenced. That's true of Draco as well, and a couple developers have complained about it, but as far as I know it hasn't actually been a barrier to adoption. I believe something like that would keep your accessors and bufferviews valid and enable decoding into GPU memory. I'll defer to @lexaknyazev on whether I've trampled on any schema requirements at this point. 😇 |
The extension can be used to compress any buffer view - indices, geometry, animation data. So this would require overriding accessors everywhere where accessors can be specified. This is dramatically increasing the scope of the extension, as well as implementation complexity - the Babylon.JS loader right now is ~30 lines of meaningful JavaScript code and I would really like to keep it that way: https://github.com/zeux/meshoptimizer/blob/master/demo/babylon.MESHOPT_compression.js#L18-L47. Additionally this substantially increases the amount of JSON overhead which can matter in some cases, such as scenes with animation where each node needs several accessors. |
Oof. That is elegant, and I congratulate you on it, and this ruins my tidy ideas above. 😣 I think it comes back to your five questions about semantics of dummy buffers, then. I don't know all of those answers but here's an attempt:
These two appear to be underdefined now. We could probably clarify this in the spec without considering it a breaking change, or your extension could impose additional restrictions.
Can the later dummy buffers omit the URI? The spec allows that.
This is left undefined; I would expect nothing useful of it. :)
I would strongly hope that if multiple BIN chunks are allowed in the future, they'll be referenced by a more explicit mechanism than sequential buffers omitting the URI. So let's say yes. I'm half tempted to suggest a property on the dummy buffers too. Not that implementations would strictly need it, but just to make the whole thing less confusing to see. Like: "buffers":[
{"byteLength": 1024 },
{"byteLength": 4096, "extensions": { "MESHOPT_compression": { "empty": true } } }
] |
That's the question! :) This started as a validator bug (KhronosGroup/glTF-Validator#121) and the rest is history. I suspect that the most frictionless course of action is to:
I was actually thinking about this as well yeah. This wouldn't serve any purpose normally but if, say, a loader is honoring the extension, it can use this to say "hey I don't actually need to load the buffer eagerly" (all JS loaders load buffers lazily but in theory you can load eagerly). |
Thinking about this further, the answers to
really must be "yes". Imagine a glTF file with an arbitrary mix of dummy buffers and buffers with a URI. Given a tool that takes the glTF file and wraps it into a GLB file without a BIN chunk, it really needs to be the case that the buffer arrangement becomes valid. Given a tool that takes one of the buffers with a URI (say, first one) and converts it to a BIN chunk, there's no reason to expect other buffers to become invalid. In other words, since dummy buffers are clearly valid, albeit slightly strange, in glTF files, they really must be valid with the same semantics in GLB files as well, modulo the first one. I also wholeheartedly agree with the desire to specify the mapping more explicitly so let's assume that it doesn't matter how multiple BIN chunks behave - if/when this happens, buffer can get a "glbIndex" attribute or some such. Thus I propose that we amend the spec, particularly the "GLB-stored Buffer" section, to say: (existing)
(addition)
This should resolve the validation issues with minimal and hopefully uncontroversial change. (I will also note that the spec technically allows the first buffer that refers to the GLB BIN chunk to have non-empty buffer.data property but let's not fix it here...) |
I'd agree with that change, and perhaps go a little further to say What do you mean by |
Sorry - I got my wires crossed thinking about buffer specification modes. Your correction sounds good as well. I will draft a PR to this effect (if there are objections to this change I’m happy to discuss alternatives but this seems safe and conservative). |
This requires some context so bear with me.
Original use case
I'm working on a new extension that allows glTF/GLB files to specify compressed buffer views, using a custom compression technique optimized for typical geometry/animation data. It's really important that the compression is applied at the bufferView level - applying it at the buffer level conflates data with multiple strides / meaning and necessitates decompressing the entire buffer into memory at once serially; applying it at the accessor level makes it impossible to upload bufferViews to the GPU efficiently, and inflates the JSON blob.
There are two important use cases I'd like to cover with this extension:
To achieve this, I'm trying to use the usual technique of extension specifying "override" data for bufferView - here's an example for usecase 2:
As you can see, there are two buffers - main buffer with compressed data and fallback buffer with uncompressed data. The bufferView refers to buffer 1 (fallback), but compression extension overrides this - if the loader supports the extension, there will not be a need to load the fallback buffer at all. Commonly used GLTF JS loaders use lazy buffer loading, loading buffers as they parse buffer views that refer to them, so this works great.
When data like this is stored in the GLB file, the JSON blob is the same except that the first, main, buffer doesn't have a URI - this also works well.
What doesn't work quite as well is the "data is unconditionally compressed" workflow - I'd like to be able to generate a file without fallback buffers (with extension marked as required).
Dummy buffers
The cleanest solution would have been to omit the buffer reference in bufferView for this case, but glTF spec doesn't allow that - buffer reference is mandatory even in presence of extension. In #1680 it was suggested that a buffer without a URI can be used instead. I've implemented it and it works well in practice - the example above turns into:
Which is basically the same modulo the URI reference. In practice, loaders won' try to load the fallback buffer since all references are through bufferViews that have the required extension, so it seems fine and the spec doesn't require "uri" or "data" to be present. However, the current validator doesn't like this technique when used in GLB file. I tried to fix it in KhronosGroup/glTF-Validator#122 but this breaks some existing tests in the validator and it's not clear whether they are spec-conformant.
Dummy buffers and GLB files
The issue is of course that the first buffer without a URI when used in a glTF file is "special" - it refers to the BIN chunk. The validator "helpfully" points out that if a buffer without a URI is not the first buffer, it might be the case that the file is encoded incorrectly.
When the file with required compression (no fallback) is encoded in GLB, I currently produce this:
However note that the semantics of the two buffers is actually quite different.
The first buffer is a BIN chunk buffer reference - it contains compressed data.
The second buffer is a dummy fallback buffer - it is not backed by any data, and it only exists to be referred-to from bufferViews that should never use this reference because they have a compressed view and the extension support is required.
What is the exact semantics of dummy buffers wrt GLB?
Thus, there are a few questions that would be good to resolve to understand the interactions here.
If a GLB file has a BIN chunk, does the first buffer have to refer to it? The specification doesn't mandate this. It's probably not very sensible to not refer to the BIN chunk, but this is not explicitly stated as far as I can tell. For example, is it valid to have a BIN chunk, and have no buffers defined in the file?
Since a GLB file does not have to have a BIN chunk, is a GLB file without a BIN chunk valid if the first buffer in the JSON content is a dummy buffer (without URI)?
If a GLB file does have a BIN chunk, and the first buffer refers to it, are all other dummy buffers valid with the same semantics that they have in glTF files?
Also, what is the semantics of a buffer without a URI or data or BIN chunk? :) What should loaders do if they do try to load data from a buffer like this?
In the future, allowing for multiple BIN chunks seems worthwhile. Are the answers to the questions above going to be compatible with this?
As a concrete example, validation suite has a test that asserts that the following JSON, when encoded into a GLB with a BIN chunk, is invalid:
But is that really true? Sure, there's a BIN chunk, and the first buffer clearly doesn't refer to it - but the validator assumes that the second buffer does and I don't see why that is. This file looks valid to me even if it's odd.
Alternatives: omitting the buffer reference?
Please note that I'm not strongly attached to the concept of dummy buffers in the first place. I would be happier in fact if I could omit the buffer reference from the bufferView. However, that requires changing the core spec - is it possible to do this at this point?
Although the semantics of GLB <-> dummy buffer interaction still should be clarified, if we can change the spec to make the bufferView -> buffer reference optional (noting that in absence of a buffer reference, a required extension is expected to specify the data source), that would make the output files cleaner and less ambiguous.
If we do this we could:
This would relax the spec without making it too lax.
Alternatives: faking bufferLength?
If the .buffer property is sacred, then we could point it at the compressed buffer instead of a dummy buffer. However, this has a problem wrt bufferLength value. For the sake of the example below, let's assume that the scene just has one bufferView (very realistic! it can be an unindexed point cloud with just point positions), and that the compressed data is smaller than uncompressed data. In this case, the buffer with compressed data will be smaller than the bufferLength implied by the accessor. Now let's follow the spec:
"Each accessor must fit its bufferView, i.e., accessor.byteOffset + STRIDE * (accessor.count - 1) + SIZE_OF_ELEMENT must be less than or equal to bufferView.length." => therefore bufferView length (presumably .byteLength is implied here) must be that of the uncompressed data, or larger.
"A bufferView represents a subset of data in a buffer, defined by a byte offset into the buffer specified in the byteOffset property and a total byte length specified by the byteLength property of the buffer view." - while this technically doesn't strictly say that a buffer view must fit its buffer, it seems that this was considered obvious through "subset" wording. Validator agrees (bufferViewTooLong error is generated otherwise). Therefore, the buffer.byteLength must also be that of the uncompressed data or larger.
As a result, we can not fake bufferLength of either the bufferView or the buffer and we need a dummy buffer or we need to cut the bufferView.buffer reference.
The text was updated successfully, but these errors were encountered: