glTF roadmap - what would you like to see next in glTF? #1051

Open
pjcozzi opened this issue Jul 30, 2017 · 214 comments

Comments

@pjcozzi
Member

pjcozzi commented Jul 30, 2017

Hi all - please chime in with any and all feedback to help drive the direction of glTF beyond 2.0. Even simple +1/-1's for topics are appreciated.

How much should we focus on building out the software ecosystem vs. moving forward the spec?

Should new spec features come in as extensions first, direct spec updates, or some combination? What mix?

Ecosystem Tools

What glTF software do we have the most immediate need for?

Conformance

  • How do we ensure a robust interoperable ecosystem?
  • How rigid and formal should conformance testing be?
  • What software tools are needed beyond the glTF Validator?

Infrastructure

Learning Material

  • What additional tutorials, sample models, and other learning material are needed?

Potential spec updates / extensions


Really looking forward to continuing to move the field forward! Can't wait for your input!

@alteous

alteous commented Jul 30, 2017

The 2.0 specification is solid. I'd much prefer to see the software ecosystem building out than to see a glTF 3.0 specification.

@realvictorprm

Is it somehow possible to get a standard for Voxel data?

@toji
Member

toji commented Jul 30, 2017

Agreed with @alteous that the glTF 2.0 spec is feeling pretty good at the moment. It addressed some real issues that emerged with the first spec in a very direct and useful way. I'd prefer to see new features emerge as extensions at this point rather than putting too much effort into a 3.0.

I would like to see an emphasis on tool ecosystem improvements. Especially in terms of exporter and other pipeline support. At the moment finding tools that meet a variety of needs is a bit of a coin toss even with some of the Khronos-backed tools. (I recently pushed a small change to the Blender exporter for this exact reason.) One nice thing to have available would be a collection of sample export targets similar to the fantastic set of sample models: A bunch of Collada, OBJ, FBX, etc files that all have different properties which tools developers can use to sanity check their exports. (Did the model with vertex colors come across right? Did the fact that this one has no UVs screw me up?)

Another thing that I'd like to see discussed is a way to verify/sign that a glTF file has specific properties that make it safe to use in a security-sensitive environment when loaded from a third party. (For example: Oculus mentioned using glTF files as favicons in a VR browser.) It's easy in that type of environment to say "We don't support extensions, animations, etc." but it's harder to usefully say "We won't attempt to render objects over 500k triangles" or "We won't load models that are collectively over 10MB" or "We don't support models with more than 100 different materials", which something like a browser would definitely want to do. The idea being that it could be relatively simple for a malicious source to provide a model that contained a few million individual meshes, each made up of a single, transparent, overlapping triangle, each with its own material and a collection of unique 4K textures, intentionally aimed at tanking the client's performance. Ideally a model could digitally verify beforehand that it does, indeed, contain only 3 meshes with a mere 8k triangles with two materials between them that only use a couple of 512 textures, so there's not a GPU in the world that should struggle with that.

That's a Hard Problem™️ , and I can't pretend to even begin to have the answers for it, but if solved it could make glTF a useful format for a massive range of applications that would otherwise balk at accepting external meshes.
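
To make the budget idea above a bit more concrete, here is a minimal Python sketch of a loader-side pre-check. It only reads the counts a glTF file declares about itself, so it is not the signing or verification scheme asked for above; the `LIMITS` values and the function names are hypothetical, and it counts each mesh once regardless of how many nodes reference it.

```python
# Hypothetical limits a host application might choose; the numbers echo the
# examples in the comment above and are not part of any glTF specification.
LIMITS = {"max_triangles": 500_000, "max_materials": 100, "max_meshes": 1_000}


def count_triangles(gltf: dict) -> int:
    """Count triangles declared by TRIANGLES primitives (mode 4, the default)."""
    accessors = gltf.get("accessors", [])
    total = 0
    for mesh in gltf.get("meshes", []):
        for prim in mesh.get("primitives", []):
            if prim.get("mode", 4) != 4:
                continue  # strips, fans, lines and points are ignored in this sketch
            if "indices" in prim:
                count = accessors[prim["indices"]]["count"]
            else:
                count = accessors[prim["attributes"]["POSITION"]]["count"]
            total += count // 3
    return total


def check_budget(gltf: dict, limits: dict = LIMITS) -> list[str]:
    """Return human-readable violations; an empty list means within budget."""
    measured = {
        "max_triangles": count_triangles(gltf),
        "max_materials": len(gltf.get("materials", [])),
        "max_meshes": len(gltf.get("meshes", [])),
    }
    return [
        f"{key}: {value} exceeds {limits[key]}"
        for key, value in measured.items()
        if value > limits[key]
    ]
```

A host would run `check_budget` on the parsed JSON before fetching any buffers or textures, and would still need to validate that the declared counts match the actual data.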

@Spaxe

Spaxe commented Jul 31, 2017

+1 on "Progressive Geometry Streaming - Fraunhofer POP Buffer, WEB3D_streaming"

@jbaicoianu

@toji and @alteous took the words out of my mouth. glTF seems like it's in a good place with its featureset, but it's been very difficult to find a glTF asset pipeline that works well. Solid tools for getting glTF data into and out of all popular content creation tools would be a great next step. Some of the older tools like obj2gltf and FBX-glTF desperately need updating to the new specs to be usable (obj2gltf seems like it's making good progress on 2.0 lately, but FBX-glTF hasn't been updated in a year and has dependencies on projects which have been renamed and absorbed into other projects).

Very happy to see blender import support is on the list already - export without import has left me feeling a bit hamstrung sometimes, so this is something I'm definitely looking forward to.

@c30ra

c30ra commented Jul 31, 2017

For software ecosystem I would like to see official UE4 import and blender export. glTF is a good candidate to replace fbx, moreover it's open, so it should have better support in all pipelines/software data exchange.

@bwasty

bwasty commented Jul 31, 2017

For conformance, I'd like to see more sample models, covering the whole spec. I already noticed a few missing things while developing my little viewer:

  • indices with component types UNSIGNED_BYTE and UNSIGNED_SHORT
  • color attributes, with all combinations of allowed types (only SmilingFace has one, but it seems to be black)
  • more than 1 texture coordinate set + a material that uses it
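
A coverage gap like the first bullet can be spotted mechanically. The following Python sketch (function names are my own, not from any glTF tool) tallies which index component types a parsed glTF document uses, so a sample-model collection can be scanned for types that never appear.

```python
from collections import Counter

# Component types that glTF 2.0 allows for index accessors.
INDEX_TYPE_NAMES = {
    5121: "UNSIGNED_BYTE",
    5123: "UNSIGNED_SHORT",
    5125: "UNSIGNED_INT",
}


def index_type_usage(gltf: dict) -> Counter:
    """Tally index component types over all primitives of a parsed glTF dict."""
    accessors = gltf.get("accessors", [])
    usage = Counter()
    for mesh in gltf.get("meshes", []):
        for prim in mesh.get("primitives", []):
            if "indices" not in prim:
                usage["non-indexed"] += 1
                continue
            component_type = accessors[prim["indices"]]["componentType"]
            name = INDEX_TYPE_NAMES.get(component_type, f"unknown({component_type})")
            usage[name] += 1
    return usage


def missing_index_types(usage: Counter) -> list[str]:
    """Names of allowed index types that never occur in the tally."""
    return sorted(name for name in INDEX_TYPE_NAMES.values() if usage[name] == 0)
```

Summing the counters over every sample model and calling `missing_index_types` on the total gives a quick list of untested cases.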

@AlbertoElias

Definitely agree that the most urgent issue is improving the ecosystem so that we can all easily make use of what GLTF offers today, which is still a bit too hard to do.

Down the line though, I'd give a big +1 to streaming capabilities, it would be a massive improvement for serving GLTFs on the Web

@KageKirin

KageKirin commented Aug 1, 2017

Hi, I recently switched a company internal project from FBX to glTF (2.0). Here are a few tidbits that I would like to see:

  • a C or C++ reference implementation for creating/modifying assets (especially one that comes with few to no external dependencies)
  • an extension for custom shaders as pre-compiled SPIRV binary buffers
  • an extension for rigid body physical parameters for nodes (weight, joint rigidness, ...)
  • an extension for soft body physical parameters (deformable vertex hull, ...)
  • a compression mechanism for buffers (could be zip or lz4)

Although I agree that the extensions listed above could be easily implemented as custom extensions, I would like to see efforts to agree on a standard for physical data.

Best regards.

@pjcozzi
Member Author

pjcozzi commented Aug 3, 2017

This is really insightful feedback, thanks everyone, keep it coming!

To summarize so far:


a compression mechanism for buffers (could be zip or lz4)

@KageKirin check out the draft Draco extension. The bitstream spec has reached 1.0 so we expect the glTF extension to move forward quickly, #874


@jbaicoianu obj2gltf 2.0 support should be ready any day now, would love your help testing the 2.0 branch: CesiumGS/obj2gltf#67

@reduz
Contributor

reduz commented Aug 3, 2017

Please support multiple animations (#1052). Only supporting one makes it pretty limited (and a hassle) for exporting to a game engine. All modern DCCs (Blender, Maya, Max, etc) support animation clips, and all game engines support multiple animations, so this would make the workflow much better.

@CentaurMare

I've posted both a glTF 2.0 exporter and now a glTF 2.0 importer for Sketchup on extensions.warehouse.com, but both are currently pending approval - not sure how long this process will take.

I'd like to see the specification document improved. I think the minimum and maximum values in the accessors were a difficulty for me, especially as they are conversions from float to string and I was plagued by rounding errors making the validations fail. There seems to be nothing in the document to explain why they are required, other than "there are cases when minimum and maximum must be defined". In the end my minimum and maximum values for normalized values looked like "min":[-1.0001,-1.0001,-1.0001],"max":[1.0001,1.0001,1.0001]
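
One way to avoid padding the bounds with values like 1.0001 is to compute min/max from the values exactly as they are stored (float32) and let the JSON writer emit a round-trippable decimal. The sketch below is in Python and assumes the validation failures came from float-to-string rounding, which the comment suspects but does not confirm; `to_float32` and `accessor_bounds` are illustrative names.

```python
import json
import struct


def to_float32(x: float) -> float:
    """Round a Python float to the nearest float32 value (returned as a double)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def accessor_bounds(values, num_components):
    """Per-component min/max of a flat value list, as stored in float32."""
    stored = [to_float32(v) for v in values]
    mins = [min(stored[c::num_components]) for c in range(num_components)]
    maxs = [max(stored[c::num_components]) for c in range(num_components)]
    return mins, maxs


# Two VEC3 elements stored as a flat list.
mins, maxs = accessor_bounds([0.1, 0.2, 0.3, -1.0, 1.0, 0.7], 3)

# json.dumps writes the shortest decimal string that parses back to the same
# double, and every float32 value is exactly representable as a double.
text = json.dumps({"min": mins, "max": maxs})
```

Because the written bounds parse back to exactly the stored float32 values, a checker that compares them against the buffer contents sees no rounding gap.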

Decomposable matrices were another issue, and a matrix which I thought was decomposable (a matrix built from translate, scale & rotate) would still fail validation - again I don't know if it is because of rounding issues converting from float to string - but it would be nice if the spec included the code to decompose the matrix and what the expected precision and validation requirement would be. In the end I had to give up and just pre-multiply all the geometry by the transformations first, so exporting with identity matrix only for my version 1.0.0 exporter.
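
For reference, a decomposability check can be sketched in a few lines of Python. This is my own approximation, not the validator's actual test: it assumes a column-major 4x4 matrix as glTF stores it, an arbitrary tolerance `tol`, and it treats zero scale as not decomposable. A matrix composed as translation * rotation * scale has mutually orthogonal columns in its upper 3x3 part; applying a non-uniform scale after the rotation generally introduces shear, which is one possible reason a hand-built matrix fails.

```python
import math


def is_trs_decomposable(m, tol=1e-5):
    """Heuristic test that a column-major 4x4 matrix equals T * R * S.

    Checks that the last row is (0, 0, 0, 1) and that the three basis
    columns are non-degenerate and mutually orthogonal (no shear).
    """
    if len(m) != 16:
        return False
    last_row = (m[3], m[7], m[11], m[15])
    if any(abs(a - b) > tol for a, b in zip(last_row, (0.0, 0.0, 0.0, 1.0))):
        return False
    cols = [m[0:3], m[4:7], m[8:11]]
    lengths = [math.sqrt(sum(x * x for x in col)) for col in cols]
    if min(lengths) <= tol:
        return False  # zero scale: rotation cannot be recovered
    for i in range(3):
        for j in range(i + 1, 3):
            dot = sum(a * b for a, b in zip(cols[i], cols[j]))
            if abs(dot) > tol * lengths[i] * lengths[j]:
                return False  # columns not orthogonal, so the matrix has shear
    return True


# 90 degree rotation about Z, scale (2, 3, 4), translation (5, 6, 7).
trs = [0, 2, 0, 0, -3, 0, 0, 0, 0, 0, 4, 0, 5, 6, 7, 1]
# Identity with a shear term added to the second column.
sheared = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]
```

Using a relative tolerance keeps the test meaningful for matrices with very large or very small scale factors.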

Other parts of the specification could have a bit more of a tidy up, e.g. "When nor sparse, neither bufferView is defined..."

@donmccurdy
Contributor

Please support multiple animations (#1052).

I've posted more details on #1052, but this is already supported — our examples do not make it obvious. Definitely +1 from me on changing the example models to not use different animations for each node, and adding some examples that have multiple distinct animations.

@fire

fire commented Aug 4, 2017

What is the status of hdr formats? Openexr would be standard, but I'm open to ideas.

OpenEXR supports both 16-bit and 32-bit floats, which correspond to OES_texture_half_float and OES_texture_float, respectively.

@erwanmaigret

Actually in general the missing bit to me is more about being able to add extensions/plugins. I strongly agree that 2.0 is already very strong and doing a lot, and going too deep may start to make it too specific (let's not make it grow into an FBX-style format).

So my main suggestion for extension would be to introduce some type of standard plugin solution that could allow custom extensions to be added and embedded inside gltf data.

As a simple example, the rigging part is pretty standard today and maps to what you can do in Unity, but if you need a deeper notion of constraints or deformers then you're stuck with only skinning/shapes.

The problem is that plugins imply an evaluation mechanism during playback, which is most often seen as a complex problem. I actually think it's pretty simple if we could have a notion of script-based behavior with embedded js or something like this (call it scripted operators, unity behaviors, or whatever). And avoid the big language debate by making this js-only since it's targeted mostly for web.

And btw, neither FBX nor dot.xsi / KL / ... really tackled this right since they are all biased as companies. We have the luxury to make it right, and this could grow a huge community of developers IMO.

@Noxime

Noxime commented Aug 4, 2017

@erwanmaigret

I don't think adding a scripting language to a model format is a good idea, especially JavaScript. What I would rather want to see (I haven't really read spec, so this might already be supported) is arbitrary data. Meaning you could embed anything you want to the file, possibly in base64 encoded binary.

@toji
Member

toji commented Aug 4, 2017

@Noxime: That already exists. Buffers and BufferViews in glTF are references to completely generic binary blobs which can contain anything you want in any format you want. If there's data that's unique to your use case you can easily create an object in a relevant extras property somewhere that points to a BufferView and provides whatever metadata is necessary to interpret it.
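
As a minimal illustration of that pattern, the Python sketch below stores an opaque payload in a buffer and points at it from a root-level `extras` object. The `myAppData` and `format` keys are made up for the example; `extras` content is application-defined, so any names would do.

```python
import base64

payload = b"application-specific bytes"

gltf = {
    "asset": {"version": "2.0"},
    "buffers": [
        {
            "byteLength": len(payload),
            "uri": "data:application/octet-stream;base64,"
            + base64.b64encode(payload).decode("ascii"),
        }
    ],
    "bufferViews": [{"buffer": 0, "byteOffset": 0, "byteLength": len(payload)}],
    # Hypothetical application-defined keys; only the host app interprets them.
    "extras": {"myAppData": {"bufferView": 0, "format": "my-custom-format-v1"}},
}


def read_custom_blob(doc: dict, key: str = "myAppData") -> bytes:
    """Resolve the bufferView referenced from extras and return its bytes."""
    ref = doc["extras"][key]
    view = doc["bufferViews"][ref["bufferView"]]
    uri = doc["buffers"][view["buffer"]]["uri"]
    data = base64.b64decode(uri.split(",", 1)[1])  # embedded data URI only
    start = view.get("byteOffset", 0)
    return data[start : start + view["byteLength"]]
```

Loaders that do not know about `myAppData` simply ignore it, which is the behavior described above.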

@c30ra

c30ra commented Aug 4, 2017

@Noxime

Both javascript and arbitrary data are a bad idea imo, because this way you are creating a vector for possible viruses/exploits, especially in poorly coded libraries. With scripts you basically allow execution of arbitrary code, and arbitrary data may enable exploits through particular strings (something like what happened to WhatsApp some time ago).

For conformance:
Also, I don't much like the idea of "extensions": having an extension means that an implementer may or may not support it. This means that one piece of software may implement it and another may not. If someone wants to pass data from the first to the second, information is effectively lost. A non-extension part of the spec, by contrast, must be implemented to be compliant with that version. There is also the problem of bloating the format with rarely used or exotic stuff.

@donmccurdy
Contributor

donmccurdy commented Aug 4, 2017

More information on glTF's extension system, and why we have it, here. Extensions might specify extra data, intended to be used in a particular way by the runtime, but will not include scripts implementing that behavior.

Implementation of an extension's behavior may be handled by extending the loader for a specific engine, rather than as part of the glTF format itself. For example, we (three.js) are hoping to allow users to write plugins for THREE.GLTF2Loader without having to modify three.js directly (mrdoob/three.js#11682), so that you can experiment on extensions as needed.

@donmccurdy
Contributor

Also, I don't much like the idea of "extensions": having an extension means that an implementer may or may not support it.

The intention here is, features that can and should be supported by all implementers will be part of the core specification, not extensions. As such, content authors who want their models to run everywhere should stick to the core spec.

However, we don't want to limit glTF so much that it only supports features possible on low-end mobile devices. For more expensive materials (like PBR specular-glossiness, or SSS) or WebGL 2.0 features, we hope to use extensions to enable experimentation, and then these features can be brought into the core spec if/when the time is right.

@erwanmaigret

erwanmaigret commented Aug 4, 2017

I totally agree bringing some language in the game is far from ideal. But then I don't see any other options unless we start to get into predefined constraint+deformer notions.

Both could work in the end. I am afraid that if we don't do either, then the format will be limited to embedding only skinning/shapes/animation forever and side solutions will start to be created.

Let's just consider a very simple aiming constraint, which is very commonly used today in about any rig with a character. How could we make it so it's part of the data part?

Or maybe this just does not fit into the low-level gltf format but then we should consider specing out an official solution as a standard gltf behavior manager?

@stevenvergenz

I disagree that glTF needs an embedded behavior system. At the end of the day, glTF is specifically a data transmission format. It's the jpeg of 3d, not the javascript. I think it would be useful to be able to serialize and transmit "entities" instead of "models", but trying to shoehorn that into glTF is a mistake. That deserves its own spec IMO.

@toji
Member

toji commented Aug 4, 2017

@erwanmaigret: Things like aiming constraints sound highly use-case (and probably engine) specific. That's what the extras attributes are for. Certainly there's nothing preventing you from putting a full scripting language into an extras attribute if you wish, and your content pipeline can be tuned to use that, but everyone else loading the file will only use the parts of it that are standardized. That sounds problematic, but in reality there's very little chance that something as specific as an aiming constraint will map cleanly between environments anyway.

@erwanmaigret

@toji Makes sense. I'll write this as a rigging/behavior engine riding on top of gltf, then, in a separate repo.

@pjcozzi
Member Author

pjcozzi commented Aug 5, 2017

(Minor summary update; italics indicates updates from #1051 (comment)).


@steveghee

steveghee commented Aug 5, 2017

Personally, I'd like to see focus on complexity management e.g. supporting levels-of-detail (and streaming), being able to support structured breakdown (json file referencing other json file(s) - #37 ) and finishing up the alternate material support e.g. a simplified blinn/phong model.

@alteous

alteous commented Aug 5, 2017

For the sake of simplicity and ease of adoption, I hope that the scene hierarchy remains a strict tree. I say this because the nodes and hierarchy section of the specification mentions that this restriction may be lifted in the future.

@steveghee

I would assume that if this is going to be truly platform neutral, there is going to be a cleanup of some of the GL-specific enums e.g. componentType : 5126 (GL_FLOAT).
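
For readers unfamiliar with those enums, this Python sketch lists the `componentType` codes used by glTF 2.0 accessors alongside readable names and byte sizes. The class and function names are illustrative; the numeric values are the GL constants the format reuses.

```python
from enum import IntEnum


class ComponentType(IntEnum):
    """Accessor componentType codes in glTF 2.0 (values are GL constants)."""

    BYTE = 5120
    UNSIGNED_BYTE = 5121
    SHORT = 5122
    UNSIGNED_SHORT = 5123
    UNSIGNED_INT = 5125
    FLOAT = 5126


COMPONENT_BYTES = {
    ComponentType.BYTE: 1,
    ComponentType.UNSIGNED_BYTE: 1,
    ComponentType.SHORT: 2,
    ComponentType.UNSIGNED_SHORT: 2,
    ComponentType.UNSIGNED_INT: 4,
    ComponentType.FLOAT: 4,
}

# Number of components per accessor "type" string.
TYPE_COMPONENTS = {
    "SCALAR": 1, "VEC2": 2, "VEC3": 3, "VEC4": 4,
    "MAT2": 4, "MAT3": 9, "MAT4": 16,
}


def element_size(component_type: int, accessor_type: str) -> int:
    """Unpadded byte size of one accessor element."""
    return COMPONENT_BYTES[ComponentType(component_type)] * TYPE_COMPONENTS[accessor_type]
```

A loader can keep the raw integers on disk and expose only the named enum to the rest of the code, which hides the GL heritage without changing the file format.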

@JonathanDotCel

@JonathanDotCel aside from the GUID issue, I believe both Unity (if they're hearing) and Godot should implement two ways of importing a glTF model:

  • as it is currently imported now, that is, take all the bits of the glTF and import them into internal nodes or whatever, so developers can further modify it, reference parts, whatever.
  • As a black box object - no changes allowed - render it exactly as it comes and as seen in any other viewer. from the point of view of the game editor it would be represented as a single game node even if it has an internal skeleton. In other words, honour that glTF is like a jpg.

The latter would probably be seen as more limiting, but on the other hand, artists and developers would have guarantees that WYSIWYG fully applies.

I think that's actually reasonably well covered by glTFast, which is maintained by some of the Unity peeps.
To quote Aras: "is made by people from Unity. The lack of built-in support for glTF inside Unity has nothing to do with Autodesk, and everything to do with <insert lots of complex and often silly/misguided reasons>."
It'll generally do a 1:1 import, with a PBR setup though.

In practice I don't think I've ever seen a developer roll out a game with a basic PBR setup vs more tailored shaders, but I believe it is an option.

@aaronfranke
Contributor

aaronfranke commented Oct 4, 2023

Can we please move this discussion to a separate issue? Over half of the height of this page is now discussion of UIDs. Starting over in a new issue would also allow us to start from scratch without any of the misunderstandings, confusions, accusations, heated discussion, missing clarifications, etc.

The case for UIDs should be made from a clean slate, without replying to specific users or making quotes.

@emackey
Member

emackey commented Oct 4, 2023

Unfortunately GitHub doesn't offer moderation tools for migrating comments. @JonathanDotCel Would you mind opening a new issue in this repo, and briefly summarizing where the GUID/UID issue now stands from your own point of view? Further conversation can proceed from there.

As for this old roadmap issue from 2017, I propose we close it. Anyone with future roadmap suggestions is kindly invited to open a new issue per roadmap item separately.

@javagl
Contributor

javagl commented Oct 4, 2023

@JonathanDotCel In order to not further dissect this here: When you open a new issue, then this could probably include a summary of the important points (that caused the confusion initially), including the justifications that are derived from the workflows in #1051 (comment) (I have skimmed over that comment, but have to read it more thoroughly). But there has been a lot of text, so in doubt, some of that could be handled with links to the respective comments here.

And... I'm also in favor of closing this issue: glTF is now so mature and stable that the spirit of this issue (as in: "What should be part of the next version of glTF?") is overtaken by the question about which extensions should be added. And Extension Ideas should first be discussed in issues until they gain enough clarity, momentum and agreement to become Extension Proposals, which would then be pull requests.

(This process, and how to organize extensions, are parts of an ongoing discussion, though...)

EDIT: I'm tempted to leave the honor of closing this to @pjcozzi 😄

@aaronfranke
Contributor

Before we close this, I'll just mention one more point. Within Khronos, there is already precedent for how to refer to nodes in a glTF scene: using their name. This is what the glXF format does with "export name": https://github.com/KhronosGroup/glXF/tree/main/specification/2.0#importing-assets This implies that changing the name of a node later is not expected to continue working with the same glXF file (and, similarly, with a game engine's imported scene).
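
A name-based lookup of that kind might look like the following Python sketch. It is not taken from glXF or any engine; the helper names are hypothetical. Since glTF does not require node names to be unique, the resolver below refuses ambiguous or missing names rather than guessing.

```python
def find_nodes_by_name(gltf: dict, name: str) -> list[int]:
    """Indices of all nodes whose optional 'name' equals the given string."""
    return [
        index
        for index, node in enumerate(gltf.get("nodes", []))
        if node.get("name") == name
    ]


def resolve_unique(gltf: dict, name: str) -> int:
    """Return the single matching node index, or raise if absent or ambiguous."""
    matches = find_nodes_by_name(gltf, name)
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one node named {name!r}, found {len(matches)}"
        )
    return matches[0]


scene = {"nodes": [{"name": "Wheel"}, {"name": "Body"}, {"name": "Wheel"}, {}]}
```

Here `resolve_unique(scene, "Body")` returns 1, while `"Wheel"` raises because two nodes share that name, which is the fragility a rename would also trigger.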

@jimver04

jimver04 commented Dec 8, 2023

Hi,

The 4-byte total length (uint32) and the 4-byte chunk1 binary length (uint32) limit the model size to 4 GB (FFFF FFFF, binary plus JSON). In CAD/CAE engineering, models can easily reach 10 GB. It would be better to have 8-byte (uint64) header fields, which have practically no limit: FFFF FFFF FFFF FFFF is roughly 18 exabytes. Best, Dimitrios Ververidis.
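
The fields in question can be seen in a small Python sketch that writes and reads a GLB container. `build_glb` and `read_header` are illustrative helpers, not part of any library; the magic numbers and the three little-endian uint32 header fields follow the GLB layout.

```python
import struct

GLB_MAGIC = 0x46546C67   # ASCII "glTF" read as a little-endian uint32
JSON_CHUNK = 0x4E4F534A  # ASCII "JSON"
BIN_CHUNK = 0x004E4942   # ASCII "BIN" followed by a zero byte
MAX_UINT32 = 2**32 - 1


def build_glb(json_bytes: bytes, bin_bytes: bytes = b"") -> bytes:
    """Assemble a GLB: 12-byte header, JSON chunk, optional BIN chunk."""
    json_bytes += b" " * (-len(json_bytes) % 4)    # pad JSON with spaces
    bin_bytes += b"\x00" * (-len(bin_bytes) % 4)   # pad binary with zeros
    chunks = struct.pack("<II", len(json_bytes), JSON_CHUNK) + json_bytes
    if bin_bytes:
        chunks += struct.pack("<II", len(bin_bytes), BIN_CHUNK) + bin_bytes
    total = 12 + len(chunks)
    if total > MAX_UINT32:
        # This is the limit discussed above: the length field is a uint32.
        raise ValueError("GLB length does not fit in the uint32 header field")
    return struct.pack("<III", GLB_MAGIC, 2, total) + chunks


def read_header(glb: bytes) -> tuple[int, int]:
    """Return (version, total length) from the 12-byte GLB header."""
    magic, version, length = struct.unpack_from("<III", glb, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file")
    return version, length


glb = build_glb(b'{"asset":{"version":"2.0"}}', b"\x01\x02\x03")
```

Both the total length and each chunk length are packed with the `I` format, which is where the 4 GB ceiling comes from.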

@javagl
Contributor

javagl commented Dec 8, 2023

@jimver04 This is on the radar via #2114 , but it will be difficult to resolve this in a backward-compatible way. (Of course, there could be the usual quirky workarounds, like "When the length is 0, then this means that it is more than 4GB, and the actual length has to be read from ... [some 8 bytes in the binary buffer]", but no decision has been made here yet)

(EDIT: Also mentioned earlier in this thread at #1051 (comment) , but ... helpfully hidden by default in the GitHub web interface ...)

@jimver04

jimver04 commented Dec 20, 2023

@javagl for the 4GB issue and also for #2114 . A solution to solve the 4GB limit is as follows:

  • Total length can steal 3 bytes from version number because they are blank (little endian "02 00 00 00") so it can go up to 7 bytes = 2^56 .
  • JSON can remain as it is: 2^32 = 4GBs. It is enough for a JSON.
  • Binary can steal one byte from BIN : (BIN\0), so it can go up to 5 bytes = 2^40, i.e 1TB

and thus practically we have an unlimited total length with a 1TB limit for Binary and a 4GB limit for json.
Best,
Dimitrios

@javagl
Contributor

javagl commented Dec 20, 2023

@jimver04 This is the "roadmap" issue. Specific discussion about the 2GB/4GB limit should better happen in #2114 . Your suggestion would fall into the category of "quirky workarounds" for me, but "even less" compatible than some alternatives. You know that implementations will do

int32_t version = read(bytes 4..7);
if (version != 2) throw "Yikes!";

and spoiling the 'version' field with additional data would break all existing implementations...
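
The compatibility problem can be reproduced with a short Python sketch. The byte layout used for the proposal (version in byte 4, upper length bits in bytes 5..7) is my assumption about what was meant; the reader mirrors the check quoted above.

```python
import struct


def strict_reader_version(header: bytes) -> int:
    """Read bytes 4..7 as one little-endian uint32, as existing loaders do."""
    (version,) = struct.unpack_from("<I", header, 4)
    return version


# A standard 12-byte GLB header: magic, version 2, total length 1024.
standard = struct.pack("<4sII", b"glTF", 2, 1024)

# Assumed layout for the proposal: byte 4 keeps the version, bytes 5..7 carry
# the upper 24 bits of a 56-bit total length, bytes 8..11 the lower 32 bits.
big_length = 10 * 2**30  # 10 GiB, which does not fit in 32 bits
high24 = big_length >> 32
low32 = big_length & 0xFFFFFFFF
proposed = b"glTF" + bytes([2]) + high24.to_bytes(3, "little") + struct.pack("<I", low32)

standard_version = strict_reader_version(standard)
proposed_version = strict_reader_version(proposed)
```

With this layout a 10 GiB file reports version 514 instead of 2 to an unmodified reader, so it would be rejected before the length is even looked at.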

@jimver04

Apart from the size > 4GB, I would like to see also

  • Selective changes support: Why to save all the model again if only one node has changed?
  • GUIDs and Metadata per scene node / accessor / bufferview (who changed and when). See also Speckle Systems format: (https://speckle.systems/) for incremental way of adding content to an existing file and collaborative editing.
  • Save Vertices in int8, int16 or Double as well.
  • Strict limitations about perfect divisibility by 4 should be avoided by vendors for vertex buffers, as RGB color can be saved with 3 bytes, which shifts the total buffer to multiples of 3. Perhaps a template guide should be provided for buffer checks that all implementations support.
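
For context on the last bullet: glTF 2.0 asks for vertex attribute elements to start on 4-byte boundaries, so 3-byte RGB colors are usually written with one padding byte per element. A minimal Python sketch of that padding (helper names are mine):

```python
def padded_stride(element_size: int) -> int:
    """Round an element size up to the next multiple of 4 bytes."""
    return (element_size + 3) // 4 * 4


def pack_rgb_u8(colors):
    """Pack (r, g, b) byte triples with padding so each starts 4-byte aligned."""
    stride = padded_stride(3)
    out = bytearray()
    for r, g, b in colors:
        out += bytes((r, g, b)) + b"\x00" * (stride - 3)
    return bytes(out), stride


data, stride = pack_rgb_u8([(255, 0, 0), (0, 255, 0)])
```

The returned `stride` would be written as the bufferView's `byteStride`; the cost is one wasted byte per color rather than a misaligned buffer.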

@jimver04

jimver04 commented Apr 3, 2024

Hi, face colors are important in visualizing CAE Simulation results. Stress and Strain estimation algorithms provide output in the form of face colors. GLTF 2.0 supports only vertex colors. They can be used instead, but not efficiently. A vertex may belong to two triangles that have different colors. If we average the values of the two triangles and assign the result to the vertex, then the colors are blurred as if by a low-pass filter. Are face colors considered in GLTF 3.0 already?
Best,
Dimitrios
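
The usual workaround with vertex colors only is to give every triangle its own three vertices, so no vertex is shared between differently colored faces. A rough Python sketch of that expansion (the function name is hypothetical) shows where the size cost comes from:

```python
def expand_face_colors(positions, indices, face_colors):
    """Duplicate vertices per triangle so each face can carry a flat color.

    positions:   list of (x, y, z) tuples
    indices:     flat list of vertex indices, three per triangle
    face_colors: one (r, g, b) tuple per triangle
    Returns non-indexed position and color lists of equal length.
    """
    out_positions, out_colors = [], []
    for face, color in enumerate(face_colors):
        for corner in indices[3 * face : 3 * face + 3]:
            out_positions.append(positions[corner])
            out_colors.append(color)  # same color on all three corners
    return out_positions, out_colors


# Two triangles sharing an edge: 4 shared vertices become 6 unshared ones.
quad_positions = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
quad_indices = [0, 1, 2, 0, 2, 3]
flat_positions, flat_colors = expand_face_colors(
    quad_positions, quad_indices, [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
)
```

The output would be written as `POSITION` and `COLOR_0` attributes without an index buffer, which avoids the blurring at the price of storing up to three vertices per triangle.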

@aaronfranke
Contributor

@jimver04 That could be done in an extension.

@jimver04

jimver04 commented Apr 5, 2024

Hi again,
objects in industry might be malformed under various stress circumstances, therefore the initial set of indices can not be used throughout the animation sequence. It would be nice if gltf 3.0 supported varying indices as an extra animation channel so that not only vertices but also indices could change throughout the animation.
Best,
Dimitrios

@jimver04

In gltf 3.0, can a single vertex have multiple UV coordinates depending on how many faces it belongs to?

For the time being, if a vertex has two UV coordinates, then in order to export this mesh to gltf format, e.g. from Blender 3D, this vertex has to be duplicated or tripled, etc., depending on how many faces it belongs to. This results in very big files because there is a lot of vertex duplication.
See my issue here: microsoft/glTF-SDK#138

and related issues at stackoverflow:
https://gamedev.stackexchange.com/questions/140132/how-can-i-use-blender-style-uvs-and-not-per-vertex-in-opengl
https://stackoverflow.com/questions/61521259/vertex-normal-texture-uv-coordinate

@vpenades
Contributor

vertex duplication

@jimver04 although the glTF file format is used for geometry exchange more often than not, glTF is actually a file format designed for rendering performance, and as of today, duplicating the vertices in these cases is way faster than having complex relational structures. Actually, any 3D file format that stores geometry as relational structures ends up duplicating the vertices under the hood for rendering.

Instead of asking for gltf-3 to support vertex-UV dissociation (which is unlikely to happen) maybe it could be worth to push khronos into defining a completely new file format specifically aimed for data exchange.

Khronos already maintains Collada format, but due to its complexity it was not adopted as widely as glTF. In my opinion, there's a need for a new format in between glTF and collada, designed for easy data exchange.
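
The vertex duplication discussed in this exchange can be sketched in Python as follows. This is a generic illustration, not any exporter's code: authoring tools often keep separate index lists for positions and UVs, while a render-ready mesh needs one index per corner, so each unique (position, UV) pair becomes its own vertex.

```python
def split_corners(position_indices, uv_indices):
    """Merge separate position/UV index lists into one vertex list.

    Returns (vertices, indices), where vertices is a list of unique
    (position_index, uv_index) pairs and indices refers into that list.
    """
    remap = {}
    vertices = []
    indices = []
    for key in zip(position_indices, uv_indices):
        if key not in remap:
            remap[key] = len(vertices)  # first time this pair is seen
            vertices.append(key)
        indices.append(remap[key])
    return vertices, indices


# Two triangles sharing positions 1 and 2; position 2 lies on a UV seam and
# uses UV 2 in the first triangle but UV 4 in the second.
vertices, indices = split_corners([0, 1, 2, 2, 1, 3], [0, 1, 2, 4, 1, 3])
```

Only the seam vertex is duplicated here (4 positions become 5 vertices); vertices whose position and UV agree across faces stay shared.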

@aaronfranke
Contributor

@vpenades I disagree, glTF is already a good format for data exchange if we add extensions that facilitate exchanging more types of data. Such extensions could be intended only to be used between content creation tools. I don't see a reason to repeat all the work of glTF in another format. Besides, keeping the base glTF stuff available is still highly useful: even for content designed to exchange data between content creation tools, it is useful to allow regular trimeshes.

@GreatLeaderTechnus

I would eventually like to see STL File Support and Slicing Ability for Webpages.

@martinmolin

I would love to see support for matrix animations and not just individual SRT components. As it is now I believe that some existing models that do use matrices for animation cannot be properly converted to glTF because not all matrices can be decomposed.

@emperorofmars

emperorofmars commented Aug 2, 2024

I need a file format for 3d assets, which is extensible and interoperable, and most importantly, can store at least the core properties that make up a 3d model. Everybody seems to be convinced glTF 2.0 is the solution.

Is glTF 2.0 intended to be used as part of a serious game development asset pipeline and for sharing such assets?

I am asking this genuinely. Originally, glTF's tagline was 'The JPEG of 3d'. For a format to store 3d assets, something which qualifies more as an 'open PSD of 3d' would be needed.

Everybody seems to be convinced that glTF 2.0 is qualified for that.

Yet, I was not able to do even the most basic of things with glTF 2.0. I specifically tried to work with its Blender, Godot 4 and UnityGLTF & GLTFast implementations, so my judgement is derived from working with these.

Some issues with that I would like to highlight:

  • The animation system doesn't support animation curves, only an interpolation type parameter.
  • Animations can't target anything other than node transforms and morphtarget values per mesh resource (not mesh instance).
  • Morphtarget values and material references sit on the mesh resource. You can instance the mesh multiple times, but if you need to have different materials or target values on these instances, you have to create multiple mesh resources. Some implementations may deduplicate the buffer views, but since this is not in the spec, some glTF implementations may just load the same mesh resource twice.
  • Materials are specified according to a gltf-shader. No game or application will use that for a real product. Shaders are arbitrary, and so are their properties. Additional properties get slowly added as extensions that are likely not supported in your glTF implementation anyway. What if I want to add an 'Audiolink Baseband Emission' texture?
  • For some reason, Morphtarget names are not present in the spec. If glTF is really intended for video game development, then that's a mistake. Applications put morphtarget names in the extras field of either the mesh object, or the first primitive of the mesh. For a supposedly well standardized format, this is not ok.
  • The buffer system is convoluted. Until late 2023, Blender would produce comically large files (over 100MB instead of 5MB in my case), because it didn't implement sparse accessors. Why couldn't morphtargets be stored as just simple buffers, which can be indexed? That's simple; the construct of Buffers, Buffer Views, Accessors, Sparse Accessors is not. I don't see a reason for any of this complexity.
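
On the morph target naming point in the list above, a tolerant reader for the `extras` placement described there could look like this Python sketch. The `targetNames` key is the convention I am assuming tools use; the fallback names are made up for the example.

```python
def morph_target_names(mesh: dict) -> list[str]:
    """Look for morph target names on the mesh, then on its first primitive.

    Falls back to generated names when neither location provides them.
    """
    primitives = mesh.get("primitives", [])
    for holder in [mesh] + primitives[:1]:
        extras = holder.get("extras")
        if isinstance(extras, dict) and isinstance(extras.get("targetNames"), list):
            return list(extras["targetNames"])
    target_count = len(primitives[0].get("targets", [])) if primitives else 0
    return [f"target_{i}" for i in range(target_count)]


named_on_mesh = {
    "primitives": [{"targets": [{}, {}]}],
    "extras": {"targetNames": ["smile", "blink"]},
}
named_on_primitive = {
    "primitives": [{"targets": [{}], "extras": {"targetNames": ["jawOpen"]}}],
}
unnamed = {"primitives": [{"targets": [{}, {}, {}]}]}
```

Checking both locations is exactly the kind of per-tool guesswork the bullet objects to, which is why a single agreed-upon place for the names would help.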

And last but not least, Extensibility

This I view as glTF's biggest flaw by far.

At first glance, glTF 2.0 looks supremely extensible on paper.

In practice, implementing extensions is either not supported at all (UnityGLTF, GLTFast), or if it is, it is either undocumented (Blender) or at least partially broken (Godot 4).

Actually trying to implement a custom extension (for example one for social-VR avatars), requires in some cases having to fork the entire glTF library, hard-coding your extension into it, and getting your users to use your fork.
If one extension is not supported, it will just be thrown away upon import and lost. What if I need to add a Unity specific extension and a Godot specific extension to the same file? In the case of VR avatars, which optimally would support many target applications across many game engines, this is a necessity.

Just implementing a new extension across all relevant glTF implementations is practically impossible, and nobody does it either. Even official extensions like KHR_animation_pointer, which would at least half fix the animation system, are implemented next to nowhere. An extension existing only as a JSON schema file in the Khronos GitHub account is not very useful.
Sometimes people send me large lists of extensions which they think would solve all of these issues, yet next to none of them exist in any relevant glTF implementation I would need to use.

glTF 2.0 was released over 7 years ago. This is its current state.

Conclusion

At least a JPEG is smaller than the PNG image. glTF files tend to be far larger than the original project file, for example due to the need to bake animations, while still losing a lot of information.

I am not convinced glTF 2.0 is intended for 3d Assets in a video game context.

I would like for a format to exist which is.


In order for video game development with open-source tools to become viable for anything other than limited-in-fidelity indie games, an ecosystem and possibilities for proper asset pipelines are needed.
Currently, I see only limited point to point compatibilities. Godot had to implement a custom exporter for Blender into their own format, as an example of that. The most universally compatible format currently is fbx.

I think a proper open, interoperable & extensible 3d format could really help in making open source tools viable for many and improve the industry as a whole.

@javagl
Contributor

javagl commented Aug 2, 2024

There's a lot to unpack. And I'm not sure whether this issue is the right place to do this. This issue was opened 7 years ago, right about the incarnation of glTF 2.0. At that time, it was rather intended for discussion about the ecosystem, and maybe extensions that should be considered to be implemented (or considered for a core specification of a hypothetical glTF 2.1 or even 3.0).

A few points:

Mesh instances are not explicitly modeled by the spec. In order to have the same mesh multiple times in a scene without duplicating the mesh data (e.g. instancing it), implementations have to deduplicate buffer views. This is not in the spec, so some glTF implementations may just load the same mesh resource twice.

Mesh instances are explicit insofar that one mesh instance can be attached to 10 nodes, and the same mesh instance will then be rendered 10 times. For the (narrower) case of (static) GPU-based instancing, there is EXT_mesh_gpu_instancing, which is pretty widely supported.
People occasionally criticized aspects of the mesh/meshPrimitive structure. For example, when the same geometry should be used with 5 different materials, then there will have to be 5 mesh primitives (because that's what the material is associated with). But a mesh primitive is "lightweight". It does not carry any data. That's, in fact, the reasoning behind the accessor/bufferView/buffer structure (which, conversely, has also been criticized...).
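To make the two kinds of re-use concrete, here is a small Python sketch. The document below is a made-up fragment (node names, indices and the two helper functions are assumptions for illustration, and the accessors themselves are omitted), not a complete valid asset. It shows one mesh attached to two nodes, and a second mesh whose primitive points at the same POSITION accessor with a different material.

```python
# Hypothetical glTF 2.0 fragment: accessors/bufferViews/buffers are left out.
gltf = {
    "asset": {"version": "2.0"},
    "scenes": [{"nodes": [0, 1, 2]}],
    "nodes": [
        {"name": "crate_a", "mesh": 0, "translation": [0.0, 0.0, 0.0]},
        {"name": "crate_b", "mesh": 0, "translation": [2.0, 0.0, 0.0]},
        {"name": "crate_red", "mesh": 1, "translation": [4.0, 0.0, 0.0]},
    ],
    "meshes": [
        # Both primitives reference POSITION accessor 0; only the material differs.
        {"primitives": [{"attributes": {"POSITION": 0}, "material": 0}]},
        {"primitives": [{"attributes": {"POSITION": 0}, "material": 1}]},
    ],
}


def mesh_use_counts(doc):
    """Count how many nodes reference each mesh index."""
    counts = {}
    for node in doc.get("nodes", []):
        if "mesh" in node:
            counts[node["mesh"]] = counts.get(node["mesh"], 0) + 1
    return counts


def position_accessor_use_counts(doc):
    """Count how many mesh primitives reference each POSITION accessor."""
    counts = {}
    for mesh in doc.get("meshes", []):
        for primitive in mesh.get("primitives", []):
            index = primitive["attributes"].get("POSITION")
            if index is not None:
                counts[index] = counts.get(index, 0) + 1
    return counts


print(mesh_use_counts(gltf))
print(position_accessor_use_counts(gltf))
```

The primitives only hold indices, which is why adding another material variant costs a few bytes of JSON rather than a second copy of the vertex data.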

Materials are specified according to a gltf-shader.

The representation of PBR materials within glTF was intentionally chosen as some sort of "smallest common denominator" of what basically all rendering engines supported anyhow. (Sometimes as spec-gloss, but that's a detail for now).
And it may be important to point out: The representation is intentionally and explicitly not related to shaders. (I mean, glTF 1.0 did contain actual GLSL shader code - but that was abandoned in glTF 2.0). The metal-roughness PBR model is an attempt to describe the physical material properties that affect the appearance. A reasonable way to describe a "red, slightly bumpy, slightly shiny, slightly reflective" material is to store that it's red, slightly bumpy, slightly shiny, and slightly reflective. What the rendering engine does with that is ... up to the implementor of the engine.
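As a rough illustration, such a material could be written down like this. The sketch is in Python, which just builds the JSON object; the helper name, the material name, the concrete numbers and the texture index are invented for the example. Only the property names (pbrMetallicRoughness, baseColorFactor, metallicFactor, roughnessFactor, normalTexture) come from glTF 2.0.

```python
import json


def describe_material(name, base_color, metallic, roughness, normal_texture=None):
    """Build a glTF 2.0 material object from plain physical properties."""
    material = {
        "name": name,
        "pbrMetallicRoughness": {
            "baseColorFactor": list(base_color),  # RGBA, linear
            "metallicFactor": metallic,           # 0.0 = dielectric, 1.0 = metal
            "roughnessFactor": roughness,         # 0.0 = mirror-like, 1.0 = fully rough
        },
    }
    if normal_texture is not None:
        # "Slightly bumpy": a normal map with a reduced scale (values assumed).
        material["normalTexture"] = {"index": normal_texture, "scale": 0.5}
    return material


red_material = describe_material(
    "red_slightly_shiny",            # hypothetical name
    base_color=(0.8, 0.1, 0.1, 1.0),
    metallic=0.1,
    roughness=0.6,
    normal_texture=0,                # assumes texture 0 is a normal map
)
print(json.dumps(red_material, indent=2))
```

Nothing in there says how to shade it. That part is left to the engine.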

The buffer system is convoluted. [...] Why could not morphtargets be stored as just simple buffers, which can be indexed. That's simple, the construct of Buffers, Buffer Views, Accessors, Sparse Accessors is not. I don't see a reason for any of this complexity.

(See above...). I agree that it looks convoluted at the first glance. And one could argue ... defensively ... here: Whatever structure we could have come up with, someone would suggest that a certain other structure would have been better - so we have at least some flexibility.
Don't get me wrong: Implementing the accessor/bufferView/buffer structure properly is kind of a burden, and it can be very tricky to get it right. For creators (glTF writer libraries) it raises a bunch of engineering questions. For example: Should each accessor have its own buffer view? Should each buffer view have its own buffer? What about interleaved data? And the alignment... oh dear, getting the alignment right.
The current structure was heavily inspired by how OpenGL/WebGL manages its data. Considering that glTF was intended to be rendered on the client side without having to do too much decoding and re-shuffling of the data, this was a reasonable choice: Upload it to the GPU. Done.
Right now, the structure offers flexibility, which inevitably includes the possibility to do things wrong. Using "the right representation" may depend on the use case. There's a difference between sending a CAD model as glTF to an engineer, or rendering some low-poly-in-game-asset. I've seen CAD-like glTF assets with 200MB and 150000 accessors. After optimizing them, they contained a few hundred accessors and ~15MB. Both are valid glTF files. It's up to the users and tools to not mess things up.
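For the alignment part, one common writer-side strategy is to pad every buffer view to a 4-byte boundary before appending the next one. This is a sketch of that strategy in Python under that assumption, not the only valid layout; the function names are made up.

```python
import struct


def align(offset, alignment=4):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment


def pack_buffer(chunks):
    """Concatenate byte chunks into one buffer, one bufferView per chunk.

    Each view starts on a 4-byte boundary; the gaps are filled with zeros.
    """
    blob = bytearray()
    views = []
    for data in chunks:
        start = align(len(blob))
        blob.extend(b"\x00" * (start - len(blob)))  # padding bytes
        blob.extend(data)
        views.append({"buffer": 0, "byteOffset": start, "byteLength": len(data)})
    return bytes(blob), views


# One triangle: 3 unsigned-short indices (6 bytes), 3 float32 positions (36 bytes).
indices = struct.pack("<3H", 0, 1, 2)
positions = struct.pack("<9f", 0, 0, 0, 1, 0, 0, 0, 1, 0)

blob, views = pack_buffer([indices, positions])
print(views)
print(len(blob))
```

Without the padding, the float data would start at byte 6, which is not a multiple of the 4-byte component size.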

And last but not least, Extensibility

First, a short remark:

Even official extensions like KHR_animation_pointer, which would at least half fix the animation system, is implemented next to nowhere.

This extension is relatively new, relatively complex (it required a clean, formal definition of the https://github.com/KhronosGroup/glTF/blob/main/specification/2.0/ObjectModel.adoc to begin with), and the work of implementing and supporting it is currently in progress.

If one extension is not supported, it will just be thrown away upon import and lost.
...
Just implementing a new extension across all relevant glTF implementations is practically impossible, and nobody does it either.

This is one aspect that the working group is aware of. It's hard to say which extensions are supported where. Keeping track of the ecosystem - and making it easier to track this! - is also an ongoing task. This is one aspect, on an "organizational" level. The other one is what supporting an extension means for implementors:

Actually trying to implement a custom extension (for example one for social-VR avatars), requires in some cases having to fork the entire glTF library, hard-coding your extension into it, and getting your users to use your fork.

That's right. Ideally, it should be the goal of the implementor of the library to anticipate extensions, and provide a mechanism for implementing them. In some libraries, like glTF-Transform, there are places to "hook in" to add support for extensions that had previously been unsupported. But in general, that's faaar easier said than done, because an extension may virtually affect any aspect of a glTF file.
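To sketch what such a "hook" could look like in the simplest case: the importer below is entirely hypothetical (it is not the API of glTF-Transform or any other library), and EXAMPLE_avatar_info is an invented extension name. It only covers extensions that attach data to a node, which is exactly the easy case.

```python
class NodeImporter:
    """Toy importer with a registry of per-extension handlers (hypothetical API)."""

    def __init__(self):
        self._handlers = {}

    def register_extension(self, name, handler):
        """Register a callable(result, extension_data) for one extension name."""
        self._handlers[name] = handler

    def import_node(self, node_json):
        result = {"name": node_json.get("name"), "data": {}, "unhandled": {}}
        for ext_name, ext_data in node_json.get("extensions", {}).items():
            handler = self._handlers.get(ext_name)
            if handler is not None:
                handler(result, ext_data)
            else:
                # Keep unknown extensions around instead of dropping them.
                result["unhandled"][ext_name] = ext_data
        return result


def handle_avatar_info(result, ext_data):
    # Hypothetical extension payload: {"eyeHeight": <float>}
    result["data"]["eye_height"] = ext_data["eyeHeight"]


importer = NodeImporter()
importer.register_extension("EXAMPLE_avatar_info", handle_avatar_info)

node = importer.import_node({
    "name": "Avatar",
    "extensions": {
        "EXAMPLE_avatar_info": {"eyeHeight": 1.6},
        "EXAMPLE_other": {"foo": 1},
    },
})
print(node)
```

An extension that changes how buffers are decoded or how materials are interpreted would need hooks in many more places than this.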

@aaronfranke
Contributor

aaronfranke commented Aug 2, 2024

Is glTF 2.0 intended to be used as part of a serious game development asset pipeline and for sharing such assets?

See also the discussion in #2337 about unique IDs, which includes discussion of what stage of the pipeline glTF is intended to be in. People who want UIDs are trying to use glTF as a base asset which is further built up in a game engine, while people who don't want UIDs want glTF to be the final deliverable only.

Animations can't target anything other than node transforms and morphtarget values per mesh resource (not mesh instance).

This is now possible thanks to the glTF Object Model and the KHR_animation_pointer extension.
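Roughly, a channel whose target path is "pointer" carries a JSON pointer into the glTF document, and the importer resolves it to find the property to animate. Below is a minimal Python sketch of that lookup; the document fragment is invented for the example, resolve_pointer is a hypothetical helper rather than code from any importer, and the sampler data is omitted.

```python
def resolve_pointer(doc, pointer):
    """Follow a JSON pointer such as "/materials/0/..." through nested dicts/lists."""
    if not pointer.startswith("/"):
        raise ValueError("expected a JSON pointer starting with '/'")
    current = doc
    for token in pointer[1:].split("/"):
        token = token.replace("~1", "/").replace("~0", "~")  # JSON pointer escapes
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Invented fragment: one material and one animation channel targeting its color.
gltf = {
    "extensionsUsed": ["KHR_animation_pointer"],
    "materials": [
        {"pbrMetallicRoughness": {"baseColorFactor": [1.0, 0.0, 0.0, 1.0]}}
    ],
    "animations": [
        {
            "channels": [
                {
                    "sampler": 0,
                    "target": {
                        "path": "pointer",
                        "extensions": {
                            "KHR_animation_pointer": {
                                "pointer": "/materials/0/pbrMetallicRoughness/baseColorFactor"
                            }
                        },
                    },
                }
            ]
        }
    ],
}

target = gltf["animations"][0]["channels"][0]["target"]
pointer = target["extensions"]["KHR_animation_pointer"]["pointer"]
print(resolve_pointer(gltf, pointer))
```

A real implementation would also have to check that the pointer refers to a property the Object Model actually allows to be animated.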

In practice, implementing extensions is [...] at least partially broken (Godot 4).

Hi, I help maintain the Godot 4 glTF pipeline. Feel free to discuss the specifics with me on Discord: aaronfranke and/or join the Godot RocketChat which is where development discussion takes place.

I would like to make the pipeline as non-broken as possible, but I don't know specifically which things are broken for you.

What if I need to add a Unity specific extension and a Godot specific extension to the same file?

I don't understand, what is the problem with that? You should be able to do that just fine. However, ideally, we should try to design extensions to be portable to many implementations, so this is not the recommended approach. Why do you need to have Unity-specific extensions in the first place? That seems like the wrong approach to begin with.

Even official extensions like KHR_animation_pointer, which would at least half fix the animation system, is implemented next to nowhere.

It's implemented in Godot Engine here godotengine/godot#94165 and will be available in Godot 4.4 or later.

KHR_animation_pointer is very new, it is not surprising that it has not been widely adopted yet. You can help change this. If you want to see support in your implementation of choice a year from now, go implement it today. It takes time for things to get merged and arrive downstream in stable versions to use with your favorite game engine.

Aside from implementation, it would be good to get the sample assets merged. I had to hunt them down in order to get test files for my Godot implementation. Anyway, it should be easier to implement in other engines now that I made a Godot implementation, as you can quickly generate sample assets from Godot to test against your implementation, and test assets generated by your implementation against Godot.

Sometimes people send me large lists of extensions which they think would solve all of these issues, yet next to none of them exist in any relevant glTF implementation I would need to use.

We should try to implement the extensions in more places, then.

Currently, I see only limited point to point compatibilities. Godot had to implement a custom exporter for Blender into their own format, as an example of that.

The Godot ESCN format for Blender was created before Godot had support for glTF 2.0. It was meant to make up for the shortcomings of Collada, not the shortcomings of glTF.

Also, I see that you made your own format STF. I see that it supports multiple "assets", what glTF would call scenes. If you are going through the pain of making a new format, which I am skeptical of in the first place, I recommend at least ensuring that it avoids the mistakes of glTF, such as multiple scenes per file: #1542 (and related, multiple root nodes is also worth avoiding, see #2329 and godotengine/godot-proposals#6588, but it seems your format already does this).

@emackey
Member

emackey commented Aug 2, 2024

Even official extensions like KHR_animation_pointer, which would at least half fix the animation system, is implemented next to nowhere.

It's implemented in Godot Engine here godotengine/godot#94165 and will be available in Godot 4.4 or later.

Experimental authoring support has also shipped just a few days ago with Blender 4.2.0. This first draft is tricky to use, you have to push it into NLA tracks and enable experimental options, but it does work. For example: KhronosGroup/glTF-Sample-Assets#140

Also, earlier today, gltf-tools v2.5.0 shipped with new support for "Go To Definition" and "Peek Definition" for animation pointers, for easier debugging of files that use them.

@emperorofmars

Thanks for the many comprehensive replies!

@javagl

Mesh instances are explicit insofar that one mesh instance can be attached to 10 nodes, and the same mesh instance will then be rendered 10 times.

You are correct, and I fear I didn't finish my train of thought there. If you need different morphtarget values or materials per mesh instance, then you need to have separate mesh resources. I updated my original post, thanks for pointing my mistake out!

The current structure was heavily inspired by how OpenGL/WebGL manages its data. Considering that glTF was intended to be rendered on the client side without having to do too much decoding and re-shuffling of the data, this was a reasonable choice: Upload it to the GPU. Done.

I think this pretty much answers my overarching question!

glTF starts to make a little more sense with this in mind.
That means glTF is more of a competitor to a game engine's internal optimized mesh format, less of an alternative to fbx. Unfortunately, fbx remains my only choice then.

I dare to argue the requirements for such a distribution format and an interoperability format are at least partially mutually exclusive.

@aaronfranke

Feel free to discuss the specifics with me on Discord

No offense, but I won't. I long since have created issues for everything I found. 2 out of 4 have been fixed by now, so kudos to the Godot contributors!

I don't understand, what is the problem with that? You should be able to do that just fine. However, ideally, we should try to design extensions to be portable to many implementations, so this is not the recommended approach. Why do you need to have Unity-specific extensions in the first place? That seems like the wrong approach to begin with.

When you import a glTF file, the implementation will throw every unsupported extension away.
Importing a glTF file and exporting it immediately gives you no guarantee that it will be the same file.
If you need to edit a file with multiple tools, your previous edits may be thrown away.
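To illustrate what I mean, here is a small Python sketch of a tool that only keeps the extensions it knows about on import and then exports again. The extension names EXAMPLE_unity_settings and EXAMPLE_godot_settings are invented, and the functions are a simplified model of the behaviour, not code from any actual importer.

```python
def strip_unsupported(value, supported):
    """Recursively drop every "extensions" entry whose name is not supported."""
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if key == "extensions" and isinstance(item, dict):
                kept = {
                    name: strip_unsupported(data, supported)
                    for name, data in item.items()
                    if name in supported
                }
                if kept:
                    out[key] = kept
            else:
                out[key] = strip_unsupported(item, supported)
        return out
    if isinstance(value, list):
        return [strip_unsupported(item, supported) for item in value]
    return value


def lossy_roundtrip(doc, supported):
    """Model an import/export cycle in a tool that discards unknown extensions."""
    out = strip_unsupported(doc, supported)
    if "extensionsUsed" in out:
        out["extensionsUsed"] = [n for n in out["extensionsUsed"] if n in supported]
    return out


doc = {
    "asset": {"version": "2.0"},
    "extensionsUsed": ["EXAMPLE_unity_settings", "EXAMPLE_godot_settings"],
    "nodes": [
        {
            "name": "Avatar",
            "extensions": {
                "EXAMPLE_unity_settings": {"layer": 3},
                "EXAMPLE_godot_settings": {"group": "players"},
            },
        }
    ],
}

after_tool_a = lossy_roundtrip(doc, supported={"EXAMPLE_unity_settings"})
print(after_tool_a["nodes"][0]["extensions"])
```

After one pass through "tool A", the Godot-specific data is gone, and nothing in the file tells you it was ever there.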

Also, I see that you made your own format STF. I see that it supports multiple "assets", what glTF would call scenes. If you are going through the pain of making a new format, which I am skeptical of in the first place, I recommend at least ensuring that it avoids the mistakes of glTF, such as multiple scenes per file: #1542 (and related, multiple root nodes is also worth avoiding, see #2329 and godotengine/godot-proposals#6588, but it seems your format already does this).

I mean, it's a prototype/proof-of-concept. I would never recommend anyone to use it at all right now. I'm not sure whether this is the right place to discuss that, however removing support for multiple assets/scenes is a good point and I may do that sometime.
I'm aware that creating a new interoperable 3d format alone is unrealistic. If my project helps to inform/figure out some parts of a future format, it would have more than served its purpose.

@javagl
Contributor

javagl commented Aug 3, 2024

I dare to argue the requirements for such a distribution format and an interoperability format are at least partially mutually exclusive.

Certainly! And this has been an active point of discussion. The original intention of glTF was that of a "last mile format", strongly focussed on the efficient delivery of something that can immediately be rendered (e.g. in a browser). (And 'glTF' actually was an acronym, and stood for 'GL Transmission Format', but ... now it's pretty much just a word, like 'laser' or 'radar'. Harder to pronounce, though....).

Nowadays, people have adopted glTF not only as a delivery/transmission format, but also as an interchange format (e.g. between authoring applications). It simply is a lean, clean, versatile, meticulously specified, reliable, ISO-standardized format. But the goal to use it as an interchange format brings in new requirements (like "assigning IDs to elements", or more powerful ways of modelling animations and such). One has to be careful in what to address here - and how. There's the danger of glTF collapsing under its own weight (yeah, https://xkcd.com/927/ fits here). The mechanism of extensions provides many options for cleanly extending the format, but some aspects of that still have to settle (e.g. the points that you mentioned: ~"Who is implementing which extension in which application, and how to know which extension is supported?").

@javagl
Contributor

javagl commented Aug 3, 2024

And a short note about the point of the mesh resources:

Some implementations may deduplicate the buffer views, but since this is not in the spec, some glTF implementations may just load the same mesh resource twice.

Yes! People can do that. Imagine two mesh primitives like (pseudocode)

"attributes" : {
    "POSITION" : 1,
    "COLOR_0" : 222
},
...
"attributes" : {
    "POSITION" : 1,
    "COLOR_0" : 333
},

that share the same POSITION accessor but have different COLOR_0 accessors. For the actual rendering, the POSITION data could be uploaded to the GPU only once (good), or twice (bad). When loading something like this in an authoring application, one could make a case to actually duplicate that data (e.g. to be able to modify the positions in one mesh primitive, without affecting the positions of the other one). The format itself allows such re-use (like the positions) to manifest itself (compared to, say, storing the positions as some sort of array or binary blob directly in the mesh). Some of the responsibility to handle this in a "smart" way is certainly left to the consumer.
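On the rendering side, the "upload only once" variant can be as simple as a cache keyed by accessor index. Here is a minimal Python sketch of that idea, using the two primitives from above; the UploadCache class and the string "handles" are stand-ins invented for the example, since a real renderer would create GPU buffers at that point.

```python
class UploadCache:
    """Remember which accessors were already uploaded (hypothetical helper)."""

    def __init__(self):
        self._handles = {}
        self.upload_count = 0

    def get(self, accessor_index):
        """Return the handle for an accessor, uploading it on first use only."""
        if accessor_index not in self._handles:
            self.upload_count += 1
            # Stand-in for an actual GPU buffer creation call.
            self._handles[accessor_index] = f"gpu-buffer-{accessor_index}"
        return self._handles[accessor_index]


primitives = [
    {"attributes": {"POSITION": 1, "COLOR_0": 222}},
    {"attributes": {"POSITION": 1, "COLOR_0": 333}},
]

cache = UploadCache()
bound = [
    {name: cache.get(index) for name, index in primitive["attributes"].items()}
    for primitive in primitives
]

print(bound)
print(cache.upload_count)
```

Four attribute bindings end up backed by three uploads, because both primitives resolve POSITION to the same handle. An authoring tool that wants independent editing would skip the cache and copy the data instead.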

@emperorofmars

@javagl that absolutely makes sense now, as do so many other choices glTF makes, thanks!

The only mystery that remains for me is, why do so many in the 'open source gamedev sphere' believe glTF 2.0 is made for interchange between authoring tools. I fell into the same trap, of course. All my points were based on that, sorry.

@emackey
Member

emackey commented Aug 5, 2024

This is an area that needs more discussion. Different WG members have different opinions, some arguing for sticking to the original delivery mission as the only one that matters, others offering clear reasons and valid ways to extend glTF beyond its limited original design goals. Godot has gone all-in with glTF as a gamedev interchange format, and yes there were rough patches to smooth out and a few still remain, but it would be a huge mistake to ask them to migrate to another format out of some misguided design purism thinking.

I think it bears similarity to the old tagline, "The JPG of 3D." Professional Photoshop artists save their work as PSD files, not JPG, and the interchange between artists in the same department on the same toolset is PSD. The established thinking by some is that all "real" work is done in an authoring format like PSD, and JPG is just a lowly delivery mechanism at the tail end of production. But what happens outside the art department? On a global scale, outside of an individual art department, the broader exchange of images is all JPG and PNG, not authoring/interchange formats like PSD, XCF, TIFF, etc. Getting new images from an outside source often implies JPG or PNG. Many cameras capture new images directly as JPG. The lowly JPG format has, feature-wise, a tiny subset of PSD features, and yet it has far greater reach than PSD. This is the space that glTF can occupy in 3D.

Perhaps I've stretched this metaphor too far, but I think there's a valid role for glTF as this sort of global exchange format. It won't displace an internal format such as USD within a given art department, of course, and it shouldn't try. But it is well within its capabilities to be traded and exchanged between different departments and different people with different toolchains, particularly where those toolchains have difficulty agreeing on the details of where the rendering takes place and what the target platform or target environment is. Absolutely glTF can and should be used by gamedev in that kind of role.

Staying in that kind of role has proven tricky: Different people want to grow the featureset in a variety of different ways. Yet overall, the baseline featureset can't be allowed to expand so much that glTF loses the simplicity that made it attractive in the first place. We (myself included) often cite the original goal of delivery as reasoning not to expand the features too far, too fast. But I think there's a balance to be found, adding features that don't break the delivery mission while smoothing out the interchange capability. Godot is certainly pushing glTF in that direction, and I think the result will be an exceptionally capable format.
