Scene data serialization #427
Codecov Report
@@ Coverage Diff @@
## master #427 +/- ##
===========================================
- Coverage 78.27% 67.72% -10.56%
===========================================
Files 389 366 -23
Lines 21140 17284 -3856
===========================================
- Hits 16548 11706 -4842
- Misses 4592 5578 +986
Replaces the old & extremely useless Profiler. Doesn't have everything I want yet (missing stddev and fancier GPU queries), that'll come later.
I want to use these in command-line arguments.
At first I attempted to make the whole thing reinterpret_cast-able from a blob of memory (i.e., truly zero-overhead), but while that sounded cool and all, it moved the overhead to basically all other code -- each function had to special-case access to attribute/vertex/index data as the pointers were no longer pointers, the binary representation had various weird unexplainable gaps ("here an array deleter is stored, set that to null and don't ask"), release*() functions got more complicated and when I got to issues with move construction/assignment I knew this was not the right path. Now the MeshData internals are packed to a much more compact representation (with the first attempt it was 128 bytes, now it's just 64) and the serialization doesn't make everything else slower, more complex or harder to test, which is a win.
With a 482 MB file, running magnum-sceneconverter lucy.blob b.blob goes down from about 1.9 seconds to 450 ms, roughly equivalent to what cp lucy.blob b.blob takes (and of course this includes all range and validity checks).
Those will simply produce serialized blobs on output. TODO: make this more generic? like, if I specify a *.ply at the end, it uses some other converter after that?
Testing this out. I have checked out the same branch in For
I guess external dependency is But I still do not understand how |
@janbajana err, sorry for the confusion. The FrameProfiler is in master now, no need to use this branch for it (it was briefly here, but because it got stable pretty fast, I put it in master). No need for any of those plugins, either. For the |
Eh, forgot to say -- you need to press the P key to toggle it.
Super that worked. I have statistics now.
Thanks for the help.
Continuation of #371, goes together with a branch of the same name in magnum-plugins. Current state:
WIP docs: https://doc.magnum.graphics/magnum-meshdata-cereal-killer/blob.html
Scene format conversion
Now done mostly independently of this PR.
- AbstractMeshConverter plugin interface for operating on the MeshData ... or AbstractSceneConverter plugin interface
- AnySceneConverter
- magnum-meshconverter / magnum-sceneconverter utility (1c51b98)
  - --mesh selector -- 036207f
  - --image -- 413dc56
- expose useful meshtools like normal generation in it
  - can't, it would cause a cyclic dependency :/ a plugin? how to name it? MagnumSceneConverter is already taken... overload its purpose?
Serialization
making the MeshData easily serializable, allowing projects to convert arbitrary mesh formats to blobs that could be loaded directly via mmapping:
- Each *Data would contain the same header with some magic, version info and total size. Magic and version info for sanity checks, total size to be able to simply cat multiple blobs together without having to worry about metadata. Similar to RIFF chunks (can't be 100% compatible because it's limited to 4 GB max -- RIFF64? can look there for inspiration).
- Chunk iteration, something like
  - It needs some way to differentiate between end of iteration and invalid data. Returning nullptr in both cases would falsely look like there's no more data when we encounter an error, while returning a non-null pointer on a failure would require the user to do an additional explicit check on chunk validity.
- The MeshData need to support both self-contained operation (data bundled internally in an Array) and the above-shown "reinterpreted" operation.
  - Tried this at first but it turned out to be extremely non-feasible. Every access would need to special-case this, it would make releaseData() more complex, move constructors impossible and the amount of testing was just expanding beyond any reasonable bound (then imagine the corner cases serializing deserialized data). The binary representation was also containing weird "do not use" / "set to 0" fields, being much less compact than it could be. Instead, the mesh metadata are parsed from a packed representation and a MeshData instance is created that references vertex/index/attribute data in the original view.
- In order to avoid extra work, the data layout should be consistent between 32bit and 64bit systems -- otherwise it won't be possible to serialize data on a (64bit) desktop and use them on a (32bit) Emscripten
  - It needs to be endian-specific at least, currently it's also different for 32 and 64 bits to support files over 4 GB and still have compact representation on 32b. Might reconsider and inflate the 32-bit representation to 64 bits to save some implementation work (the MeshAttributeData wouldn't suffer that much) but this might become problematic when some data type needs a lot of internal offsets (animations?).
  - MagnumImporter and MagnumSceneConverter that provide import/conversion of different bitness / endianness
  - AnySceneImporter / AnySceneConverter
- Some assets might have one huge buffer and particular meshes being just views on it. Ideally the buffer should be uploaded as-is in a single piece, with meshes then referring to subranges. In this case, the serialized MeshData need to have a way to reference an external "data chunk" somehow -- some flag saying the offset is not internal but to a different chunk + a data chunk "ID"? Or having that implicit -- it'll always be the first data chunk after a couple of data-less MeshData chunks?
  - While this won't cover all cases (there still can be a single buffer but incompatible vertex layouts / index type), what about providing mesh views (the MeshViewObjectData) that have index / vertex offsets to a single MeshData?
  - SceneData (SceneData rework #525) will have multidraw-compatible index/vertex offset & size fields
- Hook directly into sceneconverter (detect a *.blob suffix, provide an inline AbstractImporter/AbstractSceneConverter implementation that mmaps the input/output and calls [de]serialize)
- before pinning down the MeshData binary layout, check if it's possible to store arbitrary data in MeshAttributeData (e.g. AABBs, bounding sphere etc.) -- there's 16 bytes free, once it's 64bit-only
- some high-level structure describing metadata (basically everything that's now exposed only via AbstractImporter APIs)
- some way to attach a name / checksum / other stuff to a chunk -- nested chunks? needs to have the semantics clearly documented -- treat nested chunks as inseparable? for example a mesh + image (or more of those) wrapped in a chunk that tells what's the image for? and when the outer chunk is torn apart, the meshes/images no longer have a relation to each other?
- ability to reference external files somehow?
- make it possible to record a command-line used to produce a particular chunk? (e.g. as proposed in gcc) ... or other comments? might be useful for repeatability
  - # (#cmd, ####, ...) for arbitrary comments / info about the creator, copyright, command-line etc.?
- come up with something like https://github.com/ValveSoftware/Fossilize for (de)serializing whole pipelines, ability to replay them back, extract meshes out of them etc. etc.
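The common header and the chunk iteration problem described in the Serialization notes could be sketched roughly as below. This is only an illustration and not the layout this PR ends up with: the `ChunkHeader` fields, the `"BLOB"` magic, the version number `1` and the `chunkAt()`, `countChunks()` and `makeChunk()` helpers are all hypothetical. The sketch also assumes host endianness and a 64-bit size field. Instead of a pointer that is null both at the end and on error, the lookup returns a three-state status.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical fixed-size chunk header (16 bytes); names and sizes are
// assumptions made for this sketch
struct ChunkHeader {
    char magic[4];          // sanity check, "BLOB" here
    std::uint16_t version;  // format version, 1 here
    std::uint16_t type;     // chunk type (mesh, image, ...)
    std::uint64_t size;     // total chunk size including this header
};

// Three states, so "no more data" is never confused with "broken data"
enum class ChunkStatus { Valid, End, Invalid };

// Inspects the chunk starting at `offset` in a buffer of `total` bytes.
// `header` is only meaningful when ChunkStatus::Valid is returned.
ChunkStatus chunkAt(const char* data, std::size_t total, std::size_t offset,
                    ChunkHeader& header) {
    if(offset == total) return ChunkStatus::End;
    if(offset > total || total - offset < sizeof(ChunkHeader))
        return ChunkStatus::Invalid;
    // Copy instead of reinterpret_cast to avoid alignment issues
    std::memcpy(&header, data + offset, sizeof(ChunkHeader));
    if(std::memcmp(header.magic, "BLOB", 4) != 0) return ChunkStatus::Invalid;
    if(header.version != 1) return ChunkStatus::Invalid;
    if(header.size < sizeof(ChunkHeader) || header.size > total - offset)
        return ChunkStatus::Invalid;
    return ChunkStatus::Valid;
}

// Walks all chunks, storing the number of valid ones in `count`. Returns
// End if the whole buffer was consumed cleanly, Invalid otherwise.
ChunkStatus countChunks(const char* data, std::size_t total,
                        std::size_t& count) {
    count = 0;
    std::size_t offset = 0;
    ChunkHeader header;
    ChunkStatus status;
    while((status = chunkAt(data, total, offset, header)) == ChunkStatus::Valid) {
        ++count;
        offset += std::size_t(header.size);
    }
    return status;
}

// Builds a chunk with a zero-filled payload, for experimenting
std::vector<char> makeChunk(std::uint16_t type, std::size_t payloadSize) {
    ChunkHeader header{{'B', 'L', 'O', 'B'}, 1, type,
                       std::uint64_t(sizeof(ChunkHeader) + payloadSize)};
    std::vector<char> out(sizeof(ChunkHeader) + payloadSize);
    std::memcpy(out.data(), &header, sizeof(ChunkHeader));
    return out;
}
```

Because each chunk carries its own total size, appending the bytes of one `makeChunk()` result to another (the in-memory equivalent of `cat`) still iterates cleanly, while a truncated tail is reported as `ChunkStatus::Invalid` rather than silently ending the iteration.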
Versioning
Ensuring forward compatibility, avoiding breakages, being able to look into 3rd party data and figure out at least a part of them.
- VertexFormat enum values to ensure backwards compatibility of serialized data with new values added
- MeshPrimitive, Trade::MeshAttribute, MeshIndexType
- Trade::MaterialAttribute, Trade::MaterialAttributeType
- Trade::SceneField, Trade::SceneFieldType
- PixelFormat, CompressedPixelFormat
- Consider storing some kind of schema in a well-known format so we (or 3rd party code) don't need to write tons of code for backwards compatibility but instead extract it using the schema (https://twitter.com/dotstdy/status/1319929427774623745) ... something like Khronos Data Format, but not just for pixel data?
  - the high-level MeshData, SceneData, MaterialData structures are the schema already, all data in those is unpacked into (strided) arrays instead of having to describe layout of complex structures, yay!
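One possible way to keep serialized enum values backwards compatible is to pin every value to an explicit number and to treat unknown numbers as a recoverable condition. The sketch below assumes that approach; `SerializedIndexType` and `serializedIndexTypeSize()` are hypothetical names, and the numeric values are made up for illustration rather than taken from Magnum's `MeshIndexType`.

```cpp
#include <cstdint>

// Hypothetical on-disk enum; the numeric values are assumptions made for
// this sketch. Each value is explicit so reordering the enumerators or
// inserting new ones cannot change what is already stored in existing files.
enum class SerializedIndexType: std::uint8_t {
    UnsignedByte = 1,
    UnsignedShort = 2,
    UnsignedInt = 3
    // Later additions get fresh numbers, existing numbers are never reused
};

// Returns the size in bytes of one index, or 0 if the stored value is not
// known to this version of the code (for example, written by a newer one)
unsigned serializedIndexTypeSize(std::uint8_t raw) {
    switch(static_cast<SerializedIndexType>(raw)) {
        case SerializedIndexType::UnsignedByte: return 1;
        case SerializedIndexType::UnsignedShort: return 2;
        case SerializedIndexType::UnsignedInt: return 4;
    }
    return 0;
}
```

A reader built against an older version can then skip or reject a chunk whose type it does not understand instead of misinterpreting the data.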