-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Do not hex-format primitives when format doesn't require it #454
Conversation
Should we have unit / snapshot tests to verify that roudtrip serialization works when is_human_readable is enabled / disabled? |
Yep. Those are already tested for in fuel-core, but it makes sense to have them tested here as well. |
fuel-types/src/array_types.rs
Outdated
serializer.serialize_str(&format!("{:x}", &self)) | ||
} else { | ||
let mut seq = serializer.serialize_tuple($s)?; | ||
for elem in self.0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem here and during deserialization is that you will iterate over each element and serialize/deserialize it separately. It is much slower than copying bytes into/from the slice.
We need to optimize that part for our primitives because we use them everywhere.
Maybe the optimizer already is clever enough to optimize it for arrays, but we need to prove it with benchmarks.
For example, you can reuse implementation form the OptimizedContract
to compare with your implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current way is what serde docs suggests to do with fixed size arrays.
You're right that the iteration here is slower than a slice copy; and indeed switching to serialize_bytes
speeds this up quite a bit (4x by my microbenchmark). However, with the currently used postcard
format, serialize_bytes
prefixes the array with it's length in the serialized form, making it larger than necessary. Some other formats (like bincode
) do seem not do this, but bincode
is much slower otherwise. I wish there was a serde format that had both of these figured out, but I haven't seen that yet.
In any case, I think having a single extra byte encoded for each of the primitives isn't ideal, but we cannot afford the slow down that comes with doing it right, so I'm updating this to use serialize_bytes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also try to bench it with zerovec
, maybe for serialize_bytes
, it doesn't use length
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, I'll do that later today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So after reading through zerovec docs, it's not apparent to me how zerovec
would help us. zerovec
is not a serialization format, but rather a set of types to assist with zero-copy deserialization using serde formats. Converting these types to use zerovec
would make them less ergonomic to use, and creating new instances would require allocation, as zerovec
doesn't do stack-allocated arrays.
What we could do instead is to use Cow<'a, [u8; SIZE]>
for the array inside the type instead. This means that the type takes a lifetime argument. Sadly that breaks over 200 different locations where the types are used, and I don't necessarily feel like spending a whole day just adding lifetimes to them in hopes that it would be useful. That would also break things like AsMut
trait impl, which is used in many many places as well.
These types are so cheap to copy that zero-copying them makes almost no sense anyways; I think we should instead focus on using zero-copy for the expensive types like Contract
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the SCALE
, they don't put additional bytes for arrays=) Because the size is known
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is SCALE
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Substrate library for serialization and deserialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to have open correctness issues, but otherwise looks nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the 0-prefixing is outside of our control from the serde impl perspective. We don't really have a way around this since we use postcard downstream, and so avoiding this overhead would require using an entirely different serde adapter (which is outside the scope of this PR imo)
…elLabs/fuel-vm into dento/serde-human-readability
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you also update the CHANGELOG.md, please?
@@ -19,6 +19,7 @@ serde = { version = "1.0", default-features = false, features = ["derive", "allo | |||
bincode = { workspace = true } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, maybe in the other tests we need to use postcard
instead of bincode
too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently we're using both postcard and bincode in our codebase. Tests should use the codec the feature is using, and generally the codec should be an internal implementation detail that's not directly tested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, where do we use bincode
? I can't find any usage in the fuel-core
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh apparently we don't anymore. In that case the tests should be updated as well. I'll do a follow-up PR, since that's out-of-scope for this.
Changelog updated. |
Uses
is_human_readable
feature of serde. At leastpostcard
which we're currently using for db benefits from this behavior. Works towards FuelLabs/fuel-core#1085