Add support for reading and writing the .NET Half type #418
}
else
{
    // Float-16 values are always stored in little-endian order
For completeness I thought we should handle big-endian machines, as we have a similar check for the Guid type, but we currently only build x64 and arm64 native binaries for the NuGet package, and there doesn't seem to be an easy way to test this. Maybe we should just throw a not-implemented exception on big-endian machines instead? The check didn't seem to affect performance, at least.
I think it is fine as it is now.
Looks good, well done.
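For readers following the endianness discussion in this thread, here is a minimal sketch of one way to convert between `Half` and a 2-byte little-endian buffer independently of the host byte order. It is not the code from this PR: `EncodeFloat16` and `DecodeFloat16` are hypothetical helper names, and the sketch assumes .NET 6 or later, where `BinaryPrimitives` provides `Half` overloads.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helpers for illustration only; not ParquetSharp's implementation.

// Encode a Half as the 2-byte little-endian layout discussed above.
static byte[] EncodeFloat16(Half value)
{
    var bytes = new byte[2];
    // WriteHalfLittleEndian reverses the byte order itself on big-endian hosts,
    // so no explicit BitConverter.IsLittleEndian check is needed here.
    BinaryPrimitives.WriteHalfLittleEndian(bytes, value);
    return bytes;
}

// Decode a Half from a 2-byte little-endian buffer.
static Half DecodeFloat16(ReadOnlySpan<byte> bytes)
{
    return BinaryPrimitives.ReadHalfLittleEndian(bytes);
}

var encoded = EncodeFloat16((Half)1.5f);
Console.WriteLine(BitConverter.ToString(encoded)); // prints "00-3E" (bit pattern 0x3E00)
Console.WriteLine(DecodeFloat16(encoded) == (Half)1.5f); // round-trips to the same value
```

A per-value helper like this trades speed for portability. A bulk path on little-endian machines can copy the raw bytes directly, which is presumably why an explicit endianness branch is attractive in the actual converter.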
Fixes #413
This adds support for the new `Float16` logical type added in Arrow 15. I've added a new .NET 6 target to the ParquetSharp project to allow using the new `Half` type, which required fixing a few errors related to nullable reference type checking when building with the newer target.

One thing to be aware of is that writing `Half` values can be a bit slower than floats because of the extra overhead of writing these as fixed-length byte arrays rather than having a dedicated physical type. We could possibly improve this in future if it turns out to be a problem. I did some quick benchmarking of reading and writing 1 million random floats, doubles and half values, with dictionary encoding disabled: