Improve numeric matching #179
Comments
Oh, that's neat! This should be doable within Ruler, though I won't be able to pick it up for a few weeks due to an internal launch. For now, here is a link to the files I'd expect we'd need to touch to enforce this: d04e3f0#diff-58bfacbbf2f6f6e26165ed131f0cae9667cf41d642d57b1a154ca97236507bce. To keep the codebases roughly similar, I'll try to imitate numbits.go as much as Java lets me.
BTW I wrote a blog post about it at https://www.tbray.org/ongoing/When/202x/2024/08/28/Q-Numbers-2 and Arne Hoffman has promised to write an explanation of the bit-masking voodoo; I'll put a pointer in here when I see it. Off the top of my head, I don't think Java should get in the way, although I'm not sure there's an equivalent of Go's
Thanks Tim. A cursory check points me to Double.doubleToLongBits and related functions. I haven't had the time to explore how these methods behave across various scenarios, but hopefully they are good enough for Ruler's needs.
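As a rough illustration of the idea in the comment above, here is a minimal sketch of how `Double.doubleToLongBits` could be used to get 64 bits whose unsigned order matches the numeric order of the doubles. The class name `OrderedDoubleBits` and the method `toOrderedBits` are hypothetical, chosen for this example only; this is not code from Ruler or Quamina.

```java
// Hypothetical helper, for illustration only (not Ruler's or Quamina's code).
final class OrderedDoubleBits {

    private OrderedDoubleBits() {}

    /**
     * Maps a double to 64 bits that sort, as an unsigned value, in the same
     * order as the original doubles.
     */
    static long toOrderedBits(double d) {
        // Raw IEEE 754 bits; doubleToLongBits collapses every NaN to one canonical pattern.
        long bits = Double.doubleToLongBits(d);
        // Arithmetic shift: all ones when the sign bit is set, all zeros otherwise.
        // OR-ing in the sign bit gives "flip everything" for negatives and
        // "flip only the sign bit" for non-negatives, without branching.
        long mask = (bits >> 63) | Long.MIN_VALUE;
        return bits ^ mask;
    }
}
```

The results are meant to be compared with `Long.compareUnsigned`, not with `<`, since Java's `long` is signed. Note that under this mapping `-0.0` and `0.0` produce different (adjacent) values, so a caller that wants them to match would need to normalize first.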
This change follows the guidance from #179 on using a 10-byte base-128 encoded format for numbers, similar to how Quamina does it. I didn't see any performance implications of supporting the new range, but had to fix a bunch of tests. Before merging, I will change the numbers we use for testing to better exercise the new range. During debugging, I found it challenging to make sense of the numbers, so I've also added a helper method in ComparableNumbers and modified toString() methods in a few places.
What is your idea?
For Quamina, a couple of folks figured out how to represent the whole range of 64-bit float values in 64 big-endian bits, then to encode them in base128, and then to discard certain suffixes. Check out numbits.go https://github.com/timbray/quamina/blob/main/numbits.go
So you get a smaller representation of numeric field values, and no more subsetting of the numbers that can be matched.
I can't see any reason this wouldn't work in Ruler.
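The encoding step described above might look roughly like the following in Java. This is a sketch under assumptions, not a port of numbits.go: the class name `Base128Sketch`, the method `encode`, and the constant `MAX_BYTES` are hypothetical, the input is assumed to already be an order-preserving 64-bit value, and "discard certain suffixes" is interpreted here as dropping trailing all-zero 7-bit groups.

```java
// Hypothetical sketch, for illustration only (not a port of numbits.go).
final class Base128Sketch {

    // 64 bits split into 7-bit groups needs ceil(64 / 7) = 10 bytes.
    static final int MAX_BYTES = 10;

    private Base128Sketch() {}

    /**
     * Encodes 64 bits as big-endian base-128 digits (one digit per byte,
     * each in 0..127), dropping trailing zero digits.
     */
    static byte[] encode(long orderedBits) {
        long v = orderedBits;
        int len = MAX_BYTES;
        // Skip low-order 7-bit groups that are zero; they would become trailing zero bytes.
        while (len > 0 && (v & 0x7f) == 0) {
            v >>>= 7; // unsigned shift, so the sign bit is never smeared
            len--;
        }
        // Fill the remaining digits from least to most significant.
        byte[] out = new byte[len];
        for (int i = len - 1; i >= 0; i--) {
            out[i] = (byte) (v & 0x7f);
            v >>>= 7;
        }
        return out;
    }
}
```

For example, `Base128Sketch.encode(Long.MIN_VALUE)` yields the single byte `{1}`, because only the top bit is set and the nine lower groups are all zero. Dropping trailing zeros keeps byte-wise lexicographic comparison consistent with the full 10-byte form, since a shorter array that is a prefix of a longer one sorts first.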
Would you be willing to make the change?
I seem to recall that Rishi just wired in a similar flavor of change, so I suggest this one would be easy for him.