-
-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce HashCompressor
for name compression
#396
Conversation
b09d0e8
to
f31e229
Compare
f31e229
to
baab897
Compare
@ximon18: Do you think this is a breaking change? It doesn't remove any existing APIs, we just discussed that |
I'll take a look as soon as I can. |
It seems to be purely additive so I don't see why it would be a breaking change. The only observation I had when scanning it is that I saw an Ah, I see you are asking because @partim and and I added the |
@ximon18 the |
I noticed that 'TreeCompressor' will likely have a lot of overhead, as it has to copy each label into its internal tree and it maintains a separate hashmap for every known name (where each name likely has one or two parents at the most, even though hashmaps have a large capacity). 'HashCompressor' is an alternative name compressor built on top of the 'hashbrown' crate, which offers lower-level hash table access that lets entries indirectly reference the built message for hashing and equality checks. It maintains a single hashmap, it should be faster, and it does not require as much memory. The implementation here makes it clear that domain names should have been stored in reverse order, at least in the wire format. A decent chunk of the logic goes into correctly reversing the domain name. In fact, the implementation likely quadratic runtime because of this (at least when 'Name' or 'RelativeName' are used, as backward iteration on them has quadratic runtime). While 'std' essentially exposes the same data structures as 'hashbrown', it does not currently offer a way to perform low-level hash table access (although there is a nightly "raw entry" feature). This may change in the future. If not for the added 'hashbrown' dependency, I would suggest deprecating 'TreeCompressor' entirely.
The feature flag to enable it has been changed to be more specific.
b918f97
to
072474a
Compare
I noticed that 'TreeCompressor' will likely have a lot of overhead, as it has to copy each label into its internal tree and it maintains a separate hashmap for every known name (where each name likely has one or two parents at the most, even though hashmaps have a large capacity).
'HashCompressor' is an alternative name compressor built on top of the 'hashbrown' crate, which offers lower-level hash table access that lets entries indirectly reference the built message for hashing and equality checks. It maintains a single hashmap, it should be faster, and it does not require as much memory.
The implementation here makes it clear that domain names should have been stored in reverse order, at least in the wire format. A decent chunk of the logic goes into correctly reversing the domain name. In fact, the implementation likely quadratic runtime because of this (at least when 'Name' or 'RelativeName' are used, as backward iteration on them has quadratic runtime).
While 'std' essentially exposes the same data structures as 'hashbrown', it does not currently offer a way to perform low-level hash table access (although there is a nightly "raw entry" feature). This may change in the future.
If not for the added 'hashbrown' dependency, I would suggest deprecating 'TreeCompressor' entirely.