Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce HashCompressor for name compression #396

Merged
merged 4 commits into from
Oct 21, 2024
Merged

Conversation

bal-e
Copy link
Contributor

@bal-e bal-e commented Sep 23, 2024

I noticed that 'TreeCompressor' will likely have a lot of overhead, as it has to copy each label into its internal tree and it maintains a separate hashmap for every known name (where each name likely has one or two parents at the most, even though hashmaps have a large capacity).

'HashCompressor' is an alternative name compressor built on top of the 'hashbrown' crate, which offers lower-level hash table access that lets entries indirectly reference the built message for hashing and equality checks. It maintains a single hashmap, it should be faster, and it does not require as much memory.

The implementation here makes it clear that domain names should have been stored in reverse order, at least in the wire format. A decent chunk of the logic goes into correctly reversing the domain name. In fact, the implementation likely quadratic runtime because of this (at least when 'Name' or 'RelativeName' are used, as backward iteration on them has quadratic runtime).

While 'std' essentially exposes the same data structures as 'hashbrown', it does not currently offer a way to perform low-level hash table access (although there is a nightly "raw entry" feature). This may change in the future.

If not for the added 'hashbrown' dependency, I would suggest deprecating 'TreeCompressor' entirely.

@bal-e bal-e added the enhancement New feature or request label Sep 23, 2024
@bal-e bal-e self-assigned this Sep 23, 2024
@bal-e bal-e force-pushed the hash-name-compressor branch 3 times, most recently from b09d0e8 to f31e229 Compare September 23, 2024 13:08
Cargo.toml Outdated Show resolved Hide resolved
src/base/message_builder.rs Outdated Show resolved Hide resolved
@bal-e bal-e force-pushed the hash-name-compressor branch from f31e229 to baab897 Compare October 3, 2024 07:51
@bal-e bal-e requested a review from partim October 3, 2024 08:22
@ximon18 ximon18 added breaking A PR that includes a breaking change. and removed enhancement New feature or request labels Oct 7, 2024
@bal-e
Copy link
Contributor Author

bal-e commented Oct 7, 2024

@ximon18: Do you think this is a breaking change? It doesn't remove any existing APIs, we just discussed that TreeCompressor could be replaced by this for the next breaking-change release. I'd like to get it merged soon.

@ximon18
Copy link
Member

ximon18 commented Oct 7, 2024

I'll take a look as soon as I can.

@ximon18
Copy link
Member

ximon18 commented Oct 7, 2024

@ximon18: Do you think this is a breaking change? It doesn't remove any existing APIs, we just discussed that TreeCompressor could be replaced by this for the next breaking-change release. I'd like to get it merged soon.

It seems to be purely additive so I don't see why it would be a breaking change. The only observation I had when scanning it is that I saw an assert!() call which is not documented in the RustDoc as a known way the code can panic... and ideally it wouldn't panic on user input, especially for a function that is able to error out. I can't tell if hashbrown as an added dependency in some way limits the build targets compared to now, but given that hashbrown is used by the Rust standard library I assume it is fine. I don't know how you chose the feature flags to use for hashbrown, I don't see them documented in the crate docs.

Ah, I see you are asking because @partim and and I added the breaking label when looking through the list of PRs, and we at the time were talking about it replacing the current TreeCompressor but I see from your comment that is not the plan. I'll remove the label.

@ximon18 ximon18 removed the breaking A PR that includes a breaking change. label Oct 7, 2024
@bal-e
Copy link
Contributor Author

bal-e commented Oct 8, 2024

@ximon18 the assert can only fail if somebody implements ToName incorrectly, which is a programming error and not something external users can cause. The hashbrown feature flags are there to disable some unnecessary default features to speed up compilation.

@bal-e bal-e requested a review from a team October 16, 2024 07:33
bal-e added 4 commits October 16, 2024 09:39
I noticed that 'TreeCompressor' will likely have a lot of overhead, as
it has to copy each label into its internal tree and it maintains a
separate hashmap for every known name (where each name likely has one or
two parents at the most, even though hashmaps have a large capacity).

'HashCompressor' is an alternative name compressor built on top of the
'hashbrown' crate, which offers lower-level hash table access that lets
entries indirectly reference the built message for hashing and equality
checks.  It maintains a single hashmap, it should be faster, and it does
not require as much memory.

The implementation here makes it clear that domain names should have
been stored in reverse order, at least in the wire format.  A decent
chunk of the logic goes into correctly reversing the domain name.  In
fact, the implementation likely quadratic runtime because of this (at
least when 'Name' or 'RelativeName' are used, as backward iteration on
them has quadratic runtime).

While 'std' essentially exposes the same data structures as 'hashbrown',
it does not currently offer a way to perform low-level hash table access
(although there is a nightly "raw entry" feature).  This may change in
the future.

If not for the added 'hashbrown' dependency, I would suggest deprecating
'TreeCompressor' entirely.
The feature flag to enable it has been changed to be more specific.
@bal-e bal-e force-pushed the hash-name-compressor branch from b918f97 to 072474a Compare October 16, 2024 07:41
@partim partim merged commit 97c0036 into main Oct 21, 2024
26 checks passed
@partim partim deleted the hash-name-compressor branch October 21, 2024 13:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants