Add ByteMatch hashCode() to reduce transitions object count. #199
Description of changes:
It takes many hours to compute the machine complexity of this rule on a laptop:
This is because numeric matchers are compiled as ranges for comparison. Each range expands into a sequence of numbers that are then added to the byte machine, yet most of these transition to a small number of states. See ByteMachine.addRangePattern for more details.

Given that the numbers only transition to a handful of end states, we should have been merging the transitions within ByteMap.updateTransitions. However, the merge relies on transitions being comparable, which isn't the case for ByteMap.

After adding hashCode() and equals(), we are now able to dedupe and avoid creating duplicates. This offers nominal benefit in rule matching latency but a notable improvement for (1) rule addition / removal time, and (2) when comparing the machine size / complexity.
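To illustrate why value-based equals() and hashCode() matter for the merge, here is a minimal, self-contained sketch. It is not the code in this PR: the classes IdentityMatch and ValueMatch and the "range-end" pattern string are hypothetical stand-ins for a transition type, and a HashSet stands in for whatever collection the merge uses. It only shows the general Java behaviour that objects without these overrides are never treated as duplicates.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class TransitionDedupSketch {

    // Hypothetical transition WITHOUT equals()/hashCode():
    // inherits identity semantics from Object.
    static final class IdentityMatch {
        final String pattern;
        IdentityMatch(String pattern) { this.pattern = pattern; }
    }

    // Hypothetical transition WITH value-based equals()/hashCode().
    static final class ValueMatch {
        final String pattern;
        ValueMatch(String pattern) { this.pattern = pattern; }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof ValueMatch)) return false;
            return pattern.equals(((ValueMatch) o).pattern);
        }

        @Override
        public int hashCode() {
            return Objects.hash(pattern);
        }
    }

    // Adds n logically identical transitions; each one is kept as distinct.
    static int distinctIdentity(int n) {
        Set<IdentityMatch> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(new IdentityMatch("range-end"));
        }
        return seen.size();
    }

    // Adds n logically identical transitions; they collapse into one entry.
    static int distinctValue(int n) {
        Set<ValueMatch> seen = new HashSet<>();
        for (int i = 0; i < n; i++) {
            seen.add(new ValueMatch("range-end"));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("identity semantics: " + distinctIdentity(1000));
        System.out.println("value semantics:    " + distinctValue(1000));
    }
}
```

With identity semantics every one of the 1000 objects is retained, while with value semantics they collapse to a single entry, which is the kind of reduction in object count described above.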
Benchmark / Performance (for source code changes):
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.