Add ByteMatch hashCode() to reduce transitions object count. #199

baldawar · 2024-10-31T22:00:40Z

Description of changes:

It takes many hours to compute the machine complexity of this rule on a laptop:

{
    "field1": [{
        "numeric": ["<=", 120.0]
    }],
    "field2": [{
        "numeric": [">", 300.0]
    }],
    "field3": [{
        "numeric": ["=", 60.0]
    }],
    "field4": [{
        "numeric": ["<", 60.0]
    }],
    "field5": [{
        "numeric": ["<=", 60.0]
    }]
}

This is because numeric matchers are compiles as ranges for comparison. The ranges create a sequence of numbers that then get added into the byte machine however most of these transition to a small number of state. See ByteMachine.addRangePattern for more details.

Given that the numbers are only transition to handful of end states, we should have been merging the transtion within ByteMap.updateTransitions. However the merge relies on transitions being comparable which isn't the case for ByteMap.

After adding hashcode() and equals(), we are not able to dedupe and avoid creating duplicates. This offers nominal benefit in rule matching latency but has notable improvement for (1) rule addition / removal time, and (2) when comparing the machine size / complexity.

Benchmark / Performance (for source code changes):

Running software.amazon.event.ruler.Benchmarks
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 12131
Events/sec: 17563.9
 Rules/sec: 122947.5
Reading citylots2
Read 213068 events
EXACT events/sec: 236742.2
WILDCARD events/sec: 167112.2
PREFIX events/sec: 239402.2
PREFIX_EQUALS_IGNORE_CASE_RULES events/sec: 244344.0
SUFFIX events/sec: 238598.0
SUFFIX_EQUALS_IGNORE_CASE_RULES events/sec: 236742.2
EQUALS_IGNORE_CASE events/sec: 209506.4
NUMERIC events/sec: 126075.7
ANYTHING-BUT events/sec: 127128.9
ANYTHING-BUT-IGNORE-CASE events/sec: 131442.3
ANYTHING-BUT-PREFIX events/sec: 140361.0
ANYTHING-BUT-SUFFIX events/sec: 137908.1
ANYTHING-BUT-WILDCARD events/sec: 144355.0
COMPLEX_ARRAYS events/sec: 27149.3
PARTIAL_COMBO events/sec: 42922.6
COMBO events/sec: 17683.5
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 56.057 sec

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

It takes many hours to compute the machine complexity of this rule on a laptop: { "field1": [{ "numeric": ["<=", 120.0] }], "field2": [{ "numeric": [">", 300.0] }], "field3": [{ "numeric": ["=", 60.0] }], "field4": [{ "numeric": ["<", 60.0] }], "field5": [{ "numeric": ["<=", 60.0] }] } This is because numeric matchers are compiles as ranges for comparison. The ranges create a sequence of numbers that then get added into the byte machine however most of these transition to a small number of state. See `ByteMachine.addRangePattern` for more details. Given that the numbers are only transition to handful of end states, we should have been merging the transtion within `ByteMap.updateTransitions`. However the merge relies on transitions being comparable which isn't the case for ByteMap. After adding hashcode() and equals(), we are not able to dedupe and avoid creating duplicates. This offers nominal benefit in rule matching latency but has notable improvement for (1) rule addition / removal time, and (2) when comparing the machine size / complexity.

baldawar · 2024-10-31T22:49:22Z

Comparison from previous run https://github.com/aws/event-ruler/actions/runs/11061466520/job/30734130939#step:5:100 :

Running software.amazon.event.ruler.Benchmarks
Reading citylots2
Read 213068 events
Finding Rules...
Lots: 10000
Lots: 20000
Lots: 30000
Lots: 40000
Lots: 50000
Lots: 60000
Lots: 70000
Lots: 80000
Lots: 90000
Lots: 100000
Lots: 110000
Lots: 120000
Lots: 130000
Lots: 140000
Lots: 150000
Lots: 160000
Lots: 170000
Lots: 180000
Lots: 190000
Lots: 200000
Lots: 210000
Lines: 213068, Msec: 12087
Events/sec: 17627.9
 Rules/sec: 123395.1
Reading citylots2
Read 213068 events
EXACT events/sec: 245753.2
WILDCARD events/sec: 164151.0
PREFIX events/sec: 253350.8
PREFIX_EQUALS_IGNORE_CASE_RULES events/sec: 256091.3
SUFFIX events/sec: 246321.4
SUFFIX_EQUALS_IGNORE_CASE_RULES events/sec: 249786.6
EQUALS_IGNORE_CASE events/sec: 219884.4
NUMERIC events/sec: 128586.6
ANYTHING-BUT events/sec: 141291.8
ANYTHING-BUT-IGNORE-CASE events/sec: 142998.7
ANYTHING-BUT-PREFIX events/sec: 151542.0
ANYTHING-BUT-SUFFIX events/sec: 148376.0
ANYTHING-BUT-WILDCARD events/sec: 157129.8
COMPLEX_ARRAYS events/sec: 26929.7
PARTIAL_COMBO events/sec: 49082.7
COMBO events/sec: 17777.9
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 54.671 sec```

timbray · 2024-10-31T23:35:49Z

As often happens, Quamina is seeing action in the same space. I could never figure out a good way to hash states and dedupe, so there is memory wastage. However ehiggins0 has an approach where he builds a trie and then uses the trie path (essentially, how you got to a state) to compute what seem to be really good-quality hashes and is getting rather good memory savings. Not merged or even PR'ed yet but apparently in production in his fork. Not sure whether the technique is applicable to Ruler but it's interesting that we're both looking at the same problems. See timbray/quamina#363

baldawar · 2024-11-01T16:43:14Z

interesting indeed. I'll watch timbray/quamina#363 and Quanmina PRs for progress, the trie-based approach definitely is intriguing.

baldawar added 2 commits October 31, 2024 13:46

checkstyle violations

d5f53e7

baldawar marked this pull request as ready for review October 31, 2024 22:49

version bump

5c8f3a5

scabraha approved these changes Nov 1, 2024

View reviewed changes

baldawar merged commit 4c883d7 into main Nov 1, 2024
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add ByteMatch hashCode() to reduce transitions object count. #199

Add ByteMatch hashCode() to reduce transitions object count. #199

baldawar commented Oct 31, 2024 •

edited

Loading

baldawar commented Oct 31, 2024

timbray commented Oct 31, 2024

baldawar commented Nov 1, 2024

Add ByteMatch hashCode() to reduce transitions object count. #199

Add ByteMatch hashCode() to reduce transitions object count. #199

Conversation

baldawar commented Oct 31, 2024 • edited Loading

Description of changes:

Benchmark / Performance (for source code changes):

baldawar commented Oct 31, 2024

timbray commented Oct 31, 2024

baldawar commented Nov 1, 2024

baldawar commented Oct 31, 2024 •

edited

Loading