~2x state_sim speedup via additional caching in get_crosslink_committee #316
Not especially pretty, but not particularly 'contagious' either in terms of coupling between functions, modules, etc. It is also not especially complicated, at most marginally risky, and does not add much tech debt, reduce flexibility, or rely on extra assumptions.
Some benchmarks follow. All numbers are meant to be compared with each other, since they were taken under the same overall conditions/context:
To start with, the existing status quo ante, both with and without BLS validation (which adds a roughly constant ~4 minutes for the overall state_sim parameters I was using, 130 slots and 576 validators):
I added two caches, and I wanted to make sure that both were incrementally worthwhile and that neither subsumed the other. So, with only start_shard_cache:
This is the better of the two individual caches, but it still benefits from the other one, committee_count_cache (shown here alone; n=1, disclaimer):
If one had to choose between committee_count_cache and start_shard_cache, the latter would be preferable. But the two are worthwhile combined:
So it goes from 8 minutes to 4:30 for 576 validators, with a maximum slot time of 1.6 seconds, even for epoch slots, on a typical 15W/25W TDP laptop, with both new caches.
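For readers who want the general shape of the idea, here is a minimal sketch of two per-epoch memo tables in Python. It is not the code in this PR: only the two cache names come from the description above. The `EpochCache` class, the `lookup` helper, the stand-in "slow" functions and their return values, and the 64 slots per epoch are all assumptions made up for illustration.

```python
from typing import Callable, Dict


class EpochCache:
    """Hypothetical holder for two per-epoch memo tables (names from the PR text)."""

    def __init__(self) -> None:
        self.committee_count_cache: Dict[int, int] = {}
        self.start_shard_cache: Dict[int, int] = {}


def lookup(table: Dict[int, int], epoch: int, compute: Callable[[int], int]) -> int:
    """Return table[epoch], computing and storing the value on the first request."""
    if epoch not in table:
        table[epoch] = compute(epoch)
    return table[epoch]


def demo(slots: int = 130, slots_per_epoch: int = 64) -> Dict[str, int]:
    """Count how often each stand-in computation runs when both caches are used.

    slots_per_epoch=64 is an assumed value, not taken from the PR.
    """
    calls = {"committee_count": 0, "start_shard": 0}

    def slow_committee_count(epoch: int) -> int:
        # Placeholder for an expensive computation; the formula is arbitrary.
        calls["committee_count"] += 1
        return 1 + epoch % 4

    def slow_start_shard(epoch: int) -> int:
        # Placeholder for an expensive computation; the formula is arbitrary.
        calls["start_shard"] += 1
        return (epoch * 7) % 1024

    cache = EpochCache()
    for slot in range(slots):
        epoch = slot // slots_per_epoch
        lookup(cache.committee_count_cache, epoch, slow_committee_count)
        lookup(cache.start_shard_cache, epoch, slow_start_shard)
    return calls
```

With these assumed defaults, `demo()` runs each stand-in computation 3 times (once per epoch touched) instead of 130 times (once per slot). Keeping the two tables separate mirrors the point above that each cache can be enabled and measured on its own.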