feat: swap @chainsafe/bls to use napi blst implementation #6362

Closed
wants to merge 8 commits into from


matthewkeil
Member

Motivation

DRAFT PR. Not ready for full review.

Branch is deployed to feat2 to collect metrics

Description

Updated @chainsafe/bls to use the NAPI version of @chainsafe/blst. Added async code paths to BLS processing and added a flag to allow selection of the worker pool or the libuv threadpool.
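As a rough illustration of the flag described above, the sketch below shows one way a verifier could switch between a worker-pool path and a path that relies on async native calls, which run on the libuv threadpool. Every name in it (`BlsPoolType`, `VerifyFn`, `createVerifier`, the stand-in backends) is hypothetical and is not taken from this PR or from @chainsafe/blst.

```typescript
// Hypothetical sketch: none of these names come from the PR or from @chainsafe/blst.
type BlsPoolType = "workers" | "libuv";

interface SignatureSet {
  publicKey: Uint8Array;
  message: Uint8Array;
  signature: Uint8Array;
}

type VerifyFn = (sets: SignatureSet[]) => Promise<boolean>;

interface VerifierBackends {
  // Dispatches the job to a pool of worker threads.
  viaWorkerPool: VerifyFn;
  // Calls an async native binding; the work runs on the libuv threadpool.
  viaLibuv: VerifyFn;
}

// Picks the backend once, so call sites await the same function type either way.
function createVerifier(poolType: BlsPoolType, backends: VerifierBackends): VerifyFn {
  switch (poolType) {
    case "workers":
      return backends.viaWorkerPool;
    case "libuv":
      return backends.viaLibuv;
  }
}

// Stand-in backends that only record which path was taken.
const calls: string[] = [];
const backends: VerifierBackends = {
  viaWorkerPool: async () => {
    calls.push("workers");
    return true;
  },
  viaLibuv: async () => {
    calls.push("libuv");
    return true;
  },
};

const verify = createVerifier("libuv", backends);
```

Resolving the choice once at construction time keeps the calling code identical for both pools, which makes it easier to compare the two in a deployment.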


codecov bot commented Jan 28, 2024

Codecov Report

Merging #6362 (4c07c34) into unstable (e9a3f07) will decrease coverage by 1.57%.
Report is 1 commits behind head on unstable.
The diff coverage is n/a.

❗ Current head 4c07c34 differs from pull request most recent head 0a5c670. Consider uploading reports for the commit 0a5c670 to get more accurate results

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #6362      +/-   ##
============================================
- Coverage     61.72%   60.15%   -1.57%     
============================================
  Files           555      407     -148     
  Lines         58204    46523   -11681     
  Branches       1839     1551     -288     
============================================
- Hits          35925    27986    -7939     
+ Misses        22240    18505    -3735     
+ Partials         39       32       -7     
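The percentages in the report can be reproduced from the Hits and Lines rows. The sketch below assumes coverage is hits divided by lines, truncated (not rounded) to two decimals; the truncation is an assumption chosen because it matches the figures above, and `coveragePercent` is a name made up for this example.

```typescript
// Coverage = hits / lines, expressed as a percentage.
// Assumption: the percentage is truncated (not rounded) to two decimals,
// which is consistent with the figures in the report above.
function coveragePercent(hits: number, lines: number): number {
  return Math.floor((hits / lines) * 10000) / 100;
}

const base = coveragePercent(35925, 58204); // unstable
const head = coveragePercent(27986, 46523); // #6362
const delta = (head - base).toFixed(2); // change in percentage points
```

With the numbers from the table, `base` is 61.72, `head` is 60.15, and `delta` is "-1.57".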

@matthewkeil matthewkeil requested a review from twoeths January 28, 2024 13:20
Contributor

github-actions bot commented Jan 28, 2024

Performance Report

✔️ no performance regression detected

🚀🚀 Significant benchmark improvement detected

Benchmark suite Current: b5dacd0 Previous: e9a3f07 Ratio
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 48.890 us/op 176.03 us/op 0.28
Set add up to 256 items then delete middle 7.2476 us/op 21.927 us/op 0.33
Full benchmark results
Benchmark suite Current: b5dacd0 Previous: e9a3f07 Ratio
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 606.29 us/op 909.23 us/op 0.67
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 48.890 us/op 176.03 us/op 0.28
BLS verify - blst-native 1.1069 ms/op 1.4517 ms/op 0.76
BLS verifyMultipleSignatures 3 - blst-native 2.1451 ms/op 3.1202 ms/op 0.69
BLS verifyMultipleSignatures 8 - blst-native 4.6419 ms/op 6.4448 ms/op 0.72
BLS verifyMultipleSignatures 32 - blst-native 16.878 ms/op 23.893 ms/op 0.71
BLS verifyMultipleSignatures 64 - blst-native 33.207 ms/op 46.184 ms/op 0.72
BLS verifyMultipleSignatures 128 - blst-native 65.772 ms/op 92.869 ms/op 0.71
BLS deserializing 10000 signatures 779.82 ms/op 965.83 ms/op 0.81
BLS deserializing 100000 signatures 7.9253 s/op 9.5495 s/op 0.83
BLS verifyMultipleSignatures - same message - 3 - blst-native 1.1374 ms/op 1.4762 ms/op 0.77
BLS verifyMultipleSignatures - same message - 8 - blst-native 1.2884 ms/op 1.6473 ms/op 0.78
BLS verifyMultipleSignatures - same message - 32 - blst-native 2.0261 ms/op 2.5651 ms/op 0.79
BLS verifyMultipleSignatures - same message - 64 - blst-native 3.0153 ms/op 3.6115 ms/op 0.83
BLS verifyMultipleSignatures - same message - 128 - blst-native 4.9855 ms/op 6.1982 ms/op 0.80
BLS aggregatePubkeys 32 - blst-native 24.711 us/op 28.007 us/op 0.88
BLS aggregatePubkeys 128 - blst-native 95.664 us/op 110.23 us/op 0.87
notSeenSlots=1 numMissedVotes=1 numBadVotes=10 41.686 ms/op 64.292 ms/op 0.65
notSeenSlots=1 numMissedVotes=0 numBadVotes=4 46.349 ms/op 66.163 ms/op 0.70
notSeenSlots=2 numMissedVotes=1 numBadVotes=10 27.272 ms/op 51.816 ms/op 0.53
getSlashingsAndExits - default max 211.86 us/op 277.56 us/op 0.76
getSlashingsAndExits - 2k 594.19 us/op 719.97 us/op 0.83
proposeBlockBody type=full, size=empty 4.5253 ms/op 6.6901 ms/op 0.68
isKnown best case - 1 super set check 343.00 ns/op 608.00 ns/op 0.56
isKnown normal case - 2 super set checks 378.00 ns/op 647.00 ns/op 0.58
isKnown worse case - 16 super set checks 353.00 ns/op 641.00 ns/op 0.55
CheckpointStateCache - add get delete 4.8970 us/op 6.6130 us/op 0.74
validate api signedAggregateAndProof - struct 2.2396 ms/op 3.0467 ms/op 0.74
validate gossip signedAggregateAndProof - struct 2.2812 ms/op 3.0523 ms/op 0.75
validate gossip attestation - vc 640000 1.2241 ms/op 1.4803 ms/op 0.83
batch validate gossip attestation - vc 640000 - chunk 32 171.50 us/op 190.77 us/op 0.90
batch validate gossip attestation - vc 640000 - chunk 64 146.87 us/op 167.52 us/op 0.88
batch validate gossip attestation - vc 640000 - chunk 128 148.40 us/op 166.57 us/op 0.89
batch validate gossip attestation - vc 640000 - chunk 256 141.67 us/op 150.91 us/op 0.94
pickEth1Vote - no votes 1.1001 ms/op 1.3762 ms/op 0.80
pickEth1Vote - max votes 17.184 ms/op 12.587 ms/op 1.37
pickEth1Vote - Eth1Data hashTreeRoot value x2048 15.125 ms/op 22.743 ms/op 0.67
pickEth1Vote - Eth1Data hashTreeRoot tree x2048 21.743 ms/op 29.622 ms/op 0.73
pickEth1Vote - Eth1Data fastSerialize value x2048 528.98 us/op 791.90 us/op 0.67
pickEth1Vote - Eth1Data fastSerialize tree x2048 5.1425 ms/op 5.9118 ms/op 0.87
bytes32 toHexString 797.00 ns/op 749.00 ns/op 1.06
bytes32 Buffer.toString(hex) 365.00 ns/op 348.00 ns/op 1.05
bytes32 Buffer.toString(hex) from Uint8Array 614.00 ns/op 578.00 ns/op 1.06
bytes32 Buffer.toString(hex) + 0x 346.00 ns/op 319.00 ns/op 1.08
Object access 1 prop 0.28300 ns/op 0.24000 ns/op 1.18
Map access 1 prop 0.21000 ns/op 0.15700 ns/op 1.34
Object get x1000 5.3640 ns/op 7.7110 ns/op 0.70
Map get x1000 0.91900 ns/op 0.88600 ns/op 1.04
Object set x1000 54.852 ns/op 60.314 ns/op 0.91
Map set x1000 21.883 ns/op 49.842 ns/op 0.44
Return object 10000 times 0.24750 ns/op 0.25410 ns/op 0.97
Throw Error 10000 times 2.9486 us/op 4.1639 us/op 0.71
fastMsgIdFn sha256 / 200 bytes 2.2830 us/op 3.5780 us/op 0.64
fastMsgIdFn h32 xxhash / 200 bytes 400.00 ns/op 374.00 ns/op 1.07
fastMsgIdFn h64 xxhash / 200 bytes 429.00 ns/op 425.00 ns/op 1.01
fastMsgIdFn sha256 / 1000 bytes 6.6860 us/op 12.372 us/op 0.54
fastMsgIdFn h32 xxhash / 1000 bytes 542.00 ns/op 502.00 ns/op 1.08
fastMsgIdFn h64 xxhash / 1000 bytes 534.00 ns/op 495.00 ns/op 1.08
fastMsgIdFn sha256 / 10000 bytes 59.795 us/op 107.18 us/op 0.56
fastMsgIdFn h32 xxhash / 10000 bytes 2.0790 us/op 2.1650 us/op 0.96
fastMsgIdFn h64 xxhash / 10000 bytes 1.3790 us/op 1.5140 us/op 0.91
send data - 1000 256B messages 17.851 ms/op 23.393 ms/op 0.76
send data - 1000 512B messages 16.222 ms/op 33.827 ms/op 0.48
send data - 1000 1024B messages 30.033 ms/op 48.911 ms/op 0.61
send data - 1000 1200B messages 24.977 ms/op 49.969 ms/op 0.50
send data - 1000 2048B messages 44.086 ms/op 60.775 ms/op 0.73
send data - 1000 4096B messages 41.577 ms/op 54.334 ms/op 0.77
send data - 1000 16384B messages 93.478 ms/op 134.98 ms/op 0.69
send data - 1000 65536B messages 423.78 ms/op 499.19 ms/op 0.85
enrSubnets - fastDeserialize 64 bits 945.00 ns/op 1.5350 us/op 0.62
enrSubnets - ssz BitVector 64 bits 429.00 ns/op 554.00 ns/op 0.77
enrSubnets - fastDeserialize 4 bits 200.00 ns/op 222.00 ns/op 0.90
enrSubnets - ssz BitVector 4 bits 428.00 ns/op 521.00 ns/op 0.82
prioritizePeers score -10:0 att 32-0.1 sync 2-0 71.246 us/op 126.25 us/op 0.56
prioritizePeers score 0:0 att 32-0.25 sync 2-0.25 80.790 us/op 150.42 us/op 0.54
prioritizePeers score 0:0 att 32-0.5 sync 2-0.5 104.88 us/op 213.34 us/op 0.49
prioritizePeers score 0:0 att 64-0.75 sync 4-0.75 180.56 us/op 365.44 us/op 0.49
prioritizePeers score 0:0 att 64-1 sync 4-1 205.33 us/op 419.65 us/op 0.49
array of 16000 items push then shift 1.2785 us/op 1.8476 us/op 0.69
LinkedList of 16000 items push then shift 5.9300 ns/op 9.9920 ns/op 0.59
array of 16000 items push then pop 68.573 ns/op 119.53 ns/op 0.57
LinkedList of 16000 items push then pop 5.7050 ns/op 9.5560 ns/op 0.60
array of 24000 items push then shift 1.8911 us/op 2.7184 us/op 0.70
LinkedList of 24000 items push then shift 6.0940 ns/op 9.5410 ns/op 0.64
array of 24000 items push then pop 84.991 ns/op 158.79 ns/op 0.54
LinkedList of 24000 items push then pop 5.8880 ns/op 9.1590 ns/op 0.64
intersect bitArray bitLen 8 5.4400 ns/op 6.1740 ns/op 0.88
intersect array and set length 8 49.789 ns/op 85.586 ns/op 0.58
intersect bitArray bitLen 128 29.543 ns/op 37.211 ns/op 0.79
intersect array and set length 128 709.52 ns/op 1.1068 us/op 0.64
bitArray.getTrueBitIndexes() bitLen 128 1.2180 us/op 1.6530 us/op 0.74
bitArray.getTrueBitIndexes() bitLen 248 2.0180 us/op 3.1290 us/op 0.64
bitArray.getTrueBitIndexes() bitLen 512 3.8770 us/op 6.5890 us/op 0.59
Buffer.concat 32 items 951.00 ns/op 1.1300 us/op 0.84
Uint8Array.set 32 items 2.0690 us/op 2.1200 us/op 0.98
Set add up to 64 items then delete first 2.2221 us/op 5.1507 us/op 0.43
OrderedSet add up to 64 items then delete first 2.8819 us/op 6.4288 us/op 0.45
Set add up to 64 items then delete last 1.9858 us/op 5.4642 us/op 0.36
OrderedSet add up to 64 items then delete last 2.9584 us/op 7.0399 us/op 0.42
Set add up to 64 items then delete middle 1.9563 us/op 5.5564 us/op 0.35
OrderedSet add up to 64 items then delete middle 4.2039 us/op 8.3533 us/op 0.50
Set add up to 128 items then delete first 3.8353 us/op 10.679 us/op 0.36
OrderedSet add up to 128 items then delete first 6.0710 us/op 15.074 us/op 0.40
Set add up to 128 items then delete last 3.6845 us/op 9.9780 us/op 0.37
OrderedSet add up to 128 items then delete last 5.5967 us/op 12.264 us/op 0.46
Set add up to 128 items then delete middle 3.6319 us/op 10.612 us/op 0.34
OrderedSet add up to 128 items then delete middle 10.396 us/op 18.836 us/op 0.55
Set add up to 256 items then delete first 7.5370 us/op 21.874 us/op 0.34
OrderedSet add up to 256 items then delete first 12.294 us/op 31.049 us/op 0.40
Set add up to 256 items then delete last 7.3350 us/op 21.717 us/op 0.34
OrderedSet add up to 256 items then delete last 11.595 us/op 29.498 us/op 0.39
Set add up to 256 items then delete middle 7.2476 us/op 21.927 us/op 0.33
OrderedSet add up to 256 items then delete middle 30.048 us/op 54.350 us/op 0.55
transfer serialized Status (84 B) 1.4540 us/op 2.0850 us/op 0.70
copy serialized Status (84 B) 1.1540 us/op 1.5710 us/op 0.73
transfer serialized SignedVoluntaryExit (112 B) 1.5260 us/op 2.1800 us/op 0.70
copy serialized SignedVoluntaryExit (112 B) 1.1840 us/op 1.6280 us/op 0.73
transfer serialized ProposerSlashing (416 B) 1.7220 us/op 2.9100 us/op 0.59
copy serialized ProposerSlashing (416 B) 1.5080 us/op 2.3870 us/op 0.63
transfer serialized Attestation (485 B) 1.7600 us/op 3.0320 us/op 0.58
copy serialized Attestation (485 B) 2.1330 us/op 2.7700 us/op 0.77
transfer serialized AttesterSlashing (33232 B) 2.5080 us/op 2.6820 us/op 0.94
copy serialized AttesterSlashing (33232 B) 7.8460 us/op 10.100 us/op 0.78
transfer serialized Small SignedBeaconBlock (128000 B) 1.9810 us/op 3.3350 us/op 0.59
copy serialized Small SignedBeaconBlock (128000 B) 14.522 us/op 35.557 us/op 0.41
transfer serialized Avg SignedBeaconBlock (200000 B) 1.9700 us/op 3.9050 us/op 0.50
copy serialized Avg SignedBeaconBlock (200000 B) 26.819 us/op 56.749 us/op 0.47
transfer serialized BlobsSidecar (524380 B) 2.3080 us/op 5.2060 us/op 0.44
copy serialized BlobsSidecar (524380 B) 76.575 us/op 157.04 us/op 0.49
transfer serialized Big SignedBeaconBlock (1000000 B) 2.5070 us/op 5.6050 us/op 0.45
copy serialized Big SignedBeaconBlock (1000000 B) 201.26 us/op 266.79 us/op 0.75
pass gossip attestations to forkchoice per slot 2.8464 ms/op 4.4621 ms/op 0.64
forkChoice updateHead vc 100000 bc 64 eq 0 446.30 us/op 857.92 us/op 0.52
forkChoice updateHead vc 600000 bc 64 eq 0 3.1029 ms/op 6.4958 ms/op 0.48
forkChoice updateHead vc 1000000 bc 64 eq 0 4.5951 ms/op 9.9555 ms/op 0.46
forkChoice updateHead vc 600000 bc 320 eq 0 2.7178 ms/op 5.5103 ms/op 0.49
forkChoice updateHead vc 600000 bc 1200 eq 0 2.7840 ms/op 4.9572 ms/op 0.56
forkChoice updateHead vc 600000 bc 7200 eq 0 3.7199 ms/op 7.0997 ms/op 0.52
forkChoice updateHead vc 600000 bc 64 eq 1000 9.7111 ms/op 12.016 ms/op 0.81
forkChoice updateHead vc 600000 bc 64 eq 10000 9.5623 ms/op 12.652 ms/op 0.76
forkChoice updateHead vc 600000 bc 64 eq 300000 12.419 ms/op 19.217 ms/op 0.65
computeDeltas 500000 validators 300 proto nodes 3.0440 ms/op 7.6705 ms/op 0.40
computeDeltas 500000 validators 1200 proto nodes 2.9232 ms/op 7.7129 ms/op 0.38
computeDeltas 500000 validators 7200 proto nodes 2.9277 ms/op 7.7964 ms/op 0.38
computeDeltas 750000 validators 300 proto nodes 4.4411 ms/op 11.200 ms/op 0.40
computeDeltas 750000 validators 1200 proto nodes 4.5072 ms/op 10.992 ms/op 0.41
computeDeltas 750000 validators 7200 proto nodes 4.5223 ms/op 11.321 ms/op 0.40
computeDeltas 1400000 validators 300 proto nodes 8.6306 ms/op 21.191 ms/op 0.41
computeDeltas 1400000 validators 1200 proto nodes 9.0239 ms/op 23.714 ms/op 0.38
computeDeltas 1400000 validators 7200 proto nodes 9.2520 ms/op 21.121 ms/op 0.44
computeDeltas 2100000 validators 300 proto nodes 13.610 ms/op 33.867 ms/op 0.40
computeDeltas 2100000 validators 1200 proto nodes 13.983 ms/op 32.824 ms/op 0.43
computeDeltas 2100000 validators 7200 proto nodes 14.996 ms/op 32.033 ms/op 0.47
altair processAttestation - 250000 vs - 7PWei normalcase 2.3727 ms/op 3.5696 ms/op 0.66
altair processAttestation - 250000 vs - 7PWei worstcase 3.4310 ms/op 4.6604 ms/op 0.74
altair processAttestation - setStatus - 1/6 committees join 104.42 us/op 197.61 us/op 0.53
altair processAttestation - setStatus - 1/3 committees join 204.31 us/op 432.89 us/op 0.47
altair processAttestation - setStatus - 1/2 committees join 299.30 us/op 590.04 us/op 0.51
altair processAttestation - setStatus - 2/3 committees join 401.93 us/op 679.38 us/op 0.59
altair processAttestation - setStatus - 4/5 committees join 527.95 us/op 917.64 us/op 0.58
altair processAttestation - setStatus - 100% committees join 605.08 us/op 1.1655 ms/op 0.52
altair processBlock - 250000 vs - 7PWei normalcase 8.9464 ms/op 11.978 ms/op 0.75
altair processBlock - 250000 vs - 7PWei normalcase hashState 34.752 ms/op 40.631 ms/op 0.86
altair processBlock - 250000 vs - 7PWei worstcase 34.105 ms/op 51.440 ms/op 0.66
altair processBlock - 250000 vs - 7PWei worstcase hashState 85.500 ms/op 123.08 ms/op 0.69
phase0 processBlock - 250000 vs - 7PWei normalcase 2.7137 ms/op 3.5723 ms/op 0.76
phase0 processBlock - 250000 vs - 7PWei worstcase 25.105 ms/op 41.535 ms/op 0.60
altair processEth1Data - 250000 vs - 7PWei normalcase 361.92 us/op 1.0308 ms/op 0.35
getExpectedWithdrawals 250000 eb:1,eth1:1,we:0,wn:0,smpl:15 8.4260 us/op 18.811 us/op 0.45
getExpectedWithdrawals 250000 eb:0.95,eth1:0.1,we:0.05,wn:0,smpl:219 59.165 us/op 72.328 us/op 0.82
getExpectedWithdrawals 250000 eb:0.95,eth1:0.3,we:0.05,wn:0,smpl:42 10.909 us/op 31.929 us/op 0.34
getExpectedWithdrawals 250000 eb:0.95,eth1:0.7,we:0.05,wn:0,smpl:18 7.6090 us/op 17.091 us/op 0.45
getExpectedWithdrawals 250000 eb:0.1,eth1:0.1,we:0,wn:0,smpl:1020 183.19 us/op 305.01 us/op 0.60
getExpectedWithdrawals 250000 eb:0.03,eth1:0.03,we:0,wn:0,smpl:11777 1.0830 ms/op 1.9383 ms/op 0.56
getExpectedWithdrawals 250000 eb:0.01,eth1:0.01,we:0,wn:0,smpl:16384 1.3429 ms/op 2.5102 ms/op 0.53
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,smpl:16384 1.1874 ms/op 2.1626 ms/op 0.55
getExpectedWithdrawals 250000 eb:0,eth1:0,we:0,wn:0,nocache,smpl:16384 2.9299 ms/op 4.8736 ms/op 0.60
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,smpl:16384 1.8463 ms/op 3.4106 ms/op 0.54
getExpectedWithdrawals 250000 eb:0,eth1:1,we:0,wn:0,nocache,smpl:16384 4.9762 ms/op 8.4965 ms/op 0.59
Tree 40 250000 create 289.47 ms/op 825.72 ms/op 0.35
Tree 40 250000 get(125000) 134.24 ns/op 237.56 ns/op 0.57
Tree 40 250000 set(125000) 913.58 ns/op 2.6310 us/op 0.35
Tree 40 250000 toArray() 25.791 ms/op 27.695 ms/op 0.93
Tree 40 250000 iterate all - toArray() + loop 24.943 ms/op 28.573 ms/op 0.87
Tree 40 250000 iterate all - get(i) 43.992 ms/op 85.780 ms/op 0.51
MutableVector 250000 create 13.552 ms/op 20.596 ms/op 0.66
MutableVector 250000 get(125000) 6.3430 ns/op 7.1270 ns/op 0.89
MutableVector 250000 set(125000) 219.08 ns/op 481.30 ns/op 0.46
MutableVector 250000 toArray() 3.2874 ms/op 4.5961 ms/op 0.72
MutableVector 250000 iterate all - toArray() + loop 2.6511 ms/op 4.9020 ms/op 0.54
MutableVector 250000 iterate all - get(i) 1.3721 ms/op 1.7877 ms/op 0.77
Array 250000 create 2.1859 ms/op 4.2794 ms/op 0.51
Array 250000 clone - spread 1.1590 ms/op 1.9074 ms/op 0.61
Array 250000 get(125000) 1.0580 ns/op 2.5730 ns/op 0.41
Array 250000 set(125000) 1.2650 ns/op 6.1770 ns/op 0.20
Array 250000 iterate all - loop 154.81 us/op 194.58 us/op 0.80
effectiveBalanceIncrements clone Uint8Array 300000 14.201 us/op 77.572 us/op 0.18
effectiveBalanceIncrements clone MutableVector 300000 427.00 ns/op 500.00 ns/op 0.85
effectiveBalanceIncrements rw all Uint8Array 300000 187.89 us/op 226.13 us/op 0.83
effectiveBalanceIncrements rw all MutableVector 300000 71.149 ms/op 215.42 ms/op 0.33
phase0 afterProcessEpoch - 250000 vs - 7PWei 78.770 ms/op 135.44 ms/op 0.58
phase0 beforeProcessEpoch - 250000 vs - 7PWei 42.851 ms/op 70.349 ms/op 0.61
altair processEpoch - mainnet_e81889 492.06 ms/op 656.01 ms/op 0.75
mainnet_e81889 - altair beforeProcessEpoch 90.342 ms/op 118.40 ms/op 0.76
mainnet_e81889 - altair processJustificationAndFinalization 18.163 us/op 31.417 us/op 0.58
mainnet_e81889 - altair processInactivityUpdates 4.6709 ms/op 10.094 ms/op 0.46
mainnet_e81889 - altair processRewardsAndPenalties 57.477 ms/op 87.516 ms/op 0.66
mainnet_e81889 - altair processRegistryUpdates 4.2470 us/op 5.6340 us/op 0.75
mainnet_e81889 - altair processSlashings 857.00 ns/op 1.4240 us/op 0.60
mainnet_e81889 - altair processEth1DataReset 1.1280 us/op 1.1130 us/op 1.01
mainnet_e81889 - altair processEffectiveBalanceUpdates 1.1073 ms/op 2.1467 ms/op 0.52
mainnet_e81889 - altair processSlashingsReset 5.8860 us/op 7.8000 us/op 0.75
mainnet_e81889 - altair processRandaoMixesReset 9.0710 us/op 10.038 us/op 0.90
mainnet_e81889 - altair processHistoricalRootsUpdate 1.0040 us/op 2.7170 us/op 0.37
mainnet_e81889 - altair processParticipationFlagUpdates 2.2730 us/op 4.2580 us/op 0.53
mainnet_e81889 - altair processSyncCommitteeUpdates 837.00 ns/op 1.5170 us/op 0.55
mainnet_e81889 - altair afterProcessEpoch 85.373 ms/op 149.70 ms/op 0.57
capella processEpoch - mainnet_e217614 2.0536 s/op 2.8465 s/op 0.72
mainnet_e217614 - capella beforeProcessEpoch 556.94 ms/op 594.95 ms/op 0.94
mainnet_e217614 - capella processJustificationAndFinalization 20.621 us/op 22.023 us/op 0.94
mainnet_e217614 - capella processInactivityUpdates 34.239 ms/op 26.520 ms/op 1.29
mainnet_e217614 - capella processRewardsAndPenalties 519.14 ms/op 499.88 ms/op 1.04
mainnet_e217614 - capella processRegistryUpdates 24.377 us/op 39.952 us/op 0.61
mainnet_e217614 - capella processSlashings 1.0750 us/op 1.5590 us/op 0.69
mainnet_e217614 - capella processEth1DataReset 688.00 ns/op 849.00 ns/op 0.81
mainnet_e217614 - capella processEffectiveBalanceUpdates 6.2323 ms/op 6.5691 ms/op 0.95
mainnet_e217614 - capella processSlashingsReset 4.7860 us/op 6.5560 us/op 0.73
mainnet_e217614 - capella processRandaoMixesReset 7.3890 us/op 9.9950 us/op 0.74
mainnet_e217614 - capella processHistoricalRootsUpdate 681.00 ns/op 1.5580 us/op 0.44
mainnet_e217614 - capella processParticipationFlagUpdates 4.0760 us/op 3.5570 us/op 1.15
mainnet_e217614 - capella afterProcessEpoch 260.85 ms/op 346.44 ms/op 0.75
phase0 processEpoch - mainnet_e58758 639.89 ms/op 589.26 ms/op 1.09
mainnet_e58758 - phase0 beforeProcessEpoch 211.43 ms/op 194.03 ms/op 1.09
mainnet_e58758 - phase0 processJustificationAndFinalization 31.910 us/op 29.341 us/op 1.09
mainnet_e58758 - phase0 processRewardsAndPenalties 59.742 ms/op 50.575 ms/op 1.18
mainnet_e58758 - phase0 processRegistryUpdates 14.315 us/op 20.884 us/op 0.69
mainnet_e58758 - phase0 processSlashings 967.00 ns/op 937.00 ns/op 1.03
mainnet_e58758 - phase0 processEth1DataReset 605.00 ns/op 1.0290 us/op 0.59
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 851.79 us/op 1.6909 ms/op 0.50
mainnet_e58758 - phase0 processSlashingsReset 5.7260 us/op 5.1990 us/op 1.10
mainnet_e58758 - phase0 processRandaoMixesReset 2.9320 us/op 9.1280 us/op 0.32
mainnet_e58758 - phase0 processHistoricalRootsUpdate 801.00 ns/op 1.2100 us/op 0.66
mainnet_e58758 - phase0 processParticipationRecordUpdates 4.2830 us/op 10.924 us/op 0.39
mainnet_e58758 - phase0 afterProcessEpoch 71.333 ms/op 111.41 ms/op 0.64
phase0 processEffectiveBalanceUpdates - 250000 normalcase 1.0373 ms/op 1.8743 ms/op 0.55
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.1572 ms/op 1.7940 ms/op 0.65
altair processInactivityUpdates - 250000 normalcase 27.837 ms/op 38.670 ms/op 0.72
altair processInactivityUpdates - 250000 worstcase 25.359 ms/op 36.643 ms/op 0.69
phase0 processRegistryUpdates - 250000 normalcase 8.0700 us/op 14.716 us/op 0.55
phase0 processRegistryUpdates - 250000 badcase_full_deposits 500.40 us/op 667.65 us/op 0.75
phase0 processRegistryUpdates - 250000 worstcase 0.5 143.62 ms/op 215.66 ms/op 0.67
altair processRewardsAndPenalties - 250000 normalcase 53.950 ms/op 75.289 ms/op 0.72
altair processRewardsAndPenalties - 250000 worstcase 60.223 ms/op 89.431 ms/op 0.67
phase0 getAttestationDeltas - 250000 normalcase 7.9302 ms/op 13.079 ms/op 0.61
phase0 getAttestationDeltas - 250000 worstcase 6.6845 ms/op 12.186 ms/op 0.55
phase0 processSlashings - 250000 worstcase 92.422 us/op 112.16 us/op 0.82
altair processSyncCommitteeUpdates - 250000 99.261 ms/op 190.95 ms/op 0.52
BeaconState.hashTreeRoot - No change 582.00 ns/op 791.00 ns/op 0.74
BeaconState.hashTreeRoot - 1 full validator 143.43 us/op 183.88 us/op 0.78
BeaconState.hashTreeRoot - 32 full validator 1.3436 ms/op 1.8833 ms/op 0.71
BeaconState.hashTreeRoot - 512 full validator 13.423 ms/op 22.440 ms/op 0.60
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 168.25 us/op 215.96 us/op 0.78
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 2.3693 ms/op 2.9500 ms/op 0.80
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 26.862 ms/op 32.372 ms/op 0.83
BeaconState.hashTreeRoot - 1 balances 114.19 us/op 153.18 us/op 0.75
BeaconState.hashTreeRoot - 32 balances 1.0076 ms/op 1.5809 ms/op 0.64
BeaconState.hashTreeRoot - 512 balances 10.806 ms/op 14.289 ms/op 0.76
BeaconState.hashTreeRoot - 250000 balances 190.66 ms/op 246.21 ms/op 0.77
aggregationBits - 2048 els - zipIndexesInBitList 24.162 us/op 24.722 us/op 0.98
byteArrayEquals 32 64.294 ns/op 83.157 ns/op 0.77
Buffer.compare 32 37.447 ns/op 62.619 ns/op 0.60
byteArrayEquals 1024 1.7226 us/op 2.2774 us/op 0.76
Buffer.compare 1024 42.878 ns/op 78.280 ns/op 0.55
byteArrayEquals 16384 27.497 us/op 38.385 us/op 0.72
Buffer.compare 16384 232.68 ns/op 292.68 ns/op 0.80
byteArrayEquals 123687377 208.99 ms/op 269.92 ms/op 0.77
Buffer.compare 123687377 8.1136 ms/op 7.5315 ms/op 1.08
byteArrayEquals 32 - diff last byte 70.145 ns/op 78.045 ns/op 0.90
Buffer.compare 32 - diff last byte 40.347 ns/op 61.319 ns/op 0.66
byteArrayEquals 1024 - diff last byte 1.7775 us/op 2.1967 us/op 0.81
Buffer.compare 1024 - diff last byte 46.903 ns/op 80.672 ns/op 0.58
byteArrayEquals 16384 - diff last byte 27.500 us/op 36.444 us/op 0.75
Buffer.compare 16384 - diff last byte 219.58 ns/op 284.54 ns/op 0.77
byteArrayEquals 123687377 - diff last byte 228.94 ms/op 267.23 ms/op 0.86
Buffer.compare 123687377 - diff last byte 7.9645 ms/op 8.5816 ms/op 0.93
byteArrayEquals 32 - random bytes 6.1510 ns/op 6.2170 ns/op 0.99
Buffer.compare 32 - random bytes 40.681 ns/op 67.467 ns/op 0.60
byteArrayEquals 1024 - random bytes 6.1220 ns/op 6.4710 ns/op 0.95
Buffer.compare 1024 - random bytes 39.040 ns/op 69.105 ns/op 0.56
byteArrayEquals 16384 - random bytes 6.1030 ns/op 6.6690 ns/op 0.92
Buffer.compare 16384 - random bytes 38.901 ns/op 71.251 ns/op 0.55
byteArrayEquals 123687377 - random bytes 22.700 ns/op 11.310 ns/op 2.01
Buffer.compare 123687377 - random bytes 55.320 ns/op 83.640 ns/op 0.66
regular array get 100000 times 47.664 us/op 55.872 us/op 0.85
wrappedArray get 100000 times 44.436 us/op 53.157 us/op 0.84
arrayWithProxy get 100000 times 11.455 ms/op 16.169 ms/op 0.71
ssz.Root.equals 63.015 ns/op 60.413 ns/op 1.04
byteArrayEquals 61.625 ns/op 58.454 ns/op 1.05
Buffer.compare 11.605 ns/op 12.766 ns/op 0.91
shuffle list - 16384 els 5.1972 ms/op 7.8010 ms/op 0.67
shuffle list - 250000 els 75.285 ms/op 119.74 ms/op 0.63
processSlot - 1 slots 16.849 us/op 20.431 us/op 0.82
processSlot - 32 slots 3.0482 ms/op 4.2163 ms/op 0.72
getEffectiveBalanceIncrementsZeroInactive - 250000 vs - 7PWei 60.359 ms/op 73.941 ms/op 0.82
getCommitteeAssignments - req 1 vs - 250000 vc 2.2895 ms/op 2.9873 ms/op 0.77
getCommitteeAssignments - req 100 vs - 250000 vc 3.4365 ms/op 4.3612 ms/op 0.79
getCommitteeAssignments - req 1000 vs - 250000 vc 3.7186 ms/op 4.9161 ms/op 0.76
findModifiedValidators - 10000 modified validators 448.97 ms/op 707.72 ms/op 0.63
findModifiedValidators - 1000 modified validators 354.75 ms/op 581.54 ms/op 0.61
findModifiedValidators - 100 modified validators 339.20 ms/op 527.96 ms/op 0.64
findModifiedValidators - 10 modified validators 349.69 ms/op 524.89 ms/op 0.67
findModifiedValidators - 1 modified validators 332.22 ms/op 492.73 ms/op 0.67
findModifiedValidators - no difference 337.20 ms/op 521.52 ms/op 0.65
compare ViewDUs 4.3786 s/op 5.4488 s/op 0.80
compare each validator Uint8Array 1.8333 s/op 2.2079 s/op 0.83
compare ViewDU to Uint8Array 1.3247 s/op 1.5405 s/op 0.86
migrate state 1000000 validators, 24 modified, 0 new 803.12 ms/op 1.0034 s/op 0.80
migrate state 1000000 validators, 1700 modified, 1000 new 1.0048 s/op 1.5094 s/op 0.67
migrate state 1000000 validators, 3400 modified, 2000 new 1.2307 s/op 1.7061 s/op 0.72
migrate state 1500000 validators, 24 modified, 0 new 710.88 ms/op 945.28 ms/op 0.75
migrate state 1500000 validators, 1700 modified, 1000 new 839.86 ms/op 1.4068 s/op 0.60
migrate state 1500000 validators, 3400 modified, 2000 new 1.1872 s/op 1.5829 s/op 0.75
RootCache.getBlockRootAtSlot - 250000 vs - 7PWei 4.6400 ns/op 5.6000 ns/op 0.83
state getBlockRootAtSlot - 250000 vs - 7PWei 974.59 ns/op 801.24 ns/op 1.22
computeProposers - vc 250000 6.9701 ms/op 11.073 ms/op 0.63
computeEpochShuffling - vc 250000 69.345 ms/op 115.78 ms/op 0.60
getNextSyncCommittee - vc 250000 110.56 ms/op 198.24 ms/op 0.56
computeSigningRoot for AttestationData 24.834 us/op 32.855 us/op 0.76
hash AttestationData serialized data then Buffer.toString(base64) 1.2574 us/op 2.5981 us/op 0.48
toHexString serialized data 831.78 ns/op 1.4861 us/op 0.56
Buffer.toString(base64) 159.96 ns/op 310.45 ns/op 0.52
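The Ratio column is consistent with current divided by previous, so values below 1 mean the current commit is faster. Some rows mix units (for example us/op against ms/op), so both readings need to be normalized first. A minimal sketch, with helper names (`toNanoseconds`, `ratio`) invented for this example:

```typescript
// Unit factors for converting a benchmark reading to nanoseconds per op.
const NS_PER_UNIT: Record<string, number> = { ns: 1, us: 1e3, ms: 1e6, s: 1e9 };

// Parses readings such as "605.08 us/op" into nanoseconds.
function toNanoseconds(reading: string): number {
  const match = /^([\d.]+)\s*(ns|us|ms|s)\/op$/.exec(reading.trim());
  if (match === null) throw new Error(`Unrecognized reading: ${reading}`);
  return Number(match[1]) * NS_PER_UNIT[match[2]];
}

// Ratio = current / previous, formatted to two decimals like the table.
function ratio(current: string, previous: string): string {
  return (toNanoseconds(current) / toNanoseconds(previous)).toFixed(2);
}

// Rows taken from the table above.
const improved = ratio("48.890 us/op", "176.03 us/op"); // getPubkeys - validatorsArr
const mixedUnits = ratio("605.08 us/op", "1.1655 ms/op"); // setStatus - 100% committees join
const slower = ratio("17.184 ms/op", "12.587 ms/op"); // pickEth1Vote - max votes
```

These reproduce the reported ratios of 0.28, 0.52 and 1.37 respectively.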

by benchmarkbot/action

@twoeths
Contributor

twoeths commented Feb 1, 2024

Gossip block "till received" is somehow 150ms later than on the unstable and stable mainnet nodes.

Screenshot 2024-02-01 at 18 19 12

"till received" is created in gossipsub, on the network thread.

@twoeths
Contributor

twoeths commented Feb 2, 2024

Also, RSS is ~2GB higher than on unstable.

Screenshot 2024-02-02 at 09 25 10

@matthewkeil
Member Author

matthewkeil commented Feb 8, 2024

I am comparing against stable because unstable has a performance regression; I am not sure whether that has been resolved yet.

All metrics cover a 12-hour window with a 1-hour moving average. In each pair, stable-mainnet is shown first and feat2-mainnet second:

Screenshot 2024-02-08 at 7 49 48 PM

Screenshot 2024-02-08 at 7 50 02 PM

Screenshot 2024-02-08 at 7 52 41 PM

Screenshot 2024-02-08 at 7 52 46 PM

Screenshot 2024-02-08 at 7 45 18 PM

Screenshot 2024-02-08 at 7 45 25 PM

Screenshot 2024-02-08 at 8 01 58 PM

Screenshot 2024-02-08 at 8 01 30 PM

Screenshot 2024-02-08 at 7 51 26 PM

Screenshot 2024-02-08 at 7 51 38 PM

Screenshot 2024-02-08 at 7 46 14 PM

Screenshot 2024-02-08 at 7 47 53 PM

Screenshot 2024-02-08 at 7 46 48 PM

Screenshot 2024-02-08 at 7 47 38 PM

Screenshot 2024-02-08 at 7 50 20 PM

Screenshot 2024-02-08 at 7 50 28 PM

@matthewkeil
Member Author

Some metrics do not look better; they are listed below. I am trying to figure out why memory is higher. I think part of it is that the NAPI objects are larger, but that does not account for the whole increase in RSS. I think the other part is how RSS is reported for worker threads, but I am not sure. I am working on a way to measure overall memory consumption from outside the process (from the kernel's perspective). Will update when I have findings.

The timeline is the same as above: 12 hours with a 1-hour rate() interval, stable first and feat2 second.
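One possible way to read RSS from the kernel's side on Linux is the `VmRSS` field of `/proc/<pid>/status`, which can be read for any PID without instrumenting the process itself. This is only a sketch of that idea, not the method used in this investigation, and the function names are made up for the example. For context, the Node.js documentation notes that `process.memoryUsage().rss` read inside a worker thread is valid for the entire process, not for that thread alone.

```typescript
import { readFileSync } from "node:fs";

// Extracts the VmRSS value (in kB) from the text of /proc/<pid>/status.
// Returns null when the field is absent.
function parseVmRssKb(statusText: string): number | null {
  const match = /^VmRSS:\s+(\d+)\s+kB$/m.exec(statusText);
  return match === null ? null : Number(match[1]);
}

// Linux only: reads the kernel's RSS figure for a process by PID.
function readVmRssKb(pid: number): number | null {
  return parseVmRssKb(readFileSync(`/proc/${pid}/status`, "utf8"));
}
```

Sampling this value for the stable and feat2 processes over the same window would give a comparison that does not depend on how the runtime reports memory.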

Screenshot 2024-02-08 at 8 04 04 PM

Screenshot 2024-02-08 at 8 04 11 PM

Screenshot 2024-02-08 at 8 04 26 PM

Screenshot 2024-02-08 at 8 04 33 PM

@twoeths
Contributor

twoeths commented Feb 19, 2024

Memory on a service deployment is usually higher than on Docker; see ChainSafe/js-libp2p-gossipsub#468 (comment).

I just reviewed the memory over the last 2 days: RSS has been stable and has decreased over time.

Screenshot 2024-02-19 at 11 19 43

@wemeetagain
Member

@matthewkeil matthewkeil deleted the mkeil/test-blst-7 branch June 2, 2024 15:28