-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Discovery: throttle node DB commits (#3581) #3656
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please, Create one more PR from stable branch with cherry-pick of this pr. Thank you
👌 some tests fail, I'm gonna check them first. |
The current solution doesn't work well, because there can be only one pending RW transaction globally. Having a separate open transaction for pings makes other writes impossible. For example, updating the local node info blocks forever. I'd either need to channel all node DB access to this event loop, or make an alternative solution. I'm considering trying something like MDBX_SAFE_NOSYNC instead. |
@battlmonstr MDBX_SAFE_NOSYNC doesn't conceptually change anything - still can't open parallel write txs. FYI, TxPool works next way:
Can you explain one more time - what is current problem with having cache in nodedb? (kick me in Discord if you need) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rejecting until some discussion
@AskAlexSharov the parallel writes are not needed anymore, I have removed that. It was a problem for the previous solution attempt with a goroutine, but I think that SafeNoSync alone solves the issue by and large. The issue was reported in discord a while ago that the initial node sync performance is not great when using a HDD. SafeNoSyncAFAIK, mdbx commit has 2 steps:
By default, mdbx tries to be 100% durable, and each Commit() does both write and fsync. P.S. If there are no new updates for a long time we risk to lose 2 seconds worth of updates. Here this doesn't seem like a big deal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree with everything
a6b4e55
to
9b56cdb
Compare
UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events are causing bursty DB commits (100 per minute). This optimization throttles the disk writes to happen at most once in a few seconds, because this info doesn't need to be persisted immediately. This helps on HDD drives.
UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events are causing bursty DB commits (100 per minute). This optimization throttles the disk writes to happen at most once in a few seconds, because this info doesn't need to be persisted immediately. This helps on HDD drives.
* Update to erigon-lib stable * Discovery: throttle node DB commits (#3581) (#3656) UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events are causing bursty DB commits (100 per minute). This optimization throttles the disk writes to happen at most once in a few seconds, because this info doesn't need to be persisted immediately. This helps on HDD drives. * Update erigon-lib * Discovery: split node records to a sepatate DB table (#3581) (#3667) Problem: QuerySeeds will poke 150 random entries in the whole node DB and ignore hitting "field" entries. In a bootstrap scenario it might hit hundreds of :lastping :lastpong entries, and very few true "node record" entries. After running for 15 minutes I've got totalEntryCount=1508 nodeRecordCount=114 entries. There's a 1/16 chance of hitting a "node record" entry. It means finding just about 10 nodes of 114 total on average from 150 attempts. Solution: Split "node record" entries to a separate table such that QuerySeeds doesn't do idle cycle hits. * Discovery: add Context to Listen. (#3577) Add explicit Context to ListenV4 and ListenV5. This makes it possible to stop listening by an external signal. * Discovery: refactor public key to node ID conversions. (#3634) Encode and hash logic was duplicated in multiple places. * Move encoding to p2p/discover/v4wire * Move hashing to p2p/enode/idscheme * Change newRandomLookup to create a proper random key on a curve. * Discovery: speed up lookup tests (#3677) * Update erigon-lib Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com>
* block by timestamp for stable (erigontech#3617) * Add timings of forward stages to logs (erigontech#3621) * save * save * deleted bor and starknet from doc (erigontech#3627) * add nosqlite tag (erigontech#3653) * add nosqlite tag * save * save (erigontech#3665) * save (erigontech#3663) * linter up (erigontech#3672) (erigontech#3673) * linter up (erigontech#3672) * save * save * Revert node DB cache (erigontech#3581) (erigontech#3674) (erigontech#3675) Revert "Prevent frequent commits to the node DB in sentries (erigontech#2505)". This reverts commit 65a9a26. * [stable] Fixes to discovery nodedb (erigontech#3691) * Update to erigon-lib stable * Discovery: throttle node DB commits (erigontech#3581) (erigontech#3656) UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events are causing bursty DB commits (100 per minute). This optimization throttles the disk writes to happen at most once in a few seconds, because this info doesn't need to be persisted immediately. This helps on HDD drives. * Update erigon-lib * Discovery: split node records to a sepatate DB table (erigontech#3581) (erigontech#3667) Problem: QuerySeeds will poke 150 random entries in the whole node DB and ignore hitting "field" entries. In a bootstrap scenario it might hit hundreds of :lastping :lastpong entries, and very few true "node record" entries. After running for 15 minutes I've got totalEntryCount=1508 nodeRecordCount=114 entries. There's a 1/16 chance of hitting a "node record" entry. It means finding just about 10 nodes of 114 total on average from 150 attempts. Solution: Split "node record" entries to a separate table such that QuerySeeds doesn't do idle cycle hits. * Discovery: add Context to Listen. (erigontech#3577) Add explicit Context to ListenV4 and ListenV5. This makes it possible to stop listening by an external signal. * Discovery: refactor public key to node ID conversions. (erigontech#3634) Encode and hash logic was duplicated in multiple places. * Move encoding to p2p/discover/v4wire * Move hashing to p2p/enode/idscheme * Change newRandomLookup to create a proper random key on a curve. * Discovery: speed up lookup tests (erigontech#3677) * Update erigon-lib Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> * [stable] Fixes for state overrides in RPC (erigontech#3693) * State override support (erigontech#3628) * added stateOverride type * solved import cycle * refactoring * imported wrong package * fixed Call arguments * typo * override for traceCall * Fix eth call (erigontech#3618) * added isFake * using isFake instead of checkNonce * Revert "using isFake instead of checkNonce" This reverts commit 6a202bb. * Revert "added isFake" This reverts commit 2c48024. * only checking EOA if we are checking for Nonce Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> * new bootnodes (erigontech#3591) (erigontech#3695) Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> * Update skip analysis and preverified hashes (erigontech#3700) (erigontech#3704) Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Update version.go (erigontech#3701) * simulate future blocks: timestamp and block number incremented by 1 for calls to trace_call(Many), debug_traceCall, eth_createAccessList * expose UsedGas through trace_call and trace_callMany * expose accessList via trace_call(Many) * plumb error into trace_call(many) outer json response Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> Co-authored-by: ledgerwatch <akhounov@gmail.com> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local>
* block by timestamp for stable (erigontech#3617) * Add timings of forward stages to logs (erigontech#3621) * save * save * deleted bor and starknet from doc (erigontech#3627) * add nosqlite tag (erigontech#3653) * add nosqlite tag * save * save (erigontech#3665) * save (erigontech#3663) * linter up (erigontech#3672) (erigontech#3673) * linter up (erigontech#3672) * save * save * Revert node DB cache (erigontech#3581) (erigontech#3674) (erigontech#3675) Revert "Prevent frequent commits to the node DB in sentries (erigontech#2505)". This reverts commit 65a9a26. * [stable] Fixes to discovery nodedb (erigontech#3691) * Update to erigon-lib stable * Discovery: throttle node DB commits (erigontech#3581) (erigontech#3656) UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events are causing bursty DB commits (100 per minute). This optimization throttles the disk writes to happen at most once in a few seconds, because this info doesn't need to be persisted immediately. This helps on HDD drives. * Update erigon-lib * Discovery: split node records to a sepatate DB table (erigontech#3581) (erigontech#3667) Problem: QuerySeeds will poke 150 random entries in the whole node DB and ignore hitting "field" entries. In a bootstrap scenario it might hit hundreds of :lastping :lastpong entries, and very few true "node record" entries. After running for 15 minutes I've got totalEntryCount=1508 nodeRecordCount=114 entries. There's a 1/16 chance of hitting a "node record" entry. It means finding just about 10 nodes of 114 total on average from 150 attempts. Solution: Split "node record" entries to a separate table such that QuerySeeds doesn't do idle cycle hits. * Discovery: add Context to Listen. (erigontech#3577) Add explicit Context to ListenV4 and ListenV5. This makes it possible to stop listening by an external signal. * Discovery: refactor public key to node ID conversions. (erigontech#3634) Encode and hash logic was duplicated in multiple places. * Move encoding to p2p/discover/v4wire * Move hashing to p2p/enode/idscheme * Change newRandomLookup to create a proper random key on a curve. * Discovery: speed up lookup tests (erigontech#3677) * Update erigon-lib Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> * [stable] Fixes for state overrides in RPC (erigontech#3693) * State override support (erigontech#3628) * added stateOverride type * solved import cycle * refactoring * imported wrong package * fixed Call arguments * typo * override for traceCall * Fix eth call (erigontech#3618) * added isFake * using isFake instead of checkNonce * Revert "using isFake instead of checkNonce" This reverts commit 6a202bb. * Revert "added isFake" This reverts commit 2c48024. * only checking EOA if we are checking for Nonce Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> * new bootnodes (erigontech#3591) (erigontech#3695) Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> * Update skip analysis and preverified hashes (erigontech#3700) (erigontech#3704) Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Update version.go (erigontech#3701) * rpcdaemon: fix TxContext in traceBlock (erigontech#3716) Previously `txCtx` is not updated for every tx, which leads to wrong tracing results. * Mdbx: WriteMap fallback on error (erigontech#3714) * save * save * Pool cost fix (erigontech#3725) * save * save * Update to erigon-lib stable Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local> * mdbx v0.11.6 (erigontech#3771) * mdbx fix after v0.11.6 (erigontech#3775) * save * save * save * [stable] Event log subscription (erigontech#3773) * Logs sub (erigontech#3666) * save * Add onLogs * Fix lint * Add proper logs * Update go.mod * goimports * Add unwind * feat/rpcadaemon_logs_sub (erigontech#3751) * Fixes to subscribe logs (erigontech#3769) * Fixes to subscribe logs * Add criteria to logs subscription * Skeleton of RPC daemon event log distribution * Simplify * Send aggregated filter to Erigon * Change API * Print * Fixes * Fix topics filtering * Fill txHash and blockHash * Timing logs, fill tx index * Print * More print * Print * Asynchronous sending of log events to RPC daemon * Remove prints * Only extract logs if there are subscribers * Check empty when RPC daemon is removed Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Fix up * Update to erigon-lib stable * Update to erigon-lib stable Co-authored-by: primal_concrete_sledge <ryban92@gmail.com> Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Update version.go (erigontech#3776) * Update Skip analysis and preverified hashes (erigontech#3777) (erigontech#3778) * Update skip analysis * Add preverified hashes for mainnet and ropsten * preverified hashes and bootnode for sepolia Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Integration: reset StageFinish also (erigontech#3783) * docker hub - fetch git tags before build erigontech#3781 * fix nil pointer in fetch.go (erigontech#3802) * Update preverified hashes and skip analysis (erigontech#3831) (erigontech#3832) * Update skip_analysis * Preverified hashes Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Fix 'all defaults' case for eth_estimateGas (erigontech#3790) (erigontech#3824) * Fix 'all defaults' case for eth_estimateGas * fix tests Co-authored-by: Igor Mandrigin <i@mandrigin.ru> Co-authored-by: Igor Mandrigin <mandrigin@users.noreply.github.com> Co-authored-by: Igor Mandrigin <i@mandrigin.ru> * Update version.go (erigontech#3829) * Change libmdbx submodule origin (erigontech#3894) * save * Restore testdata Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Update to erigon-lib stable (erigontech#3895) Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Update version.go (erigontech#3896) * Update skip_analysis.go (erigontech#3897) (erigontech#3898) * save (erigontech#3904) * [stable] Fixes for header download (erigontech#3911) * Rollback preverified hashes for mainnet * Not remove header * Set verified = true * Fix verified extendUp and connect * Skip already persisted links * Prevent rewriting historical headers * Not load links after highestInDb * Restore preverified * Fix tests * Fix error handling Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * save (erigontech#3916) * Update libmdbx source (erigontech#3974) Same change as already merged in `devel` * Makefile (erigontech#3779): pass docker build arguments (erigontech#4239) Dockerfile requires some --build-arg options. Fix "docker" target to pass them. Fix GIT_TAG to reflect the most recent tag related to HEAD, instead of an unrelated most recent tag. Use it as the image VERSION. Image tags need to be passed explicitly if needed: DOCKER_FLAGS='-t erigon:latest' make docker * save (erigontech#4346) * Gray Glacier bomb delay (erigontech#4444) * Update version.go on stable branch (erigontech#4447) * Update version.go * Fix lint Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> * Clean up * in transaction execution, subtract from account balance only after enough gaspool is ensured (erigontech#4450) - noticed the difference when executing testdata#10 in go-ethereum and erigon * Update skip_analysis.go (erigontech#4452) * Adjust version Co-authored-by: Enrique Jose Avila Asapche <eavilaasapche@gmail.com> Co-authored-by: Alex Sharov <AskAlexSharov@gmail.com> Co-authored-by: battlmonstr <battlmonstr@users.noreply.github.com> Co-authored-by: ledgerwatch <akhounov@gmail.com> Co-authored-by: Alexey Sharp <alexeysharp@Alexeys-iMac.local> Co-authored-by: can <can@canx.me> Co-authored-by: Alex Sharp <alexsharp@Alexs-MacBook-Pro.local> Co-authored-by: primal_concrete_sledge <ryban92@gmail.com> Co-authored-by: Igor Mandrigin <mandrigin@users.noreply.github.com> Co-authored-by: Igor Mandrigin <i@mandrigin.ru> Co-authored-by: Andrea Lanfranchi <andrea.lanfranchi@gmail.com> Co-authored-by: Andrew Ashikhmin <34320705+yperbasis@users.noreply.github.com> Co-authored-by: sudeep <sudeepdino008@gmail.com>
UpdateFindFails/UpdateLastPingReceived/UpdateLastPongReceived events
are causing bursty DB commits (100 per minute).
This optimization throttles the commits to happen at most once in a few seconds,
because this info doesn't need to be persisted immediately.