
erigon not shutting down properly during execution stage #10573

Closed
taratorio opened this issue May 31, 2024 · 2 comments · Fixed by #10887
Labels
imp1 High importance

Comments

@taratorio
Member

taratorio commented May 31, 2024

User stickx on Discord (msg) reported that killall is taking too long (a whole day) to stop his erigon process.

System information

Erigon version: 2.60.0-f13762b4

OS & Version: Linux

Commit hash: N/A

Erigon Command (with flags/config): erigon --datadir /data --maxpeers 200 --dbsizelimit 8tb disable ipv6 prun htc 90000 before 11052984 --internalcl

Consensus Layer: Caplin

Consensus Layer Command (with flags/config): N/A

Chain/Network: Ethereum mainnet

Expected behaviour

The process should terminate cleanly within 1-2 minutes at most, without needing kill -9.

Actual behaviour

The process does not terminate during the execution stage.
top shows 10% CPU and 65% memory usage.
Logs:

[INFO] [05-31|01:17:11.440] [4/12 Execution] Executed blocks         number=9926196 blk/s=2.4 tx/s=248.6 Mgas/s=19.9 gasState=0.00 batch=0B alloc=18.1GB sys=29.2GB
[INFO] [05-31|01:17:16.347] [p2p] GoodPeers                          eth67=31 eth66=3 eth68=54
[INFO] [05-31|01:17:16.355] [txpool] stat                            pending=0 baseFee=0 queued=30000 alloc=10.1GB sys=29.2GB
[INFO] [05-31|01:17:18.265] [mem] memory stats                       Rss=96.9GB Size=0B Pss=96.9GB SharedClean=91.1MB SharedDirty=4.0KB PrivateClean=73.9GB PrivateDirty=22.9GB Referenced=96.9GB Anonymous=22.9GB Swap=0B alloc=10.4GB sys=29.2GB
[INFO] [05-31|01:17:24.632] [4/12 Execution] Executed blocks         number=9927103 blk/s=68.8 tx/s=9323.9 Mgas/s=618.0 gasState=0.01 batch=12.3MB alloc=11.5GB sys=29.2GB
[INFO] [05-31|01:17:46.182] Got interrupt, shutting down...          sig=terminated
[INFO] [05-31|01:17:46.183] Got interrupt, shutting down... 
[WARN] [05-31|01:17:46.186] bad blocks segment received              err="context canceled"
[INFO] [05-31|01:17:46.186] Exiting Engine... 
[INFO] [05-31|01:17:46.186] Exiting... 
[INFO] [05-31|01:17:46.212] HTTP endpoint closed                     url=127.0.0.1:8545
[INFO] [05-31|01:17:46.188] RPC server shutting down 
[INFO] [05-31|01:17:46.212] RPC server shutting down 
[INFO] [05-31|01:17:46.213] RPC server shutting down 
[INFO] [05-31|01:17:46.213] Engine HTTP endpoint close               url=127.0.0.1:8551
[EROR] [05-31|01:17:46.233] Could not start execution service        err="[4/12 Execution] batch commit: loadIntoTable PlainCodeHash: stopped"
[WARN] [05-31|01:17:46.235] failed to get blobs                      app=caplin stage=ForwardSync err="context canceled"
[INFO] [05-31|01:17:46.235] [Caplin] Forward Sync                    app=caplin stage=ForwardSync progress=9191080 distance-from-chain-tip=14m48s estimated-time-remaining=5m17s
[INFO] [05-31|01:17:46.235] [Caplin] exiting clstages loop           app=caplin
[EROR] [05-31|01:17:46.235] could not start caplin                   err="context canceled"

Steps to reproduce the behaviour

Run a node and kill it during the execution stage, for example as sketched below.
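
A hedged sketch of one way to verify the expected behaviour: send SIGTERM to the running erigon process and fail if it has not exited within a couple of minutes. The PID and the 2-minute window are assumptions for illustration, not part of the original report.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"time"
)

func main() {
	pid := 12345 // assumed PID of the erigon process under test
	proc, err := os.FindProcess(pid)
	if err != nil {
		fmt.Println("find process:", err)
		os.Exit(1)
	}
	// Ask for a graceful shutdown, as killall would.
	if err := proc.Signal(syscall.SIGTERM); err != nil {
		fmt.Println("send SIGTERM:", err)
		os.Exit(1)
	}
	deadline := time.Now().Add(2 * time.Minute)
	for time.Now().Before(deadline) {
		// Signal 0 only checks whether the process still exists.
		if err := proc.Signal(syscall.Signal(0)); err != nil {
			fmt.Println("erigon exited cleanly")
			return
		}
		time.Sleep(5 * time.Second)
	}
	fmt.Println("erigon still running after 2 minutes; shutdown is stuck")
	os.Exit(1)
}
```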

@taratorio taratorio added the imp2 Medium importance label May 31, 2024
@taratorio taratorio added imp1 High importance and removed imp2 Medium importance labels Jun 7, 2024
@awskii
Member

awskii commented Jun 11, 2024

Yes, I think it's Caplin downloading blocks in the background and not reading the cancellation from the signal handler.

There is a problem with context passing, and I suppose Giulio wanted the process to finish anyway, but maybe it just was not done.
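
For illustration, a minimal Go sketch (not Erigon's actual code) of the suspected pattern: a background download loop that never checks the context cancelled by the signal handler keeps the process busy after SIGTERM, while a loop that selects on ctx.Done() exits promptly. The helper names (downloadNextBlock, leakyLoop, cooperativeLoop) are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"os/signal"
	"syscall"
	"time"
)

// downloadNextBlock stands in for any blocking unit of background work.
func downloadNextBlock() { time.Sleep(200 * time.Millisecond) }

// leakyLoop ignores ctx, so cancellation from the signal handler is never observed.
func leakyLoop(ctx context.Context) {
	for {
		downloadNextBlock() // keeps running after shutdown is requested
	}
}

// cooperativeLoop checks ctx.Done() on every iteration and returns on cancellation.
func cooperativeLoop(ctx context.Context) {
	for {
		select {
		case <-ctx.Done():
			fmt.Println("loop: context canceled, stopping")
			return
		default:
			downloadNextBlock()
		}
	}
}

func main() {
	// signal.NotifyContext cancels ctx when SIGINT/SIGTERM arrives,
	// mirroring the "Got interrupt, shutting down..." handler in the logs.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGINT, syscall.SIGTERM)
	defer stop()

	go cooperativeLoop(ctx) // stops shortly after the signal
	// go leakyLoop(ctx)    // would keep the process alive indefinitely

	<-ctx.Done()
	fmt.Println("main: shutting down")
	time.Sleep(time.Second) // give the loop a moment to observe cancellation
}
```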

@yperbasis
Member

Should be fixed by PR #10887

@taratorio taratorio modified the milestone: 2.60.3-fixes Jun 28, 2024