Strange behaviour with event_observer blocking node sync #5663

wileyj · 2025-01-06T17:56:56Z

Infra setup is like this:
VM:
4vcpu
8GB memory
500GB disk
stacks-core running with a more or less default config with the following event_observer block:

[[events_observer]]
endpoint = "stacks-blockchain-api:3700"
#retry_count = 255
events_keys = ["*"]
timeout_ms = 300_000

stacks-core and the API are both running in docker via docker-compose, with an externally hosted DB that the API is writing to.
Both services are on the latest released versions.

Until recently (roughly 1 week ago - around dec 26th, 2024) , this setup was able to stay at chain tip reliably. around that day, i noticed it was falling behind, and the logs weren't much help in determining why.
on the API logs:

{"level":"info","time":"2024-12-31T16:44:41.312Z","pid":20,"hostname":"40666f4f946a","name":"stacks-blockchain-api","component":"core-api","microblocks":0,"tx_total":2,"txs":{"token_transfer":0,"smart_contract":0,"contract_call":2,"poison_microblock":0,"coinbase":0,"coinbase_to_alt_recipient":0,"versioned_smart_contract":0,"tenure_change":0,"nakamoto_coinbase":0},"event_total":22,"events":{"contract_event":12,"stx_transfer_event":1,"stx_mint_event":0,"stx_burn_event":0,"stx_lock_event":0,"nft_transfer_event":0,"nft_mint_event":0,"nft_burn_event":0,"ft_transfer_event":8,"ft_mint_event":0,"ft_burn_event":1},"miner_rewards":0,"msg":"Ingested block 389914 (0x5c46ceb8cd6913244af69e38a80c809c336330113c56a94748a97ee738672578) in 67ms"}
{"level":"info","time":"2024-12-31T16:44:41.315Z","pid":20,"hostname":"40666f4f946a","name":"stacks-node-event","reqId":"req-il","res":{"statusCode":200},"responseTime":71.82226181030273,"msg":"request completed"}
{"level":"info","time":"2024-12-31T16:44:42.310Z","pid":20,"hostname":"40666f4f946a","name":"stacks-blockchain-api","component":"core-api","msg":"Skipping mempool stats query because CONNECTION_DESTROYED"}
{"level":"info","time":"2024-12-31T16:44:42.666Z","pid":20,"hostname":"40666f4f946a","name":"stacks-node-event","reqId":"req-im","req":{"method":"POST","url":"/new_block","hostname":"stacks-blockchain-api:3700","remoteAddress":"172.24.0.3","remotePort":35164},"msg":"incoming request"}
{"level":"info","time":"2024-12-31T16:44:51.566Z","pid":20,"hostname":"40666f4f946a","name":"stacks-node-event","reqId":"req-in","req":{"method":"POST","url":"/new_block","hostname":"stacks-blockchain-api:3700","remoteAddress":"172.24.0.3","remotePort":48266},"msg":"incoming request"}

Skipping mempool stats query because CONNECTION_DESTROYED was interesting, but it seems to be a normal logged event in the API.

on the stacks-core side, the node was scrolling logs like this:

INFO [1735663983.278867] [stackslib/src/net/rpc.rs:551] [p2p-(0.0.0.0:20444,0.0.0.0:20443)] Handled StacksHTTPRequest, verb: POST, path: /v2/mempool/query?page_id=689b44a6a6060af29f1a36a19c6230e616077e88a70c8b608d5b5ffed4eff956, processing_time_ms: 0, latency_ms: 0, conn_id: 192, peer_addr: 129.158.214.226:37646, p2p_msg: None

system resources for the VM were all really low - no memory/load pressure and the IO usage was surprisingly low (probably due to not syncing new blocks).

restarting the services (api and stacks-core) resolved the issue, but only for about an hour or so after each restart before the mempool HTTP requests came flooding back.

the only thing that helped here was to disable ingress on 20443 (i also disabled 20444) - once those ports were firewalled, the node has stayed at chaintip without further issue.

I'm not sure if this is a bug, but i think it's interesting enough to open an issue.

Logs attached when this surfaced after a restart

exported-logs.tar.gz

The text was updated successfully, but these errors were encountered:

wileyj · 2025-02-05T17:41:15Z

this happened again actually, so i'm not 100% certain it's due to the http requests:

{"level":"info","time":"2025-01-27T04:57:31.758Z","pid":19,"hostname":"49a2b035beef","name":"stacks-node-event","reqId":"req-1rb","req":{"method":"POST","url":"/new_block","hostname":"stacks-blockchain-api:3700","remoteAddress":"172.28.0.3","remotePort":54978,"bodySize":16254},"msg":"incoming
request"}
{"level":"info","time":"2025-01-27T04:57:31.782Z","pid":19,"hostname":"49a2b035beef","name":"stacks-blockchain-api","component":"core-api","microblocks":0,"tx_total":2,"txs":{"token_transfer":1,"smart_contract":0,"contract_call":1,"poison_microblock":0,"coinbase":0,"coinbase_to_alt_recipient":0,"versioned_smart_contract":0,"tenure_change":0,"nakamoto_coinbase":0},"event_total":17,"events":{"contract_event":3,"stx_transfer_event":1,"stx_mint_event":0,"stx_burn_event":0,"stx_lock_event":0,"nft_transfer_event":0,"nft_mint_event":0,"nft_burn_event":0,"ft_transfer_event":5,"ft_mint_event":3,"ft_burn_event":5},"miner_rewards":0,"msg":"Ingested
block 519676 (0x0412f732e94b9c9eab7edea6250b51c71b1e8f5dd0e0d1046d9e9ed7f22217d1) in 23ms"}
{"level":"info","time":"2025-01-27T04:57:31.786Z","pid":19,"hostname":"49a2b035beef","name":"stacks-node-event","reqId":"req-1rb","res":{"statusCode":200,"method":"POST","url":"/new_block","requestBodySize":16254,"responseBodySize":15},"responseTime":27.630748748779297,"msg":"request
completed"}
{"level":"info","time":"2025-01-27T04:57:32.781Z","pid":19,"hostname":"49a2b035beef","name":"stacks-blockchain-api","component":"core-api","msg":"Skipping
mempool stats query because CONNECTION_DESTROYED"}
{"level":"info","time":"2025-01-27T04:57:52.939Z","pid":19,"hostname":"49a2b035beef","name":"stacks-node-event","reqId":"req-1rc","req":{"method":"POST","url":"/new_block","hostname":"stacks-blockchain-api:3700","remoteAddress":"172.28.0.3","remotePort":53006,"bodySize":6144},"msg":"incoming
request"}
{"level":"info","time":"2025-01-27T04:58:00.141Z","pid":19,"hostname":"49a2b035beef","name":"stacks-node-event","reqId":"req-1rd","req":{"method":"POST","url":"/new_block","hostname":"stacks-blockchain-api:3700","remoteAddress":"172.28.0.3","remotePort":48414,"bodySize":6144},"msg":"incoming
request"}

i can't be sure anymore this isn't an API issue either

wileyj · 2025-02-05T18:40:24Z

logs.tar.gz

wileyj · 2025-02-05T21:07:11Z

this specific issue may be related to this change in the API we're testing after the recent stall i encounterd above: hirosystems/stacks-blockchain-api#2217

wileyj · 2025-02-06T21:11:02Z

this seems to be resolved in API v8.6.0. closing

github-project-automation bot added this to Stacks Core Eng Jan 6, 2025

github-project-automation bot moved this to Status: 🆕 New in Stacks Core Eng Jan 6, 2025

wileyj mentioned this issue Feb 3, 2025

Database locked on 3.1.0.0.0 #5560

Open

aldur added this to the 3.1.0.0.6 milestone Feb 4, 2025

rafaelcr mentioned this issue Feb 5, 2025

fix: use an independent sql connection for mempool stats hirosystems/stacks-blockchain-api#2217

Merged

wileyj closed this as completed Feb 6, 2025

github-project-automation bot moved this from Status: 🆕 New to Status: ✅ Done in Stacks Core Eng Feb 6, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Strange behaviour with event_observer blocking node sync #5663

Strange behaviour with event_observer blocking node sync #5663

wileyj commented Jan 6, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 6, 2025

Strange behaviour with event_observer blocking node sync #5663

Strange behaviour with event_observer blocking node sync #5663

Comments

wileyj commented Jan 6, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 5, 2025

wileyj commented Feb 6, 2025