Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash on bootstrap - X Chain - "expected block ... to be a snowman.OracleBlock but it's a *executor.Block" #1668

Closed
krakenson opened this issue Jun 29, 2023 · 3 comments · Fixed by #1683
Assignees
Labels
bug Something isn't working
Milestone

Comments

@krakenson
Copy link

Describe the bug
Started bootstrapping an archive node with clean DB on version 1.10.1.
It was bootstrapping uninterrupted since 2023-05-22, until today, when it crashed with:

[06-29|11:12:59.728] ERROR <X Chain> bootstrap/block_job.go:91 block failed verification during bootstrapping {"blkID": "2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy", "
error": "unexpected proposer block type: expected block ZyEMa4zg9a4Q8puPLwZGELwAEbHNqafqkwjiSiidWGNZWCHfp to be a snowman.OracleBlock but it's a *executor.Block"}

This is followed by a fatal error log based on same error (see full log below), and node proceeds to shutdown. Same error is returned on restart.
Upgrading to 1.10.3 gives the same error but also logs a broken looking 50K character line as details on the FATAL log line. Same with v1.10.4-rc.3.

Two other archive nodes are syncing normally on versions 1.10.0 and 1.10.3 with same config, but these were bootstrapped a couple of months ago.

Also, noticed that #1581 has ENGINE_TYPE_AVALANCHE while our logs show ENGINE_TYPE_SNOWMAN, so please advise if some other config is needed.

Thanks!

To Reproduce
Bootstrap node on version 1.10.1

Expected behavior
Node to bootstrap and keep syncing.

Logs
Initial crash:

[06-29|11:04:01.851] INFO <C Chain> queue/jobs.go:203 executing operations {"numExecuted": 29879510, "numToExecute": 30369655, "eta": "12h57m34s"}
[06-29|11:04:32.834] INFO <C Chain> queue/jobs.go:203 executing operations {"numExecuted": 29879586, "numToExecute": 30369655, "eta": "12h57m27s"}
[06-29|11:05:03.821] INFO <C Chain> queue/jobs.go:203 executing operations {"numExecuted": 29879705, "numToExecute": 30369655, "eta": "12h57m16s"}
[06-29|11:05:33.822] INFO <C Chain> queue/jobs.go:203 executing operations {"numExecuted": 29880026, "numToExecute": 30369655, "eta": "12h56m46s"}
[06-29|11:05:53.768] INFO <C Chain> github.com/ava-labs/coreth/plugin/evm/atomic_trie.go:209: committed atomic trie root=0xb3ef4dcf09943b64dfe0dc4db8a14186e1357ab551e476874b3ea3a977faec08 height=29,880,320
[06-29|11:06:03.838] INFO <C Chain> queue/jobs.go:203 executing operations {"numExecuted": 29880488, "numToExecute": 30369655, "eta": "12h56m2s"}
[06-29|11:06:12.442] ERROR <X Chain> bootstrap/block_job.go:91 block failed verification during bootstrapping {"blkID": "2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy", "error": "unexpected proposer block type: expected block ZyEMa4zg9a4Q8puPLwZGELwAEbHNqafqkwjiSiidWGNZWCHfp to be a snowman.OracleBlock but it's a *executor.Block">
[06-29|11:06:12.442] FATAL <X Chain> handler/handler.go:335 shutting down chain {"reason": "received an unexpected error", "error": "failed to execute job 2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy due to failed t>
Jun 29 12:01:17 nuremberg-5 start.sh[2156708]: [06-29|12:01:17.660] FATAL <X Chain> handler/handler.go:335 shutting down chain {"reason": "received an unexpected error", "error": "failed to execute job 2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy due to failed to verify block in bootstrapping: unexpected proposer block type: expected block ZyEMa4zg9a4Q8puPLwZGELwAEbHNqafqkwjiSiidWGNZWCHfp to be a snowman.OracleBlock but it's a *executor.Block while processing sync message: {%!s(*message.inboundMessage=&{[212 253 108 252 208 161 106 204 217 184 55 157 114 29 227 246 77 217 184 235] 19 0xc0599fa720 {0 9223372036854775807 0x34d9dc0} 0xc22a00 579723}) ENGINE_TYPE_SNOWMAN}"}
[06-29|11:06:12.442] INFO <X Chain> bootstrap/bootstrapper.go:297 shutting down bootstrapper
[06-29|11:06:13.593] INFO <P Chain> bootstrap/bootstrapper.go:297 shutting down bootstrapper
[06-29|11:06:16.708] INFO <C Chain> bootstrap/bootstrapper.go:297 shutting down bootstrapper

On upgrade to 1.10.3:

[06-29|11:21:11.275] INFO <X Chain> bootstrap/bootstrapper.go:115 starting bootstrapper
[06-29|11:21:15.865] INFO health/worker.go:258 check started passing {"namespace": "health", "name": "network", "tags": ["application"]}
[06-29|11:21:30.981] INFO <C Chain> common/bootstrapper.go:254 bootstrapping started syncing {"numVerticesInFrontier": 2}
[06-29|11:21:31.150] INFO <X Chain> common/bootstrapper.go:254 bootstrapping started syncing {"numVerticesInFrontier": 1}
[06-29|11:21:32.648] INFO <C Chain> bootstrap/bootstrapper.go:495 fetching blocks {"numFetchedBlocks": 490000, "numTotalBlocks": 2071767, "eta": "4h35m18s"}
[06-29|11:21:33.501] INFO <X Chain> bootstrap/bootstrapper.go:554 executing blocks {"numPendingJobs": 14}
[06-29|11:21:33.501] ERROR <X Chain> bootstrap/block_job.go:91 block failed verification during bootstrapping {"blkID": "2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy", "error": "unexpected proposer block type: expected block ZyEMa4zg9a4Q8puPLwZGELwAEbHNqafqkwjiSiidWGNZWCHfp to be a snowman.OracleBlock but it's a *executor.Block"}
[06-29|11:21:33.566] FATAL <X Chain> handler/handler.go:336 shutting down chain {"reason": "received an unexpected error", "error": "failed to execute job 2YW8SCtPHnQt7HHupseAa2SNWQScxradHVwpjT3WQKFbmnTvcy due to failed to verify block in bootstrapping: unexpected proposer block type: expected block ZyEMa4zg9a4Q8puPLwZGELwAEbHNqafqkwjiSiidWGNZWCHfp to be a snowman.OracleBlock but it's a *executor.Block while processing sync message: NodeID-7sTpQZFCZVzhLTugv8PKhNE8tdQ394LRo Op: ancestors Message: chain_id:\"\\xed_84\\x1eCn]F\\xe2\\xbb\\x00\\xb4]b\\xae\\x97ѰP\\xc6K\\xc64\\xaex\x10bg9\\xe3\\\\K\" request_id:7 containers:x"\\x00\\x00x\x00\xx00\\x00\\x00?\xx8bc_\\xf5RB\\xb0u\\xa9Y\\xf1mZi\\x9eHuj\\xbcW\\xb2\\xb5\\x1c*\\\\\\xc8\\xff\\xda\\xfd\\xde2\\x00\\x00\\x00\\x00d\\x9dh\\xf0\\x00\\x00\\x00\\x00\\x00gn\\xd7\\x00\\x00\\x00\\x00\\x00\\x00\\x02s\\x00\\x00\\x00\\x00\\x00\\x14\\xe8&\\xadT\\xd5Ԃ\\xef!\\xe3\\xce\\xe8\\xf8\\xa6\\x04==P\\x04\\\\\\xbc\\xe0@b\\xac\\x9cc\\xf7\\xd3\\nW\\\\\\x00\\x00\\x00\\x00\\x00\\x00ْ\X\x00\\x00\\x00\\x00d\\x9dh\\xf0\\x00\\x00\\x00\\x00\\x00\\x00\x00\\x00\\x00\\x00\\x00\rx0
0\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x01\\xed_84\\x1eCn]F\\xe2\\xbb\\x00\\xb4]b\\xae\\x97ѰP\\xc6K\\xc64\\xae\\x10bg9\\xe3\\\\K\\x00\\x00\\x00\\x04!\\xe6s\\x17\\xcbľ*...

This continues for 50K chars.

Operating System
Ubuntu 22.04.2 LTS

Additional
Only C chain config is given, if relevant:

{
        "state-sync-enabled": false,
        "pruning-enabled": false,
        "eth-apis": ["eth","eth-filter","net","web3","internal-eth","internal-blockchain","internal-transaction", "debug", "debug-tracer"],
        "rpc-gas-cap": 50000000
}
@krakenson krakenson added the bug Something isn't working label Jun 29, 2023
@github-project-automation github-project-automation bot moved this to Backlog 🗄 in Ski Patrol Jun 29, 2023
@StephenButtolph StephenButtolph moved this from Backlog 🗄 to In Progress 🏗 in Ski Patrol Jun 30, 2023
@StephenButtolph StephenButtolph self-assigned this Jun 30, 2023
@StephenButtolph StephenButtolph added this to the v1.10.4 milestone Jun 30, 2023
@StephenButtolph
Copy link
Contributor

We've been digging into this and feel like we know the root cause of the DB corruption. If you still have the corrupted DB - you should be able to run #1678. It should unblock your node and allow syncing to finish.

@krakenson
Copy link
Author

@StephenButtolph Works! Node continues bootstrapping. Thank you for saving us the re-init time! 🙇

Shout if you need anything from our end - happy to revert back to broken state if you need additional testing, or report behaviour after X-chain bootstraps some more. Otherwise good to close.

@StephenButtolph
Copy link
Contributor

Awesome! There will be another PR put up for this - so it probably makes sense to leave it open until that gets put in (but your node should be fine now).

@StephenButtolph StephenButtolph linked a pull request Jul 3, 2023 that will close this issue
@github-project-automation github-project-automation bot moved this from In Progress 🏗 to Done ✅ in Ski Patrol Jul 10, 2023
ramilexe pushed a commit to ConsiderItDone/avalanchego that referenced this issue Jul 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
No open projects
Archived in project
Development

Successfully merging a pull request may close this issue.

2 participants