Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

op-reth/ExEx can't sync with working op-node #14220

Open
1 task done
midcurvy opened this issue Feb 4, 2025 · 14 comments
Open
1 task done

op-reth/ExEx can't sync with working op-node #14220

midcurvy opened this issue Feb 4, 2025 · 14 comments
Labels
A-exex Execution Extensions A-op-reth Related to Optimism and op-reth C-bug An unexpected or incorrect behavior

Comments

@midcurvy
Copy link

midcurvy commented Feb 4, 2025

Describe the bug

I'm experimenting with an OP stack chain-based ExEx. I have my reth_* dependencies pinned at rev = 5322877aed96804711671154658e79b08f9a1868 following the suggestion here.

My op-reth node seems to get stuck at syncing and keep repeating the following logs
INFO reth::cli: Status connected_peers=44 latest_block=0
whereas my op-node seems to be syncing fine:

op-node-1                | t=2025-02-04T19:51:59+0000 lvl=info msg="Sync progress" reason="new chain head block" l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_unsafe=0x17cdf58b1bd815e157bcc3e3a7c0dbb0dee5d3a44580a4636e19b7bf9d1897a1:11846291 l2_backup_unsafe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_time=1738698719
op-node-1                | t=2025-02-04T19:52:00+0000 lvl=info msg="Received signed execution payload from p2p" id=0xf0e1c6e2b352fdc4af819200460e25320139ce9f5f05eba6bddd8c6806d453da:11846292 peer=16Uiu2HAmLjZKARhTwCZtx8c9KzKfJ21qpTcdDeDp7jHLvCNBnF3R txs=8
op-node-1                | t=2025-02-04T19:52:00+0000 lvl=info msg="Optimistically inserting unsafe L2 execution payload to drive EL sync" id=0xf0e1c6e2b352fdc4af819200460e25320139ce9f5f05eba6bddd8c6806d453da:11846292
op-node-1                | t=2025-02-04T19:52:00+0000 lvl=info msg="Inserted new L2 unsafe block (synchronous)" hash=0xf0e1c6e2b352fdc4af819200460e25320139ce9f5f05eba6bddd8c6806d453da number=11846292 newpayload_time=1.518ms fcu2_time=478.617µs total_time=1.998ms mgas=3.522053 mgasps=1762.5793323918372

when I run op-reth with -vvvv I also see the following debug logs

2025-02-04T19:53:17.669524Z DEBUG engine::tree: received new engine message msg=Request(NewPayload(parent: 0xffbe4b90101652a303955ecc3281a8693f261703a488d085945dc93603ad80d5, number: 11846369, hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a))
2025-02-04T19:53:17.669768Z DEBUG engine::tree: Inserting new block into tree block=NumHash { number: 11846369, hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a } parent=0xffbe4b90101652a303955ecc3281a8693f261703a488d085945dc93603ad80d5 state_root=0xfd0b5551e364256961faa9bbc9fdb8fc5343299f6958f2cd4cdc7974e3b7adaa
2025-02-04T19:53:17.669858Z DEBUG engine::tree: no canonical state found for block hash=0xffbe4b90101652a303955ecc3281a8693f261703a488d085945dc93603ad80d5
2025-02-04T19:53:17.670345Z DEBUG engine::tree: received new engine message msg=Request(ForkchoiceUpdated { state: ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, has_payload_attributes: false })
2025-02-04T19:53:17.670407Z DEBUG reth::cli: Event: Handler(ForkchoiceUpdated(ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, Syncing))
2025-02-04T19:53:18.668765Z DEBUG engine::tree: received new engine message msg=Request(NewPayload(parent: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, number: 11846370, hash: 0xf14b08825ea856bf813bd8a15351108827fd02556b06afb28b6fc01ff84b83ee))
2025-02-04T19:53:18.669041Z DEBUG engine::tree: Inserting new block into tree block=NumHash { number: 11846370, hash: 0xf14b08825ea856bf813bd8a15351108827fd02556b06afb28b6fc01ff84b83ee } parent=0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a state_root=0xe7567f71be935f36efa708710fee95ec94f57019274ee1e4a9570886db45b0b7

which seems to suggest that it's running fine? which makes it more foncusing 🤔

Steps to reproduce

run a unichain sepolia based ExEx at rev 5322877.

Also a side note is that OP stack chain ExExs are very fragile at this point, where every minor reth version upgrade would break my very minimal implementation.

Node logs


Platform(s)

Linux (x86)

Container Type

Docker

What version/commit are you on?

rev 5322877

What database version are you on?

n/a

Which chain / network are you on?

Unichain Sepolia
https://github.com/Uniswap/unichain-node

What type of node are you running?

Archive (default)

What prune config do you use, if any?

n/a

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

  • I agree to follow the Code of Conduct
@midcurvy midcurvy added C-bug An unexpected or incorrect behavior S-needs-triage This issue needs to be labelled labels Feb 4, 2025
@mattsse
Copy link
Collaborator

mattsse commented Feb 4, 2025

idk what happens here tbh
no clue why op-node sends zero hashes

can you perhaps share more logs?

has op-node ever sent a finalized/safe hash?

@midcurvy
Copy link
Author

midcurvy commented Feb 4, 2025

idk what happens here tbh no clue why op-node sends zero hashes

can you perhaps share more logs?

has op-node ever sent a finalized/safe hash?

Can you expand on what you mean by op-node sends zero hashes?

But yeah, the op-node log is basically it repeating the following entries for each block:

op-node-1                | t=2025-02-04T21:26:45+0000 lvl=info msg="Sync progress" reason="new chain head block" l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_unsafe=0x791af170191707c1010a5e129490b5317cc07d178f7c16fdf551e0eb71308572:11851976 l2_backup_unsafe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_time=1738704404
op-node-1                | t=2025-02-04T21:26:45+0000 lvl=info msg="Received signed execution payload from p2p" id=0xc45cc5c85e23ef4260d2675687533dd113922eafa903df7cee4aa25b04b2e8c3:11851977 peer=16Uiu2HAm2wkMqQY4Pn2AxELivsSyeWsdSr6DfwzbHcBM3jMnYgvp txs=5
op-node-1                | t=2025-02-04T21:26:45+0000 lvl=info msg="Optimistically inserting unsafe L2 execution payload to drive EL sync" id=0xc45cc5c85e23ef4260d2675687533dd113922eafa903df7cee4aa25b04b2e8c3:11851977
op-node-1                | t=2025-02-04T21:26:45+0000 lvl=info msg="Inserted new L2 unsafe block (synchronous)" hash=0xc45cc5c85e23ef4260d2675687533dd113922eafa903df7cee4aa25b04b2e8c3 number=11851977 newpayload_time=1.737ms fcu2_time=413.549µs total_time=2.153ms mgas=0.656103 mgasps=304.6435997277211

@mattsse
Copy link
Collaborator

mattsse commented Feb 4, 2025

l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0

these, unclear what it does this

@mattsse
Copy link
Collaborator

mattsse commented Feb 4, 2025

2025-02-04T19:53:17.670345Z DEBUG engine::tree: received new engine message msg=Request(ForkchoiceUpdated { state: ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, has_payload_attributes: false })
2025-02-04T19:53:17.670407Z DEBUG reth::cli: Event: Handler(ForkchoiceUpdated(ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, Syncing))

looks op-reth receives 0x00 hashes for finalized and safe and doesn't know how to pick a sync target,

are you running with --execution-layer syncmode or consenssu-layer?

@midcurvy
Copy link
Author

midcurvy commented Feb 4, 2025

2025-02-04T19:53:17.670345Z DEBUG engine::tree: received new engine message msg=Request(ForkchoiceUpdated { state: ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, has_payload_attributes: false })
2025-02-04T19:53:17.670407Z DEBUG reth::cli: Event: Handler(ForkchoiceUpdated(ForkchoiceState { head_block_hash: 0x3e82d73eecd86f4ca3b2c455c91d92969f1d8da395fa8c243b0c65d1d4bd606a, safe_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000, finalized_block_hash: 0x0000000000000000000000000000000000000000000000000000000000000000 }, Syncing))

looks op-reth receives 0x00 hashes for finalized and safe and doesn't know how to pick a sync target,

are you running with --execution-layer syncmode or consenssu-layer?

I have OP_NODE_SYNCMODE set to execution-layer

What's weird is that I didn't change any of the chain/launch configs or envs, but merely bumped reth dependencies from 1.0.8.

I'll try another chain just in case

@mattsse
Copy link
Collaborator

mattsse commented Feb 4, 2025

did you also restart op-node?

could you try to restart it in CL sync whether it can make progress and then switch to execution layer mode again?

@midcurvy
Copy link
Author

midcurvy commented Feb 5, 2025

did you also restart op-node?

So I run my op-node alongside my ExEx node with docker-compose, and every time I make a change I'd terminate and re-start. both

could you try to restart it in CL sync whether it can make progress and then switch to execution layer mode again?

Okay so when I switch to CL sync my ExEx started working:

INFO exex{id="Test"}: Committed chain handled committed_chain=7524..=7524
INFO New payload job created id=0x03d681e232229f90 parent=0xf8825f6dcfff954ae36be784e47d201f2bbd958f49652f57423fac8789a96079
INFO Block added to canonical chain number=12563 hash=0xb14dc1376a00e179e301496f2668f0fcb37a7d2cf965acb4c46fbb85895c703e peers=37 txs=1 gas=43.93 Kgas gas_throughput=6.85 Ggas/second full=0.1% base_fee=0.00gwei blobs=0 excess_blobs=0 elapsed=6.41µs
INFO Canonical chain committed number=12563 hash=0xb14dc1376a00e179e301496f2668f0fcb37a7d2cf965acb4c46fbb85895c703e elapsed=13.884µs

but then when I stop and re-start both op-node and op-reth, I get this error now

INFO Initialized tracing, debug log directory: /root/.cache/reth/logs/unichain-sepolia
INFO Starting reth version="1.1.5-dev (5322877)"
INFO Opening database path="/db/db"
INFO Configuration loaded path="/db/reth.toml"
INFO Verifying storage consistency.
thread 'main' panicked at /usr/local/cargo/git/checkouts/reth-36d3ea1d1152b20c/5322877/crates/storage/codecs/src/alloy/transaction/txtype.rs:44:30:
Unsupported TxType identifier: 126

id 126 seems to be Deposit type

I do use OptimismCli as well as OpNode with my ExEx:

    cli.exex.run(|builder, rollup_args| async move {
        let handle = builder
            .node(OpNode::new(rollup_args))
            .install_exex("Test", |ctx| async move {
                Ok(AttestationExEx::new(ctx, Arc::new(signer)))
            })
            .launch()
            .await?;

        handle.wait_for_node_exit().await

@trianglesphere
Copy link

l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0

You should ignore l2_pending safe - it's an internal implementation detail.

op-node when doing the initial sync does not set the safe or finalized block until the EL reports that it's finished syncing from the head block

@midcurvy
Copy link
Author

midcurvy commented Feb 5, 2025

l2_finalized=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0 l2_pending_safe=0x0000000000000000000000000000000000000000000000000000000000000000:0

You should ignore l2_pending safe - it's an internal implementation detail.

op-node when doing the initial sync does not set the safe or finalized block until the EL reports that it's finished syncing from the head block

I see. Still I'm pretty sure I'm getting stuck with syncmode=EL. Reth keeps logging DEBUG engine::tree: no canonical state found for block hash=0x967777c6413aa4ec9cc5f18c7245f610f10116ad272262a19b96f999dc1ebf75 for every block

@trianglesphere
Copy link

@mattsse no canonical state found for block hash=0x967777c6413aa4ec9cc5f18c7245f610f10116ad272262a19b96f999dc1ebf75 seems correct for when the CL is driving the EL before reth has finished syncing to tip

@emhane emhane added A-op-reth Related to Optimism and op-reth A-exex Execution Extensions and removed S-needs-triage This issue needs to be labelled labels Feb 11, 2025
@cakevm
Copy link
Contributor

cakevm commented Feb 11, 2025

@midcurvy fyi, I ran in the same issue. It is now resolved with latest main head. See also #14407

@midcurvy
Copy link
Author

@midcurvy fyi, I ran in the same issue. It is now resolved with latest main head. See also #14407

Thank you! I gave d727701 a try but still seeing the same thing (latest_block=0 from op-reth side). Looking at #14407 it doesn't seem like it's related? May I ask what CL client and version you are using?

@cakevm
Copy link
Contributor

cakevm commented Feb 11, 2025

Oh sorry, I was referring to the Unsupported TxType identifier: 126 error here in the comment. Sorry to mix things here. Hope you will find a solution.

@mattsse
Copy link
Collaborator

mattsse commented Feb 11, 2025

thanks for flagging

Unsupported TxType identifier: 126

should be fixed by #14413

I've identified the other exex issues on op and will try to get them fixed in the upcoming days

@emhane emhane moved this from Backlog to TODO in Project Tracking Feb 12, 2025
@emhane emhane moved this from TODO to In Progress in Project Tracking Feb 12, 2025
@emhane emhane moved this from In Progress to Backlog in Project Tracking Feb 13, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-exex Execution Extensions A-op-reth Related to Optimism and op-reth C-bug An unexpected or incorrect behavior
Projects
Status: Todo
Development

No branches or pull requests

5 participants