Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nimbus-reth-1 on devnet-12 sporadically getting "code": -38001 from engine_getPayloadV2 #5623

Closed
1 task done
tersec opened this issue Nov 29, 2023 · 5 comments
Closed
1 task done
Labels
C-bug An unexpected or incorrect behavior

Comments

@tersec
Copy link

tersec commented Nov 29, 2023

Describe the bug

The EF runs a JSON-RPC snooper, and this has happened twice in the first hour or so:

Nov 29 13:16:12.069 2023 REQUEST
{
  "jsonrpc": "2.0",
  "method": "engine_forkchoiceUpdatedV2",
  "params": [
    {
      "headBlockHash": "0xbce64b5dd363e32660f40a3345bb71a068255fb5ae977132bea9e4c776bb6689",
      "safeBlockHash": "0x0000000000000000000000000000000000000000000000000000000000000000",
      "finalizedBlockHash": "0x0000000000000000000000000000000000000000000000000000000000000000"
    },
    {
      "timestamp": "0x6567399C",
      "prevRandao": "0xafa4c7c736bc9c9e43bbb375824d36342fab2f917b8e659dab637b01ef58f10c",
      "suggestedFeeRecipient": "0xf97e180c050e5ab072211ad2c213eb5aee4df134",
      "withdrawals": []
    }
  ],
  "id": 444
}

Nov 29 13:16:12.071 2023 RESPONSE (status 200 OK)
{
  "jsonrpc": "2.0",
  "result": {
    "payloadStatus": {
      "status": "VALID",
      "latestValidHash": "0xbce64b5dd363e32660f40a3345bb71a068255fb5ae977132bea9e4c776bb6689",
      "validationError": null
    },
    "payloadId": "0x9aa4047b10975fa9"
  },
  "id": 444
}

Nov 29 13:16:12.573 2023 REQUEST
{
  "jsonrpc": "2.0",
  "method": "engine_getPayloadV2",
  "params": [
    "0x9aa4047b10975fa9"
  ],
  "id": 445
}

Nov 29 13:16:12.574 2023 RESPONSE (status 200 OK)
{
  "jsonrpc": "2.0",
  "error": {
    "code": -38001,
    "message": "Unknown payload"
  },
  "id": 445
}
Nov 29 13:28:24.063 2023 REQUEST
{
  "jsonrpc": "2.0",
  "method": "engine_forkchoiceUpdatedV2",
  "params": [
    {
      "headBlockHash": "0x0834fb9253e33ec9ecc641f08d4673f8d28d2e74e657d0bafee609223a7b8bfe",
      "safeBlockHash": "0x5b9529839af4542cc8841ceb10b30df56dd96cc206af61f8717139cc86fa8627",
      "finalizedBlockHash": "0x0fb6d76c70424225a23b2e1e9af0e3cc2b66ce03c76079b5ad5d5cab706367a6"
    },
    {
      "timestamp": "0x65673C78",
      "prevRandao": "0x90ad47b968263571877557fc54c3c3a95014f52c35a9c97138b722246db0b0d0",
      "suggestedFeeRecipient": "0xf97e180c050e5ab072211ad2c213eb5aee4df134",
      "withdrawals": []
    }
  ],
  "id": 749
}

Nov 29 13:28:24.064 2023 RESPONSE (status 200 OK)
{
  "jsonrpc": "2.0",
  "result": {
    "payloadStatus": {
      "status": "VALID",
      "latestValidHash": "0x0834fb9253e33ec9ecc641f08d4673f8d28d2e74e657d0bafee609223a7b8bfe",
      "validationError": null
    },
    "payloadId": "0x2c43dd4b9fd860b2"
  },
  "id": 749
}

Nov 29 13:28:24.566 2023 REQUEST
{
  "jsonrpc": "2.0",
  "method": "engine_getPayloadV2",
  "params": [
    "0x2c43dd4b9fd860b2"
  ],
  "id": 750
}

Nov 29 13:28:24.567 2023 RESPONSE (status 200 OK)
{
  "jsonrpc": "2.0",
  "error": {
    "code": -38001,
    "message": "Unknown payload"
  },
  "id": 750
}

Steps to reproduce

Consult devnet-12

Node logs

Node logs available on `nimbus-reth-1.dencun-devnet-12.ethpandaops.io` for both Nimbus and Reth.

Platform(s)

No response

What version/commit are you on?

c49cda6

What database version are you on?

No response

What type of node are you running?

None

What prune config do you use, if any?

No response

If you've built Reth from source, provide the full command you used

No response

Code of Conduct

  • I agree to follow the Code of Conduct
@tersec tersec added C-bug An unexpected or incorrect behavior S-needs-triage This issue needs to be labelled labels Nov 29, 2023
@mattsse
Copy link
Collaborator

mattsse commented Nov 29, 2023

from the logs

2023-11-29T13:16:00.191668Z INFO reth_payload_builder::service: New payload job created id=0x9aa4047b10975fa9 parent=0xbce64b5dd363e32660f40a3345bb71a068255fb5ae977132bea9e4c776bb6689

and then 12 seconds later

2023-11-29T13:16:12.071105Z DEBUG reth_payload_builder::service: Payload job already in progress, ignoring. id=0x9aa4047b10975fa9 parent=0xbce64b5dd363e32660f40a3345bb71a068255fb5ae977132bea9e4c776bb6689

same situation with the other

2023-11-29T13:28:12.144440Z INFO reth_payload_builder::service: New payload job created id=0x2c43dd4b9fd860b2 parent=0x0834fb9253e33ec9ecc641f08d4673f8d28d2e74e657d0bafee609223a7b8bfe

2023-11-29T13:28:24.064648Z DEBUG reth_payload_builder::service: Payload job already in progress, ignoring. id=0x2c43dd4b9fd860b2 parent=0x0834fb9253e33ec9ecc641f08d4673f8d28d2e74e657d0bafee609223a7b8bfe

then unknown block at

13:28:24.566

curious why these are so far apart. The second logs are preceded by:

Block is already canonical, ignoring. block_hash=0x0834fb9253e33ec9ecc641f08d4673f8d28d2e74e657d0bafee609223a7b8bfe

It appears reth gets the same FCU twice.

I need to check the timeouts of the payload building job

@tersec
Copy link
Author

tersec commented Nov 29, 2023

This is also a Nimbus bug, though I'm not yet certain what the spec says about this, so this uncovered something useful on the Nimbus end too.

Nimbus is sending the same fcU twice, and the reason why is also explains why it'd stop in Deneb. It caches the PayloadAttributes it used in the "lookahead" forkchoiceUpdated (the first one in each of the pairs, immediately upon setting the parent block to head), then compares this cached information when it's about to request a payload with what it knows about the payload attributes then.

The trouble is that it fills in a placeholder parent beacon block root (Deneb-only payload attribute) for previous forks, as the zero hash. This never matches the actual beacon block root later on, so it sends a duplicate forkchoiceUpdated.

I'm not sure yet what the spec says should happen here, but regardless, it's among other things a Nimbus bug which should be fixed.

That said, it's a bit strange to respond with a payload ID and then immediately not have it available, as well.

@mattsse
Copy link
Collaborator

mattsse commented Nov 29, 2023

That said, it's a bit strange to respond with a payload ID and then immediately not have it available, as well.

agree, the reth bug was that the job's deadline was set too tight: fixed 12s starting when it is created, though this is incorrect according to the spec. This would explain why there's a race condition if the two FCU are <=12s apart and the getPayload is >12s after the first FCU

see #5626

@tersec
Copy link
Author

tersec commented Dec 1, 2023

status-im/nimbus-eth2#5635 should address this on the Nimbus side.

@tersec
Copy link
Author

tersec commented Dec 6, 2023

This has been resolved.

@tersec tersec closed this as completed Dec 6, 2023
@DaniPopes DaniPopes removed the S-needs-triage This issue needs to be labelled label Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug An unexpected or incorrect behavior
Projects
Archived in project
Development

No branches or pull requests

3 participants