Bigtable Upload Service Loops on Same Block Repeatedly #33831
Comments
Right now, the server is in the loop, so it's an excellent chance to troubleshoot. |
Sure, if you can tell me what the missing ranges in your ledger are, I can take a look at the uploader logic. Probably it should emit an error and halt rather than continually looping. |
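To illustrate the "emit an error and halt" idea, here is a minimal sketch. It is not the actual uploader code: `UploadLoopError` and `advance_start_slot` are hypothetical names, and the sketch assumes the service tracks a start slot and receives back the last slot the upload step checked.

```rust
/// Hypothetical error type for this sketch; not part of the real crate.
#[derive(Debug, PartialEq)]
enum UploadLoopError {
    /// The upload step returned without moving past the current start slot.
    NoProgress { slot: u64 },
}

/// Decide the next start slot, or report that the loop is not advancing.
/// Assumption: `last_slot_checked` is what the upload step handed back.
fn advance_start_slot(
    current_start: u64,
    last_slot_checked: u64,
) -> Result<u64, UploadLoopError> {
    if last_slot_checked <= current_start {
        // Same slot again: surface it instead of silently retrying.
        Err(UploadLoopError::NoProgress { slot: current_start })
    } else {
        Ok(last_slot_checked)
    }
}
```

A real fix would probably need to tell "waiting for new roots at the tip" apart from "stuck behind a gap", since the former is normal and should not halt the service.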
@CriesofCarrots - In regards to looping on the same slot, I think we can attribute that to this comment solana/ledger/src/bigtable_upload.rs Lines 44 to 46 in a3b0348
and this line here solana/ledger/src/bigtable_upload.rs Line 141 in a3b0348
From the logs posted above in the issue description, solana/ledger/src/bigtable_upload_service.rs Lines 110 to 120 in a3b0348
The logs then indicate that the blockstore only has the slot
And that the slot has already been uploaded:
so there is no work to do and we hit this early return with solana/ledger/src/bigtable_upload.rs Lines 136 to 141 in a3b0348
Looking at the snippet from solana/ledger/src/bigtable_upload_service.rs Line 120 in a3b0348
And thus we're stuck since this is a gap and our node will never repair/replay the slots immediately after solana/ledger/src/bigtable_upload.rs Lines 69 to 72 in a3b0348
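The sequence above can be reproduced with a toy model. This is only a sketch under assumptions: `upload_range` and `simulate` are hypothetical stand-ins, the blockstore is modeled as a set of slot numbers, and the "return `end` when the range is empty" behavior is my reading of the function's contract rather than something verified here.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for the upload step (hypothetical, not the real function).
/// Returns the last blockstore slot found in `[start, end]`, or `end` if
/// the blockstore holds no slots in that range (assumed behavior).
fn upload_range(
    blockstore: &BTreeSet<u64>,
    uploaded: &mut BTreeSet<u64>,
    start: u64,
    end: u64,
) -> u64 {
    let slots: Vec<u64> = blockstore.range(start..=end).copied().collect();
    let Some(&last) = slots.last() else {
        return end;
    };
    for slot in slots {
        // Inserting an already-uploaded slot changes nothing: no work to do.
        uploaded.insert(slot);
    }
    last
}

/// Toy stand-in for the service loop: the next start slot is whatever the
/// upload step returned. Records the start slot used on each iteration.
fn simulate(
    blockstore: &BTreeSet<u64>,
    uploaded: &mut BTreeSet<u64>,
    mut start: u64,
    window: u64,
    iterations: usize,
) -> Vec<u64> {
    let mut starts = Vec::with_capacity(iterations);
    for _ in 0..iterations {
        starts.push(start);
        let end = start.saturating_add(window);
        start = upload_range(blockstore, uploaded, start, end);
    }
    starts
}
```

With a blockstore holding slot 100 and then nothing until slot 5000, a window of 1000, and slot 100 already uploaded, every iteration starts at 100 again: the only slot in `[100, 1100]` is 100 itself, so the returned value never moves past the gap.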
PS: @McSim85 - When posting logs in the future, please post the text in between triple ` quotes instead of pasting an image; it makes it easier for us to copy/paste/search/etc the text. |
We have the same issue on two warehouse nodes.
This has been happening since 1.14.17 (around 19 September).
We are currently on 1.16.17, and it still happens.
I will add more details shortly.
version was running
bounds of the ledger
bounds of bigtable
all data from the genesis
when you saw this
Mostly happens after an upgrade/restart.
relevant issue
I'm going to close this for now, but please re-open if you see it again.
I'll also ponder how, or if, bigtable-upload should support fragmented ledgers.
Originally posted by @CriesofCarrots in #27732 (comment)