Consider restoring previous captive core state on startup instead of catchup #2960

Closed
marta-lokhova opened this issue Mar 11, 2021 · 14 comments · Fixed by #2994

@marta-lokhova
Contributor

As suggested by @bartekn, consider skipping the download of a particular bucket if it's already present in the buckets folder. This would be useful, as captive core currently has to re-download all the buckets on every restart. Note that we should still verify buckets, to ensure that we are not processing invalid ones.
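This isn't the actual stellar-core implementation, just a rough sketch of the check under the assumption that buckets are content-addressed (the on-disk file name embeds the SHA-256 hash of the contents), so a cached copy can be verified without re-downloading it:

```cpp
// Sketch only (not stellar-core's API): decide whether a bucket referenced
// by the history archive state still needs to be downloaded, reusing a
// locally cached copy only if its content hash checks out.
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
#include <openssl/sha.h>

namespace fs = std::filesystem;

// Re-hash a cached bucket file; buckets are content-addressed, so the
// expected hash is already known from the bucket's name.
static std::string
sha256HexOfFile(fs::path const& p)
{
    SHA256_CTX ctx;
    SHA256_Init(&ctx);
    std::ifstream in(p, std::ios::binary);
    std::vector<char> buf(1 << 16);
    while (in.read(buf.data(), buf.size()) || in.gcount() > 0)
    {
        SHA256_Update(&ctx, buf.data(), static_cast<size_t>(in.gcount()));
    }
    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256_Final(digest, &ctx);
    std::ostringstream out;
    for (unsigned char b : digest)
    {
        out << std::hex << std::setw(2) << std::setfill('0') << int(b);
    }
    return out.str();
}

// Download only if the bucket is missing locally or fails verification.
bool
needsDownload(fs::path const& bucketDir, std::string const& hashHex)
{
    fs::path local = bucketDir / ("bucket-" + hashHex + ".xdr");
    return !fs::exists(local) || sha256HexOfFile(local) != hashHex;
}
```

If verification fails, falling back to a fresh download (rather than trusting the cached file) preserves the "we should still verify buckets" property above.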

@MonsieurNicolas
Contributor

That is a good idea; probably a little more complicated than this, though: since captive-core always starts on the genesis ledger, I think it will delete buckets that are not referenced by anything as part of "bucket garbage collection", so that may have to change as well.

@marta-lokhova
Contributor Author

Yeah, I'm aware of the garbage collection issue. I don't think it's very complicated, though: we'd just need to carefully evaluate where the garbage collection calls sit during startup and catchup, and opt out of GC during specific parts of catchup (i.e., bucket apply; GC would still happen during ledger application, of course).
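Purely as an illustration of the "opt out during bucket apply" idea (this is not an existing stellar-core mechanism), a scoped guard could suppress bucket GC for exactly the duration of bucket application:

```cpp
// Hypothetical RAII guard: disables bucket garbage collection while alive
// (e.g. for the duration of bucket apply) and restores the previous setting
// afterwards. The flag it toggles is illustrative, not a real member of
// BucketManager.
class BucketGcPause
{
  public:
    explicit BucketGcPause(bool& gcEnabled)
        : mFlag(gcEnabled), mPrevious(gcEnabled)
    {
        mFlag = false; // no GC while buckets are being applied
    }
    ~BucketGcPause()
    {
        mFlag = mPrevious; // GC resumes during normal ledger application
    }
    BucketGcPause(BucketGcPause const&) = delete;
    BucketGcPause& operator=(BucketGcPause const&) = delete;

  private:
    bool& mFlag;
    bool mPrevious;
};
```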

@marta-lokhova self-assigned this on Mar 11, 2021
@MonsieurNicolas
Contributor

Actually, maybe the problem is something else entirely: if what people are trying to do is make captive core restart fast, then the solution is to have a database set up (sqlite) but not use it to store the ledger (or transactions, etc.).

When set up like this, all core needs to do on startup is apply buckets to rebuild its in-memory state. With that done, it can even catch up to the network without having to perform a full catchup from history.

That way the HAS (history archive state) will be there, so this garbage collection problem won't even need to be fixed.

@MonsieurNicolas
Contributor

@ire-and-curses does that sound like it would help with the captive core situation? Basically, we would run captive-core in a hybrid mode that uses a small database (sqlite) alongside buckets. This would also let us keep peer information around, which helps when restarting a node (today captive core has to rediscover peers with capacity on every restart, and that can take a while).

@ire-and-curses

Yeah that does sound like it would help. I guess it depends on how the startup time is broken down. Where is the time spent?

Where we're coming from: we recently learned from @brahman81 and @jacekn that patching and restarting stand-alone stellar-core has no restart delay, and we've been trying to understand what's different in the captive core case.

How big would this database be? Does it change anything with the current need to store gigabytes of bucket files?

Regarding the original proposal: I think which way to go really depends on which option gives the biggest improvement to initial startup time relative to implementation cost.

@MonsieurNicolas
Contributor

I just talked to @bartekn about this, so I am adding a few more details:

  • when running in this mode, the sqlite database location should default to the buckets folder (so as to make it "0-config"). I think it already works like this (note: Horizon knows about it).
  • the sqlite database should be on the order of 1 MB or less (the only large data set is the overlay data)
  • the change should be transparent to Horizon
    • on startup, Horizon specifies the "ledger + hash" for the LCL, so we can use this
    • if the specified LCL matches the one core has, we can rebuild the in-memory ledger state right away, without waiting for overlay; if it does not match, we can use some simple heuristics to apply it anyway (e.g., if the ledger is "not too far in the future", say within 10 ledgers); otherwise we just don't use it, and the behavior is like today, meaning we have to wait for a checkpoint ledger

@MonsieurNicolas
Contributor

I guess one of the things we probably need before doing this is to quantify all of this:

  • we know that "waiting for a checkpoint" takes over 2.5 minutes on average
  • how long does "download & verify" buckets take?
  • how long does "apply buckets" (in memory) take? If it takes a while, we may have to let it run "in the background" (so as to buffer ledgers), but this probably complicates the work quite a bit
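As a rough sanity check on the first number (assuming the standard 64-ledger checkpoint spacing and roughly 5-second ledger close times): the average wait for the next checkpoint is about 32 ledgers × 5 s ≈ 160 s, or roughly 2.7 minutes, which lines up with the figure above.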

@marta-lokhova
Contributor Author

marta-lokhova commented Mar 22, 2021

(tried this on my local machine in case it's useful)

how long does "download & verify" buckets take?

Looks like it takes about 80-90 seconds with --in-memory mode.

how long does "apply buckets" (in memory) take?

Bucket application takes ~30 seconds.

@MonsieurNicolas
Contributor

OK, so with those numbers it looks like the only viable change is what I outlined earlier, since the node would only have to rebuild its in-memory state (~30 seconds); in that case we're still within the window for keeping the node in sync after a restart.

@marta-lokhova
Contributor Author

Implementation-wise, one thing to note here is that in order to restore the in-memory state correctly, we need the most recent ledger header (from what I see, LedgerManager relies on having the correct last closed ledger header). We do not store ledger headers in this mode, but we do store some information in the storestate table, such as the ledger hash and the HAS. So I think we'd need to add another row to storestate with the ledger header information. Alternatively, we could only store the latest ledger header in the headers table when in this mode, but this option seems more complicated.
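To make that concrete, here is a minimal sketch (not actual stellar-core code) of what persisting the header could look like, assuming the storestate table keeps its existing name/value shape (statename, state); the key name 'lastclosedledgerheader' and the choice of storing the base64-encoded XDR of the header are illustrative only:

```cpp
// Hypothetical sketch: upsert the last closed ledger header (already
// serialized to base64-encoded XDR by the caller) into a new storestate row.
#include <sqlite3.h>
#include <string>

bool
persistLastClosedHeader(sqlite3* db, std::string const& headerXdrB64)
{
    char const* sql =
        "INSERT INTO storestate (statename, state) "
        "VALUES ('lastclosedledgerheader', ?1) "
        "ON CONFLICT(statename) DO UPDATE SET state = excluded.state";
    sqlite3_stmt* stmt = nullptr;
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, nullptr) != SQLITE_OK)
    {
        return false;
    }
    sqlite3_bind_text(stmt, 1, headerXdrB64.c_str(), -1, SQLITE_TRANSIENT);
    bool ok = (sqlite3_step(stmt) == SQLITE_DONE);
    sqlite3_finalize(stmt);
    return ok;
}
```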

@MonsieurNicolas
Contributor

Alternatively, we could only store the latest ledger header in the headers table when in this mode, but this option seems more complicated.

I don't think it's more complicated: it's about picking the right "configuration" mode. What's more complicated is adding new ways to do the same thing (which then need to be tested, etc.).

We already have a mechanism to persist the LCL; it's just (currently) disabled in this mode. To make it work, we may need to enable very aggressive "maintenance" so that we don't store too many of these, but that's about it.

@marta-lokhova
Contributor Author

So I've put together an initial prototype, and was able to join the network in ~30 seconds, which is promising! I do want to clarify a couple of things to make sure we are all on the same page.

The API

  • Once core is upgraded to the version that uses the new hybrid mode, the operator needs to run stellar-core new-db --partial-for-in-memory-mode (naming is TBD) to initialize the persistent database.
    • This will properly initialize the buckets folder and the new small database that stores the minimal data needed in this hybrid mode. The new database will be initialized to the genesis ledger.
  • Once the database is ready, core may start with the --in-memory flag. It will have to do the initial full catchup in order to populate the buckets.
  • After that, there are several options, depending on the ledger that captive core is restarted with (see the sketch after this list):
    • --start-at-ledger parameter equal to the LCL in the database. In this case, the previous state will be rebuilt automatically before the node starts listening to network traffic. This step takes ~30 seconds.
      • Question here: in terms of bucket validity, can we assume that persisted buckets remain valid between restarts? Specifically, I'm working out the expected behavior when buckets are missing or corrupt. Options are: (a) rebuild from scratch anyway, ignoring the inconsistencies (I don't like this option, as it may hide real issues), or (b) complain/throw so that the Horizon operator has to recover/rebuild manually. Basically, can we assume that if core previously downloaded and verified buckets, those can be reused on restart?
    • --start-at-ledger parameter that is ahead of the LCL by a few ledgers (Nicolas suggested 10). In this case, restore the state anyway; we should be able to replay the needed ledgers when we hear from the network or do catchup. The only open question is that we might stream a few extra ledgers. Is that an issue?
    • --start-at-ledger parameter that is before the LCL. We can't stream older ledgers with the existing state, so we rebuild from scratch (full catchup).
    • --start-at-ledger parameter not configured at all. In this case, the node rebuilds from scratch and catches up to the latest ledger from the network. This is to ensure that operators have visibility and full control over which ledgers will be streamed (since in this case we don't know the latest ledger until we listen to the network). Does that sound right?
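To summarize the four cases above in one place, here is a compact sketch of the proposed decision rule (illustrative only; the function and names are not settled API, and the 10-ledger tolerance is just the value suggested earlier in the thread):

```cpp
#include <cstdint>
#include <optional>

enum class StartupAction
{
    RESTORE_FROM_LOCAL_STATE, // rebuild in-memory state from cached buckets (~30s)
    RESTORE_AND_REPLAY,       // restore, then replay the small gap from the network
    FULL_CATCHUP              // rebuild from scratch via history archives
};

// startAtLedger is empty when --start-at-ledger is not configured at all.
StartupAction
chooseStartupAction(std::optional<std::uint32_t> startAtLedger,
                    std::uint32_t lcl)
{
    std::uint32_t const kMaxLedgersAhead = 10; // heuristic suggested above

    if (!startAtLedger)
    {
        // No target ledger given: rebuild and catch up to the latest
        // network ledger so the operator keeps control over what streams.
        return StartupAction::FULL_CATCHUP;
    }
    if (*startAtLedger == lcl)
    {
        return StartupAction::RESTORE_FROM_LOCAL_STATE;
    }
    if (*startAtLedger > lcl && *startAtLedger - lcl <= kMaxLedgersAhead)
    {
        return StartupAction::RESTORE_AND_REPLAY;
    }
    // Behind the LCL (or too far ahead): the local state can't be reused.
    return StartupAction::FULL_CATCHUP;
}
```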

Would love to hear feedback from the Horizon team on this to make sure I'm understanding this right (@bartekn, @ire-and-curses, @MonsieurNicolas)

Implementation comments

One question I had is around the semantics of "in-memory state vs. persisted state" during startup. Note: when I say "persisted" state in the context of captive core, I mean "whatever is in the database plus the in-memory LedgerTxn". When I say "in-memory" state, I'm talking about the internal state that core keeps track of to correctly proceed (such as the last closed ledger, assumed bucket state, last Herder state, etc.). I know that in this case "persisted" is not actually fully persisted, but it's easier to mentally separate the two categories this way (at least for me!).

So the typical startup flow is: start the app, load state from the database, and reason about what to do based on that state. With the introduction of this hybrid mode, our "persisted state" is inconsistent on startup, so core cannot set up the in-memory state correctly (i.e., ledger headers say "we're at ledger X", whereas LedgerTxn has no state at all). So I'm restoring the correct state first, before officially "starting the app and loading last known state". This seems like the safer thing to do, and the "restore state" code does not depend on any internal state (it only depends on persisted buckets, the HAS, etc.), but I wanted to double-check that there isn't anything missing in this approach. @MonsieurNicolas @graydon

@MonsieurNicolas
Contributor

Overall this makes sense to me. The one thing I am not sure about is having to call new-db if there is no database: new-db should work to force a clean state, but it seems a bit annoying to lose the existing captive core behavior that "just works".

On your last point: yeah, that seems to be the best way forward so as to keep things separate. The good news is that we have "state", so it should be relatively easy to spin up/spin down (if needed) a different "app" before doing the rest of the work.

@marta-lokhova changed the title from "Allow re-using existing buckets when running captive core" to "Consider restoring previous captive core state on startup instead of catchup" on Mar 25, 2021
@graydon
Contributor

graydon commented Mar 25, 2021

I think it would help me understand the strategy if you could break down which tables (and which rows in the storestate table) are being saved between runs, and which are being recreated.
