Consider restoring previous captive core state on startup instead of catchup #2960

Closed
marta-lokhova opened this issue Mar 11, 2021 · 14 comments · Fixed by #2994

@marta-lokhova
Contributor

As suggested by @bartekn, consider skipping the download of a particular bucket if it's already present in the buckets folder. This would be useful, as captive core currently has to re-download all the buckets on every restart. Note that we should still verify buckets, to ensure that we are not processing invalid ones.
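This isn't the actual stellar-core implementation, just a rough sketch of the check under the assumption that buckets are content-addressed (the on-disk file name embeds the SHA-256 hash of the contents), so a cached copy can be verified without re-downloading it:

```cpp
// Sketch only (not stellar-core's API): decide whether a bucket referenced
// by the history archive state still needs to be downloaded, reusing a
// locally cached copy only if its content hash checks out.
#include <filesystem>
#include <fstream>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>
#include <openssl/sha.h>

namespace fs = std::filesystem;

// Re-hash a cached bucket file; buckets are content-addressed, so the
// expected hash is already known from the bucket's name.
static std::string
sha256HexOfFile(fs::path const& p)
{
    SHA256_CTX ctx;
    SHA256_Init(&ctx);
    std::ifstream in(p, std::ios::binary);
    std::vector<char> buf(1 << 16);
    while (in.read(buf.data(), buf.size()) || in.gcount() > 0)
    {
        SHA256_Update(&ctx, buf.data(), static_cast<size_t>(in.gcount()));
    }
    unsigned char digest[SHA256_DIGEST_LENGTH];
    SHA256_Final(digest, &ctx);
    std::ostringstream out;
    for (unsigned char b : digest)
    {
        out << std::hex << std::setw(2) << std::setfill('0') << int(b);
    }
    return out.str();
}

// Download only if the bucket is missing locally or fails verification.
bool
needsDownload(fs::path const& bucketDir, std::string const& hashHex)
{
    fs::path local = bucketDir / ("bucket-" + hashHex + ".xdr");
    return !fs::exists(local) || sha256HexOfFile(local) != hashHex;
}
```

If verification fails, falling back to a fresh download (rather than trusting the cached file) preserves the "we should still verify buckets" property above.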

@MonsieurNicolas
Contributor

That is a good idea; probably a little more complicated than this, though: since captive-core always starts on the genesis ledger, I think it will delete buckets that are not referenced by anything as part of "bucket garbage collection", so that may have to change as well.

@marta-lokhova
Contributor Author

Yeah, I'm aware of the garbage collection issue. I don't think it's very complicated, though: we'd just need to carefully evaluate where the garbage collection calls sit during startup and catchup, and opt out of GC during specific parts of catchup (i.e., bucket apply; GC would still happen during ledger application, of course).
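Purely as an illustration of the "opt out during bucket apply" idea (this is not an existing stellar-core mechanism), a scoped guard could suppress bucket GC for exactly the duration of bucket application:

```cpp
// Hypothetical RAII guard: disables bucket garbage collection while alive
// (e.g. for the duration of bucket apply) and restores the previous setting
// afterwards. The flag it toggles is illustrative, not a real member of
// BucketManager.
class BucketGcPause
{
  public:
    explicit BucketGcPause(bool& gcEnabled)
        : mFlag(gcEnabled), mPrevious(gcEnabled)
    {
        mFlag = false; // no GC while buckets are being applied
    }
    ~BucketGcPause()
    {
        mFlag = mPrevious; // GC resumes during normal ledger application
    }
    BucketGcPause(BucketGcPause const&) = delete;
    BucketGcPause& operator=(BucketGcPause const&) = delete;

  private:
    bool& mFlag;
    bool mPrevious;
};
```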

@marta-lokhova self-assigned this on Mar 11, 2021
@MonsieurNicolas
Contributor

Actually, maybe the problem is something else entirely: if what people are trying to do is make captive core restart fast, then the solution is to have a database set up (sqlite) but not use it to store the ledger (or transactions, etc.).

When set up like this, all core needs to do on startup is apply buckets to rebuild its in-memory state. With that done, it can even catch up to the network without having to perform a full catchup from history.

That way the HAS (history archive state) will be there, so this garbage collection problem won't even need to be fixed.

@MonsieurNicolas
Contributor

@ire-and-curses does that sound like it would help with the captive core situation? Basically, we would run captive-core in a hybrid mode that uses a small database (sqlite) alongside buckets. This would also let us keep peer information around, which helps when restarting a node (today captive core has to rediscover peers with capacity on every restart, and that can take a while).

@ire-and-curses

Yeah that does sound like it would help. I guess it depends on how the startup time is broken down. Where is the time spent?

Where we're coming from: we recently learned from @brahman81 and @jacekn that patching and restarting stand-alone stellar-core has no restart delay, and we've been trying to understand what's different in the captive core case.

How big would this database be? Does it change anything with the current need to store gigabytes of bucket files?

Regarding the original proposal: I think which way to go really depends on which option gives the biggest improvement to initial startup time relative to implementation cost.

@MonsieurNicolas
Contributor

I just talked to @bartekn about this, so I am adding a few more details:

  • when running in this mode, the sqlite database location should default to the buckets folder (so as to make it "0-config"). I think it already works like this (note: Horizon knows about it).
  • the sqlite database should be on the order of 1 MB or less (the only large data set is the overlay data)
  • the change should be transparent to Horizon
    • on startup, Horizon specifies the "ledger + hash" for the LCL, so we can use this
    • if the specified LCL matches the one core has, we can rebuild the in-memory ledger state right away, without waiting for overlay; if it does not match, we can use some simple heuristics to apply it anyway (e.g., if the ledger is "not too far in the future", say within 10 ledgers); otherwise we just don't use it, and the behavior is like today, meaning we have to wait for a checkpoint ledger

@MonsieurNicolas
Contributor

I guess one of the things we probably need before doing this is to quantify all of this:

  • we know that "waiting for a checkpoint" takes over 2.5 minutes on average
  • how long does "download & verify" buckets take?
  • how long does "apply buckets" (in memory) take? If it takes a while, we may have to let it run "in the background" (so as to buffer ledgers), but this probably complicates the work quite a bit
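As a rough sanity check on the first number (assuming the standard 64-ledger checkpoint spacing and roughly 5-second ledger close times): the average wait for the next checkpoint is about 32 ledgers × 5 s ≈ 160 s, or roughly 2.7 minutes, which lines up with the figure above.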

@marta-lokhova
Contributor Author

marta-lokhova commented Mar 22, 2021

(tried this on my local machine in case it's useful)

how long does "download & verify" buckets take?

Looks like it takes about 80-90 seconds with --in-memory mode.

how long does "apply buckets" (in memory) take?

Bucket application takes ~30 seconds.

@MonsieurNicolas
Contributor

OK, so with those numbers it looks like the only viable change is what I outlined earlier, since the node would only have to rebuild its in-memory state (~30 seconds); in that case we're still within the window for keeping the node in sync after a restart.

@marta-lokhova
Contributor Author

Implementation-wise, one thing to note here is that in order to restore the in-memory state correctly, we need the most recent ledger header (from what I see, LedgerManager relies on having the correct last closed ledger header). We do not store ledger headers in this mode, but we do store some information in the storestate table, such as the ledger hash and the HAS. So I think we'd need to add another row to storestate with the ledger header information. Alternatively, we could only store the latest ledger header in the headers table when in this mode, but this option seems more complicated.
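To make that concrete, here is a minimal sketch (not actual stellar-core code) of what persisting the header could look like, assuming the storestate table keeps its existing name/value shape (statename, state); the key name 'lastclosedledgerheader' and the choice of storing the base64-encoded XDR of the header are illustrative only:

```cpp
// Hypothetical sketch: upsert the last closed ledger header (already
// serialized to base64-encoded XDR by the caller) into a new storestate row.
#include <sqlite3.h>
#include <string>

bool
persistLastClosedHeader(sqlite3* db, std::string const& headerXdrB64)
{
    char const* sql =
        "INSERT INTO storestate (statename, state) "
        "VALUES ('lastclosedledgerheader', ?1) "
        "ON CONFLICT(statename) DO UPDATE SET state = excluded.state";
    sqlite3_stmt* stmt = nullptr;
    if (sqlite3_prepare_v2(db, sql, -1, &stmt, nullptr) != SQLITE_OK)
    {
        return false;
    }
    sqlite3_bind_text(stmt, 1, headerXdrB64.c_str(), -1, SQLITE_TRANSIENT);
    bool ok = (sqlite3_step(stmt) == SQLITE_DONE);
    sqlite3_finalize(stmt);
    return ok;
}
```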

@MonsieurNicolas
Contributor

Alternatively, we could only store the latest ledger header in the headers table when in this mode, but this option seems more complicated.

I don't think it's more complicated: it's about picking the right "configuration" mode. What's more complicated is adding new ways to do the same thing (which then need to be tested, etc.).

We already have a mechanism to persist the LCL; it's just (currently) disabled in this mode. To make it work, we may need to enable very aggressive "maintenance" so that we don't store too many of these, but that's about it.

@marta-lokhova
Contributor Author

So I've put together an initial prototype, and was able to join the network in ~30 seconds, which is promising! I do want to clarify a couple of things to make sure we are all on the same page.

The API

  • Once core is upgraded to the version that uses the new hybrid mode, the operator needs to run stellar-core new-db --partial-for-in-memory-mode (naming is TBD) to initialize the persistent database.
    • This will properly initialize the buckets folder and the new small database that stores the minimal data needed in this hybrid mode. The new database will be initialized to the genesis ledger.
  • Once the database is ready, core may start with the --in-memory flag. It will have to do the initial full catchup in order to populate the buckets.
  • After that, there are several options, depending on the ledger that captive core is restarted with (see the sketch after this list):
    • --start-at-ledger parameter equal to the LCL in the database. In this case, the previous state will be rebuilt automatically before the node starts listening to network traffic. This step takes ~30 seconds.
      • Question here: in terms of bucket validity, can we assume that persisted buckets remain valid between restarts? Specifically, I'm working out the expected behavior when buckets are missing or corrupt. Options are: (a) rebuild from scratch anyway, ignoring the inconsistencies (I don't like this option, as it may hide real issues), or (b) complain/throw so that the Horizon operator has to recover/rebuild manually. Basically, can we assume that if core previously downloaded and verified buckets, those can be reused on restart?
    • --start-at-ledger parameter that is ahead of the LCL by a few ledgers (Nicolas suggested 10). In this case, restore the state anyway; we should be able to replay the needed ledgers when we hear from the network or do catchup. The only open question is that we might stream a few extra ledgers. Is that an issue?
    • --start-at-ledger parameter that is before the LCL. We can't stream older ledgers with the existing state, so we rebuild from scratch (full catchup).
    • --start-at-ledger parameter not configured at all. In this case, the node rebuilds from scratch and catches up to the latest ledger from the network. This is to ensure that operators have visibility and full control over which ledgers will be streamed (since in this case we don't know the latest ledger until we listen to the network). Does that sound right?
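To summarize the four cases above in one place, here is a compact sketch of the proposed decision rule (illustrative only; the function and names are not settled API, and the 10-ledger tolerance is just the value suggested earlier in the thread):

```cpp
#include <cstdint>
#include <optional>

enum class StartupAction
{
    RESTORE_FROM_LOCAL_STATE, // rebuild in-memory state from cached buckets (~30s)
    RESTORE_AND_REPLAY,       // restore, then replay the small gap from the network
    FULL_CATCHUP              // rebuild from scratch via history archives
};

// startAtLedger is empty when --start-at-ledger is not configured at all.
StartupAction
chooseStartupAction(std::optional<std::uint32_t> startAtLedger,
                    std::uint32_t lcl)
{
    std::uint32_t const kMaxLedgersAhead = 10; // heuristic suggested above

    if (!startAtLedger)
    {
        // No target ledger given: rebuild and catch up to the latest
        // network ledger so the operator keeps control over what streams.
        return StartupAction::FULL_CATCHUP;
    }
    if (*startAtLedger == lcl)
    {
        return StartupAction::RESTORE_FROM_LOCAL_STATE;
    }
    if (*startAtLedger > lcl && *startAtLedger - lcl <= kMaxLedgersAhead)
    {
        return StartupAction::RESTORE_AND_REPLAY;
    }
    // Behind the LCL (or too far ahead): the local state can't be reused.
    return StartupAction::FULL_CATCHUP;
}
```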

Would love to hear feedback from the Horizon team on this to make sure I'm understanding this right (@bartekn, @ire-and-curses, @MonsieurNicolas)

Implementation comments

One question I had is around the semantics of "in-memory state vs. persisted state" during startup. Note: when I say "persisted" state in the context of captive core, I mean "whatever is in the database plus the in-memory LedgerTxn". When I say "in-memory" state, I'm talking about the internal state that core keeps track of to correctly proceed (such as the last closed ledger, assumed bucket state, last Herder state, etc.). I know that in this case "persisted" is not actually fully persisted, but it's easier to mentally separate the two categories this way (at least for me!).

So the typical startup flow is: start the app, load state from the database, and reason about what to do based on that state. With the introduction of this hybrid mode, our "persisted state" is inconsistent on startup, so core cannot set up the in-memory state correctly (i.e., ledger headers say "we're at ledger X", whereas LedgerTxn has no state at all). So I'm restoring the correct state first, before officially "starting the app and loading last known state". This seems like the safer thing to do, and the "restore state" code does not depend on any internal state (it only depends on persisted buckets, the HAS, etc.), but I wanted to double-check that there isn't anything missing in this approach. @MonsieurNicolas @graydon

@MonsieurNicolas
Contributor

Overall this makes sense to me. The one thing I am not sure about is having to call new-db if there is no database: new-db should work to force a clean state, but it seems a bit annoying to lose the existing captive core behavior that "just works".

On your last point: yeah, that seems to be the best way forward so as to keep things separate. The good news is that we have "state", so it should be relatively easy to spin up/spin down (if needed) a different "app" before doing the rest of the work.

@marta-lokhova changed the title from "Allow re-using existing buckets when running captive core" to "Consider restoring previous captive core state on startup instead of catchup" on Mar 25, 2021
@graydon
Contributor

graydon commented Mar 25, 2021

I think it would help me understand the strategy if you could break down which tables (and which rows in the storestate table) are being saved between runs, and which are being recreated.
