
Ingestion: verify updating data against stellar-core #1550

Closed
ire-and-curses opened this issue Jul 30, 2019 · 7 comments
Labels
ingest New ingestion system

Comments

@ire-and-curses
Contributor

To gain confidence in the new ingestion system, and as a sanity check in production, every few ledgers we should compute a hash of the currently ingested ledger state and compare it to the equivalent hash of stellar-core's data.

Although we could, if necessary, do this via direct DB access in the short term, to implement this well we need to plan with the core team the best way for stellar-core to expose the hash for Horizon to consume.

@ire-and-curses ire-and-curses added the ingest New ingestion system label Jul 30, 2019
@bartekn
Contributor

bartekn commented Aug 9, 2019

Given that adding changes to stellar-core may take some time (and we don't have any specific proposal yet), here's a simple algorithm we can implement now. Returns false if the state is invalid, true otherwise:

  1. When a checkpoint ledger is ingested (or every x checkpoint ledgers), start a new goroutine.
  2. Check if a verification routine is already running; exit if so. This prevents multiple routines running if the algorithm takes longer than 5 minutes (checkpoints are created every 5 minutes).
  3. In the new goroutine, start a REPEATABLE READ transaction (call ROLLBACK before returning from the function).
  4. Create a temporary table with a single column for sha256 hashes and a primary key index.
  5. Stream all rows from the state tables (accounts, accountdata, offers, trustlines) in the database; for each entry:
    1. convert it to xdr.LedgerEntryData,
    2. marshal it to XDR,
    3. calculate its sha256 hash,
    4. batch-insert the hashes into the temporary table.
  6. Stream all ledger entries from the buckets for the given checkpoint ledger; for each entry:
    1. marshal it to XDR,
    2. calculate its sha256 hash,
    3. batch-delete the hashes from the temporary table; if a key is not found, return false.
  7. If there are any hashes left in the temporary table, return false.
  8. Return true.

Memory requirements: depend on the batch size and the state reader's TempSet implementation.
Disk requirements: for the public network as of now, ~5.5M rows × the size of a row with a single sha256 hash column, plus index size.

If an invalid state is detected, Horizon saves a special flag to the database and panics. On startup, if the flag is set, Horizon refuses to start and prints an error about the invalid state.
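
For concreteness, here is a minimal Go sketch of the routine above. It is not the actual implementation: `streamStateEntryXDRs` and `streamBucketEntryXDRs` are hypothetical helpers standing in for steps 5.1–5.2 and 6.1 (they yield each entry already converted and XDR-marshalled), the goroutine launch from step 1 is omitted, and the single-row inserts/deletes stand in for the batching in steps 5.4 and 6.3.

```go
package ingest

import (
	"context"
	"crypto/sha256"
	"database/sql"
	"sync/atomic"
)

// Hypothetical helpers: the real versions would stream entries from the
// Horizon state tables and from the checkpoint buckets respectively, each
// converted to xdr.LedgerEntryData and marshalled to XDR bytes.
func streamStateEntryXDRs(tx *sql.Tx) <-chan []byte {
	ch := make(chan []byte)
	close(ch) // stub
	return ch
}

func streamBucketEntryXDRs(checkpointLedger uint32) <-chan []byte {
	ch := make(chan []byte)
	close(ch) // stub
	return ch
}

// verifyRunning guards against overlapping verification runs (step 2).
var verifyRunning int32

// verifyState returns false if the ingested state does not match the buckets.
func verifyState(db *sql.DB, checkpointLedger uint32) (bool, error) {
	if !atomic.CompareAndSwapInt32(&verifyRunning, 0, 1) {
		return true, nil // a previous run is still in progress, skip
	}
	defer atomic.StoreInt32(&verifyRunning, 0)

	// Step 3: REPEATABLE READ transaction, always rolled back.
	tx, err := db.BeginTx(context.Background(),
		&sql.TxOptions{Isolation: sql.LevelRepeatableRead})
	if err != nil {
		return false, err
	}
	defer tx.Rollback()

	// Step 4: temporary table with a single sha256 column and a primary key.
	if _, err := tx.Exec(`CREATE TEMPORARY TABLE state_hashes (hash bytea PRIMARY KEY)`); err != nil {
		return false, err
	}

	// Step 5: insert a hash for every entry built from the Horizon state tables.
	for raw := range streamStateEntryXDRs(tx) {
		h := sha256.Sum256(raw)
		if _, err := tx.Exec(`INSERT INTO state_hashes (hash) VALUES ($1)`, h[:]); err != nil {
			return false, err
		}
	}

	// Step 6: delete the hash of every entry found in the checkpoint buckets.
	for raw := range streamBucketEntryXDRs(checkpointLedger) {
		h := sha256.Sum256(raw)
		res, err := tx.Exec(`DELETE FROM state_hashes WHERE hash = $1`, h[:])
		if err != nil {
			return false, err
		}
		if n, _ := res.RowsAffected(); n == 0 {
			return false, nil // entry in buckets but missing from Horizon
		}
	}

	// Steps 7–8: anything left over exists in Horizon but not in the buckets.
	var leftover int
	if err := tx.QueryRow(`SELECT count(*) FROM state_hashes`).Scan(&leftover); err != nil {
		return false, err
	}
	return leftover == 0, nil
}
```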

@ire-and-curses
Contributor Author

@bartekn Does the stellar-core HTTP info endpoint provide enough information for this? The doc says:

{
      "build" : "v11.1.0",
      "history_failure_rate" : "0",
      "ledger" : {
         "age" : 3,
         "baseFee" : 100,
         "baseReserve" : 5000000,
         "closeTime" : 1560350852,
         "hash" : "40d884f6eb105da56bea518513ba9c5cda9a4e45ac824e5eac8f7262c713cc60",
         "maxTxSetSize" : 1000,
         "num" : 24311579,
         "version" : 11
      },

which looks like the ledger hash for ledger number `num`.
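
For reference, a small sketch of reading those two fields in Go, assuming the default HTTP command port (11626) and the response shape quoted above; real deployments may expose the port elsewhere or wrap/extend this payload:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// infoResponse mirrors only the fields of the /info payload quoted above.
type infoResponse struct {
	Ledger struct {
		Hash string `json:"hash"`
		Num  uint32 `json:"num"`
	} `json:"ledger"`
}

func main() {
	// Assumes stellar-core's HTTP command port is reachable on localhost:11626.
	resp, err := http.Get("http://localhost:11626/info")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var info infoResponse
	if err := json.NewDecoder(resp.Body).Decode(&info); err != nil {
		panic(err)
	}
	fmt.Printf("ledger %d closed with hash %s\n", info.Ledger.Num, info.Ledger.Hash)
}
```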

@bartekn
Contributor

bartekn commented Aug 21, 2019

The ledger hash is helpful, but not for state checking. As far as I remember, it's a hash of the transaction set for a given ledger, so we can use it to ensure that the set of transactions provided by the ledger backend matches the hash in the ledger header.
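
A sketch of what that check could look like, using the stellar/go xdr types; `computeTxSetHash` is a hypothetical stand-in (with an illustrative signature) for stellar-core's actual transaction-set hashing rules:

```go
package ingest

import "github.com/stellar/go/xdr"

// computeTxSetHash is hypothetical: the real hashing rules for a transaction
// set are defined by stellar-core, not sketched here.
func computeTxSetHash(prev xdr.Hash, txs []xdr.TransactionEnvelope) xdr.Hash {
	// ... real hashing rules go here ...
	return xdr.Hash{}
}

// txSetMatchesHeader checks that the transactions provided by the ledger
// backend hash to the transaction-set hash recorded in the ledger header.
func txSetMatchesHeader(header xdr.LedgerHeader, txs []xdr.TransactionEnvelope) bool {
	return computeTxSetHash(header.PreviousLedgerHash, txs) == header.ScpValue.TxSetHash
}
```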

@graydon
Contributor

graydon commented Aug 21, 2019

@bartekn Do you think it'd be adequate if we just surfaced the set of bucket level hashes? Or do you want a fine-grained set of hashes of each value in your database?

(am I understanding the plan correctly, that you want to hash every row in your database?)

@bartekn
Contributor

bartekn commented Aug 22, 2019

@graydon this is correct, and I actually have a prototype in the expingest-state-verifier branch. It works exactly as I described in #1550 (comment), but it also has a special transform function that transforms ledger entries so they result in the same hash. For example, currently in Horizon we only store signers for accounts. The transform function removes all fields from ledger entries read from buckets except those related to signers (the account ID, the master key weight in thresholds, and the signers array). I'll have a PR ready in the next few days.
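
Illustrative only (the actual transform lives in the expingest-state-verifier branch and the upcoming PR): a sketch of what such a transform for account entries might look like, keeping only the signer-related fields mentioned above:

```go
package ingest

import "github.com/stellar/go/xdr"

// transformAccountEntry keeps only the fields Horizon stores for signers:
// the account ID, the master key weight from thresholds, and the signers
// array. Entries of other types are returned unchanged.
func transformAccountEntry(entry xdr.LedgerEntry) xdr.LedgerEntry {
	if entry.Data.Type != xdr.LedgerEntryTypeAccount {
		return entry
	}
	account := entry.Data.MustAccount()

	stripped := xdr.AccountEntry{
		AccountId: account.AccountId,
		// Zero out everything in thresholds except the master key weight.
		Thresholds: xdr.Thresholds{account.Thresholds[0], 0, 0, 0},
		Signers:    account.Signers,
	}
	return xdr.LedgerEntry{
		LastModifiedLedgerSeq: entry.LastModifiedLedgerSeq,
		Data: xdr.LedgerEntryData{
			Type:    xdr.LedgerEntryTypeAccount,
			Account: &stripped,
		},
	}
}
```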

bartekn added a commit that referenced this issue Aug 30, 2019

This commit adds a new job and nightly workflow to compare state
ingested using `SingleLedgerStateReader` with an actual state in
Stellar-Core DB.

In #1550 we are building a tool to periodically compare state in Horizon
DB vs state in history archives. This way we can find if the state
updated using txmeta is the same as in buckets for a given checkpoint.
However, in case there is a bug in `SingleLedgerStateReader` the check
may succeed even though the state is actually invalid. We have good
tests for `SingleLedgerStateReader` but it's always better to be safe
than sorry.
@bartekn
Contributor

bartekn commented Aug 30, 2019

The algorithm above depends on the correctness of `SingleLedgerStateReader`. Added #1677 to check the diff between the state in the stellar-core DB and the state fetched by `SingleLedgerStateReader`.

@bartekn
Contributor

bartekn commented Sep 12, 2019

Closed in #1691.

@bartekn bartekn closed this as completed Sep 12, 2019