
Ingestion: verify updating data against stellar-core #1550

Closed
ire-and-curses opened this issue Jul 30, 2019 · 7 comments
Labels
ingest New ingestion system

Comments

@ire-and-curses
Contributor

To gain confidence in the new ingestion system, and as a sanity check in production, every few ledgers we should compute a hash of the currently ingested ledger state and compare it to the equivalent hash of stellar-core's data.

Although we could, if necessary, do this via direct DB access in the short term, to implement this well we need to plan with the core team the best way for stellar-core to expose the hash for Horizon to consume.

@ire-and-curses ire-and-curses added the ingest New ingestion system label Jul 30, 2019
@bartekn
Contributor

bartekn commented Aug 9, 2019

Given that adding changes to stellar-core may take some time (and we don't have any specific proposal yet), here's a simple algorithm we can implement now. Returns false if the state is invalid, true otherwise:

  1. When a checkpoint ledger is ingested (or every x checkpoint ledgers), start a new goroutine.
  2. Check if a verification routine is already running; exit if so. This prevents multiple routines running if the algorithm takes longer than 5 minutes (checkpoints are created every 5 minutes).
  3. In the new goroutine, start a REPEATABLE READ transaction (call ROLLBACK before returning from the function).
  4. Create a temporary table with a single column for sha256 hashes and a primary key index.
  5. Stream all rows from the state tables (accounts, accountdata, offers, trustlines) in the database; for each entry:
    1. convert it to xdr.LedgerEntryData,
    2. marshal it to XDR,
    3. calculate its sha256 hash,
    4. batch-insert the hashes into the temporary table.
  6. Stream all ledger entries from the buckets for the given checkpoint ledger; for each entry:
    1. marshal it to XDR,
    2. calculate its sha256 hash,
    3. batch-delete the hashes from the temporary table; if a key is not found, return false.
  7. If there are any hashes left in the temporary table, return false.
  8. Return true.

Memory requirements: depend on the batch size and the state reader's TempSet implementation.
Disk requirements: for the public network as of now, ~5.5M rows × the size of a row with a single sha256 hash column, plus index size.

If an invalid state is detected, Horizon saves a special flag to the database and panics. On startup, if the flag is set, Horizon refuses to start and prints an error about the invalid state.
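
For concreteness, here is a minimal Go sketch of the routine above. It is not the actual implementation: `streamStateEntryXDRs` and `streamBucketEntryXDRs` are hypothetical helpers standing in for steps 5.1–5.2 and 6.1 (they yield each entry already converted and XDR-marshalled), the goroutine launch from step 1 is omitted, and the single-row inserts/deletes stand in for the batching in steps 5.4 and 6.3.

```go
package ingest

import (
	"context"
	"crypto/sha256"
	"database/sql"
	"sync/atomic"
)

// Hypothetical helpers: the real versions would stream entries from the
// Horizon state tables and from the checkpoint buckets respectively, each
// converted to xdr.LedgerEntryData and marshalled to XDR bytes.
func streamStateEntryXDRs(tx *sql.Tx) <-chan []byte {
	ch := make(chan []byte)
	close(ch) // stub
	return ch
}

func streamBucketEntryXDRs(checkpointLedger uint32) <-chan []byte {
	ch := make(chan []byte)
	close(ch) // stub
	return ch
}

// verifyRunning guards against overlapping verification runs (step 2).
var verifyRunning int32

// verifyState returns false if the ingested state does not match the buckets.
func verifyState(db *sql.DB, checkpointLedger uint32) (bool, error) {
	if !atomic.CompareAndSwapInt32(&verifyRunning, 0, 1) {
		return true, nil // a previous run is still in progress, skip
	}
	defer atomic.StoreInt32(&verifyRunning, 0)

	// Step 3: REPEATABLE READ transaction, always rolled back.
	tx, err := db.BeginTx(context.Background(),
		&sql.TxOptions{Isolation: sql.LevelRepeatableRead})
	if err != nil {
		return false, err
	}
	defer tx.Rollback()

	// Step 4: temporary table with a single sha256 column and a primary key.
	if _, err := tx.Exec(`CREATE TEMPORARY TABLE state_hashes (hash bytea PRIMARY KEY)`); err != nil {
		return false, err
	}

	// Step 5: insert a hash for every entry built from the Horizon state tables.
	for raw := range streamStateEntryXDRs(tx) {
		h := sha256.Sum256(raw)
		if _, err := tx.Exec(`INSERT INTO state_hashes (hash) VALUES ($1)`, h[:]); err != nil {
			return false, err
		}
	}

	// Step 6: delete the hash of every entry found in the checkpoint buckets.
	for raw := range streamBucketEntryXDRs(checkpointLedger) {
		h := sha256.Sum256(raw)
		res, err := tx.Exec(`DELETE FROM state_hashes WHERE hash = $1`, h[:])
		if err != nil {
			return false, err
		}
		if n, _ := res.RowsAffected(); n == 0 {
			return false, nil // entry in buckets but missing from Horizon
		}
	}

	// Steps 7–8: anything left over exists in Horizon but not in the buckets.
	var leftover int
	if err := tx.QueryRow(`SELECT count(*) FROM state_hashes`).Scan(&leftover); err != nil {
		return false, err
	}
	return leftover == 0, nil
}
```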

@ire-and-curses
Contributor Author

@bartekn Does the stellar-core HTTP info endpoint provide enough information for this? The doc says:

{
      "build" : "v11.1.0",
      "history_failure_rate" : "0",
      "ledger" : {
         "age" : 3,
         "baseFee" : 100,
         "baseReserve" : 5000000,
         "closeTime" : 1560350852,
         "hash" : "40d884f6eb105da56bea518513ba9c5cda9a4e45ac824e5eac8f7262c713cc60",
         "maxTxSetSize" : 1000,
         "num" : 24311579,
         "version" : 11
      },

which looks like the ledger hash for ledger number `num`.
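
For reference, a small sketch of reading those two fields in Go, assuming the default HTTP command port (11626) and the response shape quoted above; real deployments may expose the port elsewhere or wrap/extend this payload:

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// infoResponse mirrors only the fields of the /info payload quoted above.
type infoResponse struct {
	Ledger struct {
		Hash string `json:"hash"`
		Num  uint32 `json:"num"`
	} `json:"ledger"`
}

func main() {
	// Assumes stellar-core's HTTP command port is reachable on localhost:11626.
	resp, err := http.Get("http://localhost:11626/info")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	var info infoResponse
	if err := json.NewDecoder(resp.Body).Decode(&info); err != nil {
		panic(err)
	}
	fmt.Printf("ledger %d closed with hash %s\n", info.Ledger.Num, info.Ledger.Hash)
}
```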

@bartekn
Contributor

bartekn commented Aug 21, 2019

The ledger hash is helpful, but not for state checking. As far as I remember, it's a hash of the transaction set for a given ledger, so we can use it to ensure that the set of transactions provided by the ledger backend matches the hash in the ledger header.
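
A sketch of what that check could look like, using the stellar/go xdr types; `computeTxSetHash` is a hypothetical stand-in (with an illustrative signature) for stellar-core's actual transaction-set hashing rules:

```go
package ingest

import "github.com/stellar/go/xdr"

// computeTxSetHash is hypothetical: the real hashing rules for a transaction
// set are defined by stellar-core, not sketched here.
func computeTxSetHash(prev xdr.Hash, txs []xdr.TransactionEnvelope) xdr.Hash {
	// ... real hashing rules go here ...
	return xdr.Hash{}
}

// txSetMatchesHeader checks that the transactions provided by the ledger
// backend hash to the transaction-set hash recorded in the ledger header.
func txSetMatchesHeader(header xdr.LedgerHeader, txs []xdr.TransactionEnvelope) bool {
	return computeTxSetHash(header.PreviousLedgerHash, txs) == header.ScpValue.TxSetHash
}
```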

@graydon
Contributor

graydon commented Aug 21, 2019

@bartekn Do you think it'd be adequate if we just surfaced the set of bucket level hashes? Or do you want a fine-grained set of hashes of each value in your database?

(am I understanding the plan correctly, that you want to hash every row in your database?)

@bartekn
Contributor

bartekn commented Aug 22, 2019

@graydon this is correct, and I actually have a prototype in the expingest-state-verifier branch. It works exactly as I described in #1550 (comment), but it also has a special transform function that transforms ledger entries so they result in the same hash. For example, currently in Horizon we only store signers for accounts. The transform function removes all fields from ledger entries read from buckets except those related to signers (the account ID, the master key weight in thresholds, and the signers array). I'll have a PR ready in the next few days.
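
Illustrative only (the actual transform lives in the expingest-state-verifier branch and the upcoming PR): a sketch of what such a transform for account entries might look like, keeping only the signer-related fields mentioned above:

```go
package ingest

import "github.com/stellar/go/xdr"

// transformAccountEntry keeps only the fields Horizon stores for signers:
// the account ID, the master key weight from thresholds, and the signers
// array. Entries of other types are returned unchanged.
func transformAccountEntry(entry xdr.LedgerEntry) xdr.LedgerEntry {
	if entry.Data.Type != xdr.LedgerEntryTypeAccount {
		return entry
	}
	account := entry.Data.MustAccount()

	stripped := xdr.AccountEntry{
		AccountId: account.AccountId,
		// Zero out everything in thresholds except the master key weight.
		Thresholds: xdr.Thresholds{account.Thresholds[0], 0, 0, 0},
		Signers:    account.Signers,
	}
	return xdr.LedgerEntry{
		LastModifiedLedgerSeq: entry.LastModifiedLedgerSeq,
		Data: xdr.LedgerEntryData{
			Type:    xdr.LedgerEntryTypeAccount,
			Account: &stripped,
		},
	}
}
```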

bartekn added a commit that referenced this issue Aug 30, 2019

This commit adds a new job and nightly workflow to compare state
ingested using `SingleLedgerStateReader` with an actual state in
Stellar-Core DB.

In #1550 we are building a tool to periodically compare state in Horizon
DB vs state in history archives. This way we can find if the state
updated using txmeta is the same as in buckets for a given checkpoint.
However, in case there is a bug in `SingleLedgerStateReader` the check
may succeed even though the state is actually invalid. We have good
tests for `SingleLedgerStateReader` but it's always better to be safe
than sorry.
@bartekn
Contributor

bartekn commented Aug 30, 2019

The algorithm above depends on the correctness of `SingleLedgerStateReader`. Added #1677 to check the diff between the state in the stellar-core DB and the state fetched by `SingleLedgerStateReader`.

@bartekn
Contributor

bartekn commented Sep 12, 2019

Closed in #1691.

@bartekn bartekn closed this as completed Sep 12, 2019