-
Notifications
You must be signed in to change notification settings - Fork 511
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ingest: Extract ledger entry changes from Tx Meta in a deterministic order #5070
Conversation
This change is surprising to me: the key order in meta has never been deterministic (since 2015), so what changed in Horizon and/or clients? If this is it to help with testing, it seems odd to take a dependency on ordering in production? Also, meta already stored basically everywhere (Horizon DB, core-DB in "legacy mode") is already non-canonical so this change may not be enough. |
I mean: if it makes the logic simpler, you can try to normalize meta in some way across all data sources, but I am not sure it's something you really want. |
Having horizon ingestion be deterministic and reproducible makes it a lot easier for us catch bugs in release candidates. The most straightforward way to guarantee that horizon ingestion is deterministic is to ensure that the order of tx meta is deterministic. What would be the alternative approach?
This change will re-sort tx meta before Horizon consumes it in ingestion. The sorting will be applied regardless if the source of the tx meta is captive core or the core-db in legacy mode. |
Maybe it depends at which layer you test, which is why I was asking about what changed in Horizon: I thought you would just compare ingested state with expected state (so set comparisons) and not really worry about the format of the diff. |
@MonsieurNicolas Horizon stores some ledger entries in its database and those are periodically compared against the ledger entries from history archive snapshots. That comparison is a set comparison like you described (where ordering doesn't matter). But we also compare horizon effects between the release candidate and the last stable release. The ordering of meta does affect the ordering of horizon effects in some cases. Here is an example comparing the effects response for a specific operation in horizon staging and horizon production In staging the effects response looks like:
In production the effects response looks like:
The responses are almost identical. The only difference is in the After doing some digging, I found that this issue of random tx meta ordering affecting the effects response has occurred before in #3942 . If you look at the conversation in the issue, @2opremio argues that core should provide a deterministic ordering and he was met with objections from Jon Jove who was concerned about the impact of core's performance. The way we fixed this issue is by adding sorting logic locally in the ingestion processor: We could fix this new issue in the same way by adding sorting logic locally in this part of the code: go/services/horizon/internal/ingest/processors/effects_processor.go Lines 262 to 267 in facabfc
However, I believe a more robust solution would be sorting tx meta at the source either in core (the ideal solution) or from the ingestion library which reads tx meta from core (which is what this PR does). |
// but if that is the case, we fall back to the original ordering of the changes | ||
// by using a stable sorting algorithm. | ||
func sortChanges(changes []Change) []Change { | ||
sort.Stable(newSortableChanges(changes)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since sortableChanges
can mutate the underlying array in changes
passed from caller as part of Swap method, should a copy of changes
be made to avoid side effect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the intent was to mutate changes
and avoid making a copy of the slice. I can make sortChanges()
have a void return value to make that intent more clear
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, that works also, thanks!
ingest/ledger_change_reader.go
Outdated
@@ -139,6 +139,7 @@ func (r *LedgerChangeReader) Read() (Change, error) { | |||
Post: nil, | |||
} | |||
} | |||
changes = sortChanges(changes) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this now results in all changes being (re)sorted, which is great for the goal stated, do you think it may have potential to trigger false warnings from verify-range as that compares history table dumps, at least in the interim while the production dumps are from a horizon release prior to the one having this change, maybe it's just an aspect worth mentioning in the changelog notes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think it may have potential to trigger false warnings from verify-range ?
yes, that's true but we actually already have false positives from verify-range due to random ordering of tx meta from core. In fact, I discovered this issue by debugging verify-range failures on the soroban release candidate.
The aim of this PR is to fix this issue for all releases going forward. If we want to compare a release candidate with an older version then we would need to cherry-pick this commit which sorts the ingestion changes deterministically on top of the older version. Then we could run verify-range on the release candidate and compare it against the older version + cherry-picked commit.
Good call about adding this to the changelog notes. I can do that
Thanks @tamirms for the background. I didn't realize this was needed for effects that don't have good global "IDs". Soroban-RPC / other ingesters may not need this though (but maybe we have a lot of spare CPU there)? As far as doing the sort at the core layer: as we'd be looking at some pretty bad migration (to eventually remove this from Horizon), it does not seem to be worth doing at this time (we may revisit this if we do SPVs/parallel execution). |
Yes, horizon effects try to correlate a meta change to a specific operation. In the cases where there is more than 1 meta change of the same ledger entry type associated to an operation we can ran into ordering issues unless we perform a sort.
Yes, this is true, soroban-rpc does not do anything similar to effects so it does not need tx meta to be ordered consistently |
PR Checklist
PR Structure
otherwise).
services/friendbot
, orall
ordoc
if the changes are broad or impact manypackages.
Thoroughness
.md
files, etc... affected by this change). Take a look in the
docs
folder for a given service,like this one.
Release planning
needed with deprecations, added features, breaking changes, and DB schema changes.
semver, or if it's mainly a patch change. The PR is targeted at the next
release branch if it's not a patch change.
What
In stellar/stellar-core#3952 we discovered that the tx meta emitted by captive core can appear in a random order. This is problematic for Horizon because we would like ingestion to always be deterministic and reproducible. For some releases we test our ingestion system by ingesting history twice, once with the release candidate and the second time with the previous stable version and then we assert that content produced by both runs are the same. If tx meta can appear in a random order this can result in horizon effects being produced in a different order.
This commit is an attempt to make the ordering of Tx Meta consistent in Horizon ingestion.
Known limitations
stellar/stellar-core#3954 is an attempt to fix this issue in core. Once that change is included in a core stable release we can revert this PR.