Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/db/history: update the claimable balance SQL query #4398

Merged

Conversation

sreuland
Copy link
Contributor

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Change the claimable balance SQL Query generation to position LIMIT clause dependent on presence of cursor parameters as that determines how many effective rows and the indexes are chosen by db query planner.

Why

the query planner isn't picking the right indexes to use on claimable balance query due to LIMIT clause

Known limitations

…IT depending on whether paging cursor parameters are present or not
@sreuland sreuland requested a review from a team May 24, 2022 00:27
Copy link

@sydneynotthecity sydneynotthecity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After performing several more benchmarking experiments, I think we actually need to place the limit clause based on whether or not the claimants filter is present.

Regardless of pagination, if the user is filtering on claimants, the limit clause should be outside the subselect query:

") select " + claimableBalancesSelectStatement + " from cb LIMIT ?"

If the claimants filter is not present, the LIMIT clause should always reside in the subselect:

"LIMIT ?) select " + claimableBalancesSelectStatement + " from cb"

I tested with the following combinations:

  • filtering only asset
  • filtering asset with pagination
  • filtering only claimants
  • filtering claimants with pagination
  • filtering only sponsor
  • only pagination
  • filtering asset and claimants with pagination

the assets filter sees great performance improvement by always having the LIMIT clause within the subselect. Other combinations only had negligible improvement. Conversely, the claimants filter requires the LIMIT clause to always exist outside the subselect for optimal performance.

@2opremio 2opremio deleted the branch stellar:horizon-db-optimizations May 24, 2022 14:40
@2opremio 2opremio closed this May 24, 2022
@sreuland
Copy link
Contributor Author

@2opremio , I can adjust this to assemble the sql query based on @sydneynotthecity 's profiling, should I proceed on that or hold off?

@2opremio
Copy link
Contributor

Oh crap, I deleted the branch by mistake. Let me recreate it.

@2opremio 2opremio reopened this May 24, 2022
@2opremio
Copy link
Contributor

Done! No, we shouldn't hold off!

@sreuland
Copy link
Contributor Author

sreuland commented May 24, 2022

@sydneynotthecity @2opremio , I reworked that claimant balance query for LIMIT placement per the profiling results and recommendations. If the query has asset and claimant parameters, then i chose to let that follow the LIMIT placement per the claimant, i.e. outside query.

@sreuland sreuland requested a review from sydneynotthecity May 24, 2022 22:18
Copy link

@sydneynotthecity sydneynotthecity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One question regarding whether we need to add a space to the SQL syntax, otherwise looks good!

// we need to use WITH syntax and correct LIMIT placement to force the query planner to use the right
// indexes, otherwise when the limit is small, it will use an index scan
// which will be very slow once we have millions of records
limitClausePlacement := "LIMIT ?) select " + claimableBalancesSelectStatement + " from cb"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about adding a space before LIMIT? Or does the original statement insert the necessary space? Like " LIMIT ?) select " instead

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @sydneynotthecity , yes, looks like the squirrel Suffix takes care of spacing on concatenation, i dumped the final emitted sql under both conditions and looks ok with spacing:

# limit on the outer
WITH cb AS ( SELECT cb.id, cb.claimants, cb.asset, cb.amount, cb.sponsor, cb.last_modified_ledger, cb.flags FROM claimable_balances cb WHERE (cb.last_modified_ledger, cb.id) > (?, ?) AND cb.claimants @> '[{"destination": "GC3C4AKRBQLHOJ45U4XG35ESVWRDECWO5XLDGYADO6DPR3L7KIDVUMML"}]' ORDER BY cb.last_modified_ledger asc, cb.id asc ) select cb.id, cb.claimants, cb.asset, cb.amount, cb.sponsor, cb.last_modified_ledger, cb.flags from cb LIMIT ? 

# limit on the inner
WITH cb AS ( SELECT cb.id, cb.claimants, cb.asset, cb.amount, cb.sponsor, cb.last_modified_ledger, cb.flags FROM claimable_balances cb ORDER BY cb.last_modified_ledger asc, cb.id asc LIMIT ?) select cb.id, cb.claimants, cb.asset, cb.amount, cb.sponsor, cb.last_modified_ledger, cb.flags from cb 

thanks for double checking on that!

@sreuland sreuland merged commit da94a37 into stellar:horizon-db-optimizations May 25, 2022
sreuland added a commit that referenced this pull request May 26, 2022
* #4382: refined the placement of LIMIT clause on claimant balance query by claimant id per profiling results.
sreuland added a commit that referenced this pull request Jun 9, 2022
* #4382: refined the placement of LIMIT clause on claimant balance query by claimant id per profiling results.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants