Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

services/horizon: Avoid combining SQL clauses when filtering is disabled #5250

Closed
wants to merge 2 commits into from

Conversation

Shaptic
Copy link
Contributor

@Shaptic Shaptic commented Mar 15, 2024

What

Stop adding the UNION ALL clause alongside a filter for transaction hashes in the history_transactions_filtered_tmp table when ingestion filtering is disabled.

Previously, the final SQL statement would look something like this:

ListOfPendingTxs = [ ... ]

SELECT `...[fields from ht and hl]...`
    FROM
        history_transactions ht LEFT JOIN history_ledgers hl 
        ON ht.ledger_sequence = hl.sequence 
    WHERE 
        (ht.ledger_sequence >= $1 AND 
        (ht.transaction_hash IN ListOfPendingTxs) OR 
        (ht.inner_transaction_hash ListOfPendingTxs))
UNION ALL 
SELECT `...[fields from ht and hl]...`
    FROM
        history_transactions_filtered_tmp ht LEFT JOIN history_ledgers hl 
        ON ht.ledger_sequence = hl.sequence 
    WHERE 
        (ht.ledger_sequence >= $1 AND 
        (ht.transaction_hash IN ListOfPendingTxs) OR 
        (ht.inner_transaction_hash ListOfPendingTxs))

whereas now it will only include the portion prior to the UNION ALL clause.

Why

There's no reason to have the additional clause when the history_transactions_filtered_tmp table is guaranteed to be empty if the EXP_ENABLE_INGESTION_FILTERING flag is set to false.

Known limitations

n/a

@Shaptic Shaptic added the performance issues aimed at improving performance label Mar 15, 2024
@Shaptic Shaptic requested a review from a team March 15, 2024 18:05
@@ -236,6 +236,7 @@ func initSubmissionSystem(app *App) {
DB: func(ctx context.Context) txsub.HorizonDB {
return &history.Q{SessionInterface: app.HorizonSession()}
},
LedgerState: app.ledgerState,
LedgerState: app.ledgerState,
FilteringDisabled: !app.config.EnableIngestionFiltering,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be easier to read if we had a FilteringEnabled field. That way you could pass in the field directly to AllTransactionsByHashesSinceLedger() without inverting it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the ingestion filtering enabled config flag is true by default, it's marked deprecated, won't be override-able soon. for most deployments, it's probably not being set, and defaulting to true.

may need to continue to check the config flag for now but also check that at least one filter rule is enabled to determine if effectively disabled, can probably wrap that as new method on the QFilter interface:

func (q *Q) IsFilteringEffective(ctx context.Context) (bool, error) {
	accountFilter, err := q.GetAccountFilterConfig(ctx)
	if err != nil {
		return false, err
	}

	assetFilter, err := q.GetAssetFilterConfig(ctx)
	if err != nil {
		return false, err
	}

	return accountFilter.Enabled || assetFilter.Enabled, nil
}

in db migrations, both rules were seeded in the db as disabled by default.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we're going to deprecate the flag and always enable filtering then I think we should leave the pending transactions query as is because, the over head of having to query the db on whether filtering rules are present is probably more inefficient than getting rid of the union in the pending transactions query. There is also the problem that querying the db on whether filtering rules are present is still vulnerable to a race condition where filtering rules are added right after we query the db.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, agree on that summary, 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah wow, I didn't realize we were deprecating the flag entirely! in that case there's no benefit to having this piece of code, since as you said @tamirms we'd need to actively query for filters to avoid a(n empty) subquery which defeats the purpose. thanks for the discussion!

@@ -30,21 +30,34 @@ func (q *Q) PreFilteredTransactionByHash(ctx context.Context, dest interface{},

// TransactionsByHashesSinceLedger fetches transactions from `history_transactions_filtered_tmp`
// table which match the given hash since the given ledger sequence (for perf reasons).
func (q *Q) AllTransactionsByHashesSinceLedger(ctx context.Context, hashes []string, sinceLedgerSeq uint32) ([]Transaction, error) {
func (q *Q) AllTransactionsByHashesSinceLedger(
Copy link
Contributor

@tamirms tamirms Mar 15, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you add a test which exercises both code paths where includeFiltered is true / false?

@Shaptic
Copy link
Contributor Author

Shaptic commented Mar 20, 2024

counterproductive per the above discussion: #5250 (comment)

@Shaptic Shaptic closed this Mar 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance issues aimed at improving performance
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants