
Horizon: /account/:id/payments timeout error for accounts with large number of trades #1808

Open · orbitlens opened this issue Oct 3, 2019 · 19 comments
Labels: horizon, performance (issues aimed at improving performance)

@orbitlens

This request fails with a 504 error after a ~20s loading timeout:
https://horizon.stellar.org/accounts/GAMU5TQFUMDGVKYQPPDCD2MKKUUWELSQAEKNNU4RFQCWFSRBPJA2MAGQ/payments?order=asc&limit=200

I suspect that the root of the problem lies in the huge number of trades (120K trades and only 3 payments). The same endpoint for other accounts shows decent performance (for example, this one).

@ire-and-curses
Member

Interesting - I just tried it and it came back in under 5s.

@orbitlens
Author

Checked again. Ran it 5 times – took ~5s on average.
Still relatively slow, but at least it doesn't return an error. Maybe the server was at capacity 6 hours ago. I checked 4 times before submitting the issue, and each time received a 504 error after waiting about 20 seconds.

Could you please check Horizon server logs?

To my mind, the problem is worth investigating. In my experience, any query that executes for longer than 150-200ms is a potential problem, because such requests tend to provoke "avalanches" under heavy load: each running query reserves a portion of the DB user memory buffers for grouping and sorting operations, making subsequent queries slower and slower, as each new SQL request may evict cached results and indexes from the shared memory pool, depending on DB settings. That's usually not the case when the server has 256+ GB of memory, but it may cause degraded performance on limited-memory setups.
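For context, in PostgreSQL terms (an assumption at this point in the thread, though later comments confirm a Postgres backend) the settings involved are the per-operation sort/hash memory and the shared page cache:

SHOW work_mem;        -- memory each per-query sort/hash operation may claim
SHOW shared_buffers;  -- shared page cache that cached indexes and heap pages compete for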

@ire-and-curses
Member

Yeah we'll take a look. What was the timestamp when you tested?

@orbitlens
Author

> Yeah we'll take a look. What was the timestamp when you tested?

I submitted the issue at 2019-10-03T15:59:21.000Z, so probably between 2019-10-03T15:20:00.000Z and 2019-10-03T15:50:00.000Z , or 1570116000 - 1570117800 UNIX time.

@ire-and-curses ire-and-curses added this to the Horizon 0.23.0 milestone Oct 15, 2019
@bartekn bartekn self-assigned this Oct 16, 2019
@bartekn
Contributor

bartekn commented Oct 28, 2019

We've discovered that some other queries don't use proper indexes. We're currently verifying with the ops team when VACUUM ANALYZE was last run on the Horizon DB and whether that's the issue. If not, we'll look into this issue again.

@bartekn
Contributor

bartekn commented Nov 13, 2019

OK, it looks like VACUUM ANALYZE improved the speed of some queries; however, there are still corner cases where this query is extremely slow. I think @orbitlens is right that it's connected to the distribution of payment ops in a given account's history. Here's the EXPLAIN ANALYZE for one such query:

                                                                                              QUERY PLAN                                                                                               
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.72..717.44 rows=50 width=675) (actual time=302.980..515589.945 rows=44 loops=1)
   ->  Nested Loop Left Join  (cost=1.72..15096986.73 rows=1054669 width=675) (actual time=302.980..515589.894 rows=44 loops=1)
         Filter: (ht.successful OR (ht.successful IS NULL))
         Rows Removed by Filter: 8
         ->  Nested Loop  (cost=1.15..11385841.43 rows=1657193 width=422) (actual time=302.968..515487.630 rows=52 loops=1)
               ->  Index Only Scan Backward using hist_op_p_id on history_operation_participants hopp  (cost=0.57..127568.64 rows=4716867 width=8) (actual time=0.022..20268.779 rows=3996530 loops=1)
                     Index Cond: ((history_account_id = '5308433'::bigint) AND (history_operation_id < '9223372036854775807'::bigint))
                     Heap Fetches: 70986
               ->  Index Scan using index_history_operations_on_id on history_operations hop  (cost=0.57..2.38 rows=1 width=414) (actual time=0.123..0.123 rows=0 loops=3996530)
                     Index Cond: (id = hopp.history_operation_id)
                     Filter: (type = ANY ('{0,1,2,13,8}'::integer[]))
                     Rows Removed by Filter: 1
         ->  Index Scan using index_history_transactions_on_id on history_transactions ht  (cost=0.57..2.23 rows=1 width=261) (actual time=1.960..1.961 rows=1 loops=52)
               Index Cond: (id = hop.transaction_id)
 Planning time: 0.425 ms
 Execution time: 515590.026 ms
(16 rows)

My understanding is that Postgres runs a nested loop (to join the tables): the outer loop finds all operations for a given account, then the inner loop checks whether each operation is actually a payment. When there are many non-payment ops, Postgres has to skip over them until it finds the specified number (LIMIT) of payment operations.
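For reference, the query behind that plan has roughly this shape (reconstructed from the plan above, so the column list and parameter values are approximate):

SELECT hop.*, ht.successful
FROM history_operation_participants hopp
JOIN history_operations hop ON hop.id = hopp.history_operation_id
LEFT JOIN history_transactions ht ON ht.id = hop.transaction_id
WHERE hopp.history_account_id = 5308433
  AND hopp.history_operation_id < 9223372036854775807
  AND hop.type IN (0, 1, 2, 13, 8)
  AND (ht.successful OR ht.successful IS NULL)
ORDER BY hopp.history_operation_id DESC
LIMIT 50;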

The problem is that it's not possible to improve the query with the existing schema. The data required for this query lives in two different tables: the operation type in history_operations and the operation participants in history_operation_participants.

It looks like the only reasonable, long-term solution is to:

  1. Add a new type field on history_operation_participants (it's easy to fill past operations with a simple UPDATE query during migrations).
  2. Create a new index: hist_type_op_p_id UNIQUE, btree (history_account_id, type, history_operation_id).
  3. Change "payments for account" queries to use the index above.
  4. Update ingestion code to store the type in history_operation_participants.

Then all queries like:

WHERE hopp.history_account_id = '5308433' AND hopp.type IN ('0','1','2','13','8') AND hopp.history_operation_id < '9223372036854775807'

will use the index. I'll try this in the staging cluster this week.

@bartekn
Contributor

bartekn commented Nov 14, 2019

I tried the following steps:

ALTER TABLE history_operation_participants ADD type integer DEFAULT NULL;
UPDATE history_operation_participants SET type = (SELECT type FROM history_operations where id = history_operation_participants.history_operation_id);
VACUUM VERBOSE history_operation_participants; -- to reclaim space after UPDATE
CREATE UNIQUE INDEX hist_op_p_account_type_id ON history_operation_participants USING BTREE(history_account_id, type, history_operation_id);

The queries above took around 10 hours in total on an AWS RDS db.r4.8xlarge. Now, when I run the same query with a small modification (hopp.type in the WHERE clause instead of hop.type), here's the query plan I get:

                                                                                          QUERY PLAN                                                                                           
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.72..498.21 rows=50 width=676) (actual time=160.489..1892.501 rows=44 loops=1)
   ->  Nested Loop Left Join  (cost=1.72..11828672.21 rows=1191241 width=676) (actual time=160.488..1892.461 rows=44 loops=1)
         Filter: (ht.successful OR (ht.successful IS NULL))
         Rows Removed by Filter: 8
         ->  Nested Loop  (cost=1.15..7677326.14 rows=1832151 width=420) (actual time=160.479..1833.434 rows=52 loops=1)
               ->  Index Scan Backward using hist_op_p_id on history_operation_participants hopp  (cost=0.57..3216882.16 rows=1832151 width=8) (actual time=160.466..1768.630 rows=52 loops=1)
                     Index Cond: ((history_account_id = '5308433'::bigint) AND (history_operation_id < '9223372036854775807'::bigint))
                     Filter: (type = ANY ('{0,1,2,13,8}'::integer[]))
                     Rows Removed by Filter: 4011177
               ->  Index Scan using index_history_operations_on_id on history_operations hop  (cost=0.57..2.42 rows=1 width=412) (actual time=1.242..1.242 rows=1 loops=52)
                     Index Cond: (id = hopp.history_operation_id)
         ->  Index Scan using index_history_transactions_on_id on history_transactions ht  (cost=0.57..2.26 rows=1 width=264) (actual time=1.130..1.131 rows=1 loops=52)
               Index Cond: (id = hop.transaction_id)
 Planning time: 4.041 ms
 Execution time: 1892.633 ms

This is a huge improvement, but I was wondering if we could do better. 1892.633 ms is great compared to the ~10 minutes before, but I was expecting the execution time to be something like a few milliseconds. I was surprised that the new index (hist_op_p_account_type_id) I created is not used; instead, Postgres uses the old hist_op_p_id index and filters rows using the new type field in history_operation_participants.

After reading the Postgres docs again and running a couple more queries, I realized that the index doesn't help much when multiple values of a field are used (WHERE x IN (a, b, c)). After rewriting the query to use a single type (e.g. 1), the execution time was less than 1ms and the index was actually used!

                                                                                               QUERY PLAN                                                                                                
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.84..1457.26 rows=200 width=676) (actual time=0.096..0.773 rows=42 loops=1)
   ->  Nested Loop Left Join  (cost=1.84..7854899.60 rows=1079400 width=676) (actual time=0.095..0.753 rows=42 loops=1)
         Filter: (ht.successful OR (ht.successful IS NULL))
         Rows Removed by Filter: 8
         ->  Nested Loop  (cost=1.27..4093308.33 rows=1660137 width=420) (actual time=0.087..0.405 rows=50 loops=1)
               ->  Index Only Scan Backward using hist_op_p_account_type_id on history_operation_participants hopp  (cost=0.70..45584.78 rows=1660137 width=8) (actual time=0.077..0.113 rows=50 loops=1)
                     Index Cond: ((history_account_id = '5308433'::bigint) AND (type = 1) AND (history_operation_id < '9223372036854775807'::bigint))
                     Heap Fetches: 0
               ->  Index Scan using index_history_operations_on_id on history_operations hop  (cost=0.57..2.43 rows=1 width=412) (actual time=0.005..0.005 rows=1 loops=50)
                     Index Cond: (id = hopp.history_operation_id)
         ->  Index Scan using index_history_transactions_on_id on history_transactions ht  (cost=0.57..2.26 rows=1 width=264) (actual time=0.005..0.005 rows=1 loops=50)
               Index Cond: (id = hop.transaction_id)
 Planning time: 0.490 ms
 Execution time: 0.842 ms

To sum it up, I think I'm going to test a new field and index like this:

ALTER TABLE history_operation_participants ADD is_payment boolean DEFAULT NULL;
UPDATE history_operation_participants SET is_payment = (SELECT type IN ('0','1','2','13','8') FROM history_operations where id = history_operation_participants.history_operation_id);
VACUUM VERBOSE history_operation_participants; -- to reclaim space after UPDATE
CREATE UNIQUE INDEX hist_op_p_account_payment_id ON history_operation_participants USING BTREE(history_account_id, is_payment, history_operation_id);

After the new migration, we should be able to rewrite payment queries' WHERE clause from hop.type IN ('0','1','2','13','8') to hopp.is_payment = true.

@tomquisel
Contributor

This is an impressive improvement!

It seems a little special-cased, though. We might forget to update is_payment as new types of operations are added, and it doesn't generalize to other types of operations. We may want to grab only offer operations or settings operations in the future, like stellar.expert does.

It seems like the query analyzer is just straight getting it wrong. If you ran separate queries, one for each of those 5 payment types, it seems like it'd be much faster than the 1.8 seconds that's happening above. The logic is that the cost of the is_payment query is probably similar to the single type query, so you'd end up with about 4ms running that query 5 times.

The ideal would be to figure out a way to get the query optimizer to do this right, but another option is to execute those 5 queries and concatenate their results.
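A rough sketch of that option, assuming the (history_account_id, type, history_operation_id) index from above (each parenthesized branch can use the index, and the outer ORDER BY/LIMIT merges the per-type results; operation details would still be joined in afterwards):

(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND type = 0 AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
UNION ALL
(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND type = 1 AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
UNION ALL
(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND type = 2 AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
UNION ALL
(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND type = 8 AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
UNION ALL
(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND type = 13 AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
ORDER BY history_operation_id DESC
LIMIT 50;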

Not a big deal if you're already well down the is_payment path though.

@orbitlens
Author

@bartekn After a second glance at the query, I spotted that you are using stringified values for type here: AND hopp.type IN ('0','1','2','13','8'). Why so? I'm almost 100% sure that Pg is smart enough to handle the implicit type conversion automatically in this case, but still...

Also, have you tried building an expression index to support this particular query instead of introducing a new field? It might be better in terms of space savings. Something like:

CREATE INDEX idx_history_operations_is_payment
ON history_operations ((type IN (0,1,2,8,13)));

This won't provide the same performance as a new is_payment field on the history_operation_participants table though.

@bartekn
Contributor

bartekn commented Nov 18, 2019

> It seems like the query analyzer is just straight getting it wrong. If you ran separate queries, one for each of those 5 payment types, it seems like it'd be much faster than the 1.8 seconds that's happening above. The logic is that the cost of the is_payment query is probably similar to the single type query, so you'd end up with about 4ms running that query 5 times.

I think the query planner is sophisticated, but probably not sophisticated enough to realize that it's faster to do it that way instead of using a different index and then filtering rows.

I like the idea of either running the query multiple times and merging the results in Horizon code, or maybe using UNION. But maybe the simpler (and also future-proof) way is to create a group_type enum that can be payments, offers, settings, or whatever else we think could be useful for future queries.
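That could look roughly like this (a hypothetical sketch – the enum name, its members, and the index name are illustrative, not an agreed design):

CREATE TYPE operation_group_type AS ENUM ('payments', 'offers', 'settings');
ALTER TABLE history_operation_participants ADD group_type operation_group_type DEFAULT NULL;
CREATE UNIQUE INDEX hist_op_p_account_group_type_id
    ON history_operation_participants USING BTREE (history_account_id, group_type, history_operation_id);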

> @bartekn After a second glance at the query, I spotted that you are using stringified values for type here: AND hopp.type IN ('0','1','2','13','8'). Why so? I'm almost 100% sure that Pg is smart enough to handle the implicit type conversion automatically in this case, but still...

Yes, Postgres will convert the values before running the query, so it's fine. I used quotes because I copied the query directly from the slow query log, where all params are quoted, and forgot to remove them. But again, it shouldn't affect the performance at all.

> Also, have you tried building an expression index to support this particular query instead of introducing a new field?

This won't solve the problem, because the lack of an index on history_operations is not the issue here. What Postgres is doing now (in the broken query) is really a loop, something like:

for each row in index history_operation_participants {
    op := find the operation with this ID in history_operations
    // the line above is fast on its own, but slow when you do it thousands of times –
    // and you have to, because some accounts have a lot of non-payment ops
    check if op.type is one of the payment types
}

The first fix I made was adding type to history_operation_participants; the loop then changes to:

for each row in index history_operation_participants {
    check if the operation is one of payment types
}

It no longer needs to check the other table, but it still needs to filter out a lot of rows in history_operation_participants. The final fix (is_payment or similar – see the top of this comment) will remove the need for that check too, because we'll have an index on (history_account_id, is_payment, history_operation_id – the last one used for sorting), so the index alone will be enough to find the data (no table scan and no filtering of rows that don't match the query condition).

What could help is adding a partial history_operation_participants(type IN (0,1,2,8,13)) index, but the problem here is that partial indexes are matched at query planning time (https://www.postgresql.org/docs/9.5/indexes-partial.html), not at run time, which means we'd need to remove the params from "Payments for Account" queries. Removing the params maybe isn't a huge deal, but the problem is that you'd need to recreate the index whenever a new payment operation is added.
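For illustration, such a partial index might look like this (a hypothetical sketch – the index name is made up, it relies on the type column added earlier, and the WHERE predicate would have to appear verbatim in queries for the planner to consider it):

CREATE INDEX hist_op_p_account_payment_partial
    ON history_operation_participants USING BTREE (history_account_id, history_operation_id)
    WHERE type IN (0, 1, 2, 8, 13);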

@bartekn
Contributor

bartekn commented Nov 19, 2019

Today, during the Horizon team meeting, we decided to prioritize fixing this. Please add any other ideas to this issue by Nov 27th so we have time to implement this in the 0.24.0 release. Otherwise, we'll add type/group_type to history_operation_participants.

CC: @ire-and-curses @tamirms @abuiles.

@bartekn
Contributor

bartekn commented Nov 22, 2019

I checked other types of slow queries to make sure we do all necessary migrations in one go. I realized that the fix above is not enough because what we fix here is only one case where many rows must be scanned in a nested loop in order to check the query condition: finding successful payment operations for accounts with a large number of non-payment ops.

In the case of accounts with many ops that are part of failed txs, I think this fix won't make queries faster, because the loop checking history_transactions will have to skip a lot of them in order to find only the successful txs.

So ultimately, in order to support super fast "For Accounts" queries in history tables we need to add the following indexes:

  • CREATE UNIQUE INDEX hist_op_p_account_is_payment_is_successful_id ON history_operation_participants USING BTREE(history_account_id, is_payment, is_successful, history_operation_id);
  • CREATE UNIQUE INDEX hist_op_p_account_is_payment_id ON history_operation_participants USING BTREE(history_account_id, is_payment, history_operation_id);
  • CREATE UNIQUE INDEX hist_op_p_account_is_successful_id ON history_operation_participants USING BTREE(history_account_id, is_successful, history_operation_id);
  • CREATE UNIQUE INDEX hist_tx_p_is_successful_id ON public.history_transaction_participants USING BTREE(history_account_id, is_successful, history_transaction_id);

Here's how the queries will use them after updating them to use proper columns:

| Endpoint | Index | Use case |
| --- | --- | --- |
| /accounts/{id}/payments | hist_op_p_account_is_payment_is_successful_id | When many non-payment ops and failed txs in history |
| /accounts/{id}/payments?include_failed=true | hist_op_p_account_is_payment_id | When many non-payment ops in history |
| /accounts/{id}/operations | hist_op_p_account_is_successful_id | When many failed txs in history |
| /accounts/{id}/operations?include_failed=true | hist_op_p_id (existing) | – |
| /accounts/{id}/transactions | hist_tx_p_is_successful_id | When many failed txs in history |
| /accounts/{id}/transactions?include_failed=true | hist_tx_p_id (existing) | – |
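These indexes presuppose is_payment and is_successful columns on the participant tables. A companion migration might look roughly like this (a sketch only, mirroring the earlier is_payment steps; COALESCE(successful, true) encodes the same "successful OR successful IS NULL" condition the current queries filter on):

ALTER TABLE history_operation_participants ADD is_successful boolean DEFAULT NULL;
UPDATE history_operation_participants hopp SET is_successful = (
    SELECT COALESCE(ht.successful, true)  -- treat NULL (legacy rows) as successful
    FROM history_operations hop
    JOIN history_transactions ht ON ht.id = hop.transaction_id
    WHERE hop.id = hopp.history_operation_id
);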

Thoughts:

  • New indexes will probably take a lot of disk space.
  • We probably need to move this after mirroring is live so we can check if performance really improves and there are no other issues.
  • Maybe instead we should disable include_failed filters on some endpoints? It can be useful sometimes though.
  • Or maybe we leave it as is: if an account doesn't have an average distribution, Horizon will respond with 504s? It would be great to check how many accounts with a strange distribution there are.

@tomquisel
Contributor

Thanks Bartek! As you point out, it's pretty rough to have to add all those new indices. To me it seems like the two middle indices don't provide meaningful value beyond the top index (history_account_id, is_payment, is_successful, history_operation_id) if the Postgres query optimizer is using that index properly. Do you have reason to think the top index is insufficient?

@tomquisel
Contributor

Actually, how about partitioning on is_successful in our history tables? This seems like a really elegant use for the partitioning feature. It'd allow us to go down to just two indices in this case, and should provide speedups across the board because is_successful=false data is probably rarely accessed. It looks like migrating to that would be painful though.

@bartekn
Contributor

bartekn commented Nov 25, 2019

@tomquisel: Do you have reason to think the top index is insufficient?

According to the Postgres docs:

> A multicolumn B-tree index can be used with query conditions that involve any subset of the index's columns, but the index is most efficient when there are constraints on the leading (leftmost) columns. The exact rule is that equality constraints on leading columns, plus any inequality constraints on the first column that does not have an equality constraint, will be used to limit the portion of the index that is scanned. Constraints on columns to the right of these columns are checked in the index, so they save visits to the table proper, but they do not reduce the portion of the index that has to be scanned.

So using only the (history_account_id, is_payment, is_successful, history_operation_id) index will be faster than no index, but I'm afraid that some data distributions will make it slow for requests like is_payment=true AND is_successful=true where the number of failed transactions is large. Worth checking.

@tomquisel: Actually, how about partitioning on is_successful in our history tables?

This is interesting, but I haven't found information in the docs on whether it's possible to send queries that get data from two partitions automatically (instead of using e.g. UNION). This is needed to support ?include_failed=true queries that return both successful and failed objects. Also, it looks like it's not possible to add partitions to an existing table, so we'd need to drop and create the table (and ingest the data again):

> It is not possible to turn a regular table into a partitioned table or vice versa.

@tomquisel
Contributor

tomquisel commented Nov 25, 2019

Great points!

Given this index: (history_account_id, is_payment, is_successful, history_operation_id)

  1. I think the example of history_account_id=X AND is_payment=true AND is_successful=true will be fast since it has equality on all the leading columns.
  2. However a query like history_account_id=X AND is_successful=true will be slow, as you point out.

I'm liking the UNION approach with just the single index to handle case 2: generate an is_payment = false query and UNION it with the is_payment = true version. It seems quite desirable to save the overhead of those extra indices.
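A sketch of that, assuming the is_payment/is_successful columns discussed above (each branch has equality constraints on the index's leading columns, so both can use the single four-column index):

(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND is_payment = true AND is_successful = true
   AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
UNION ALL
(SELECT history_operation_id FROM history_operation_participants
 WHERE history_account_id = 5308433 AND is_payment = false AND is_successful = true
   AND history_operation_id < 9223372036854775807
 ORDER BY history_operation_id DESC LIMIT 50)
ORDER BY history_operation_id DESC
LIMIT 50;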

Do you see any problems with that approach in the real Horizon queries? Obviously they're more complex and involve joins that may not mix well with UNION.

As for partitions, my understanding is that queries automatically get data from all partitions without needing UNION. The example in section 5.10.4 here shows that happening by default.
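For reference, list-partitioning on is_successful would look roughly like this (a hypothetical sketch – the column list is abbreviated, and as noted above the existing table would have to be recreated and re-ingested):

CREATE TABLE history_operation_participants (
    history_operation_id bigint NOT NULL,
    history_account_id   bigint NOT NULL,
    is_successful        boolean NOT NULL
    -- remaining columns omitted in this sketch
) PARTITION BY LIST (is_successful);

CREATE TABLE history_operation_participants_successful
    PARTITION OF history_operation_participants FOR VALUES IN (true);
CREATE TABLE history_operation_participants_failed
    PARTITION OF history_operation_participants FOR VALUES IN (false);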

I'm leaning away from partitions, though. They seem to add serious maintenance overhead and constraints.

@bartekn
Contributor

bartekn commented Feb 4, 2020

Looks like Figma ran into a similar problem: https://www.figma.com/blog/post-mortem-service-disruption-on-january-21-22-2020/ They say that upgrading to Postgres 11 solved their query planner problems. I wonder if it could solve this issue too. I'm pretty confident that a schema change is the only option but… maybe.

> The upgrade to PostgreSQL 11 mitigated both of these causes: the query planner is able to take advantage of the uniqueness of the inner relation to avoid the problematic query plan, and the performance characteristics of autovacuuming have been significantly improved.

Let's try it out on the staging server first, before any code/schema changes.

@bartekn bartekn removed their assignment Mar 3, 2020
@ire-and-curses ire-and-curses removed this from the Horizon 1.1.0 milestone Apr 7, 2020
@bartekn bartekn added this to the Horizon 1.7.0 milestone Jul 14, 2020
@tamirms tamirms self-assigned this Jul 14, 2020
@bartekn bartekn added the performance issues aimed at improving performance label Aug 5, 2020
@tamirms
Contributor

tamirms commented Aug 5, 2020

I attempted the problematic payments request on the staging DB, which was created recently and has been fully ingested. Unfortunately, the request still times out on staging. So at this point we should check whether Postgres 11 would improve the query. If that doesn't work, then we can consider schema changes.

@2opremio
Contributor

@tamirms could you retest the query?
