Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Remove database differentiation in _get_state_groups_from_groups_txn - SQLite supports recursive queries #14531

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Nov 23, 2022

Remove database differentiation (SQLite vs Postgres specific code) in _get_state_groups_from_groups_txn (_get_state_groups_from_groups). We can use recursive queries in our supported SQLite version range

Follow-up to #14527

Dev notes

Testing queries against the matrix.org database: https://gitlab.matrix.org/new-vector/internal/-/wikis/Matrix.org-Synapse-ops#access-matrixorg-synapses-database

Query test dummy data
CREATE TABLE state_groups_state (
    state_group BIGINT NOT NULL,
    room_id TEXT NOT NULL,
    type TEXT NOT NULL,
    state_key TEXT NOT NULL,
    event_id TEXT NOT NULL
);
INSERT INTO state_groups_state
	(state_group, room_id, type, state_key, event_id)
VALUES
  (1, 'room1', 'm.room.create', '', '$abc-create'),
  (1, 'room1', 'm.room.member', '@madlittlemods:matrix.org', '$abc-member'),
  (1, 'room1', 'm.room.history_visibility', '', '$abc-history'),
  (2, 'room1', 'm.room.member', '@madlittlemods:matrix.org', '$abc-member2'),
  (2, 'room1', 'm.room.history_visibility', '', '$abc-history2'),
  (3, 'room1', 'm.room.member', '@madlittlemods:matrix.org', '$abc-member3'),
  (3, 'room1', 'm.room.history_visibility', '', '$abc-history3');

CREATE TABLE state_group_edges (
    state_group BIGINT NOT NULL,
    prev_state_group BIGINT NOT NULL
);

INSERT INTO state_group_edges
	(state_group, prev_state_group)
VALUES
  (3, 2),
  (2, 1);

Postgres vs SQLite:

Recursive queries work just fine

Recursive queries are supported in "SQLite 3.8.3 or higher". Our minimum supported SQLite version is 3.27.0

SQLite doesn't support ?::bigint

Solution: Use cast(? as bigint) ✔️

SQLite doesn't support SELECT DISTINCT ON

These nice SELECT DISTINCT ON (a, b) a, b, c queries that work in postgres are not supported in SQLite which make our lives difficult.

There are many context-specific ways to workaround/ignore these problems. One way is just to not get the distinct results in favor of sorting them and then only care about the first row result. But this means you're transporting a lot of duplicate pairs from the database back to your app just to ignore.

If you're lucky, maybe you can use GROUP BY if the columns you group by are the same ones you want to select (doesn't work in our case since we want to group by (type, state_key) but select type, state_key, event_id.

Or maybe you want to complicate things with a bunch of sub-queries.

Other references:

SQLite doesn't support parenthesis around UNION subqueries which also means no per-subquery ORDER/LIMIT

SQLite doesn't like parenthesis around each clause like (select_clause_A) UNION (select_clause_B) -> sqlite3.OperationalError: near "(": syntax error

If you're just doing a SELECT without ORDER/LIMIT, then you can easily just remove the parenthesis grouping. But you're kinda stuck if you wanted the per-subquery ORDER/LIMIT. You can ORDER/LIMIT at the end after the UNION but it's not the same.

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

@MadLittleMods MadLittleMods added T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db labels Nov 23, 2022
# should be the same.
sql = """
WITH RECURSIVE sgs(state_group) AS (
VALUES(CAST(? AS bigint))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only change to this code here is changing the cast to be compatible between SQLite and Postgres.

SQLite chokes on the previous ?::bigint syntax

typ, state_key, event_id = row
key = (intern_string(typ), intern_string(state_key))
results[group][key] = event_id
overall_select_clause = " UNION ".join(select_clause_list)
else:
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The diff is a bit weird to follow but we're just removing the whole SQLite specific else and unindenting the Postgres one

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -98,153 +98,100 @@ def _get_state_groups_from_groups_txn(
# a temporary hack until we can add the right indices in
txn.execute("SET LOCAL enable_seqscan=off")
Copy link
Contributor Author

@MadLittleMods MadLittleMods Nov 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This setting is Postgres specific so it's the only thing left that is gated.

Do we have to worry about this in SQLite? I don't see a similar setting for SQLite

I think we have the correct indexes in place and this is just a hint to use those indexes.

@DMRobertson
Copy link
Contributor

Is this worth resurrecting?

else:
max_entries_returned = state_filter.max_entries_returned()

where_clause, where_args = state_filter.make_sql_filter_clause()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this worth resurrecting?

@DMRobertson It seems possible to accomplish but there are a few more hurdles. If you're keen to take it over, feel free. I won't be looking at it until I get more backend facing again.

See #14531 (comment)

This does mean that I had to leave one SQLite specific query in place.
But at least we're using the same recursive query pattern now.
Comment on lines +150 to +161
# Small helper function to wrap the union clause in parenthesis if we're
# using postgres. This is because SQLite doesn't allow `LIMIT`/`ORDER`
# clauses in the union subquery but postgres does as long as they are
# wrapped in parenthesis which this function handles the complexity of
# handling.
def wrap_union_if_postgres(
union_clause: str, extra_order_or_limit_clause: str = ""
) -> str:
if isinstance(self.database_engine, PostgresEngine):
return f"""({union_clause} {extra_order_or_limit_clause})"""

return union_clause
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better just to define this function somewhere above once but thought it lost the specific usage context when we put it somewhere else. Feel free to poke ⏩

Comment on lines +216 to +226
# SQLite doesn't support `SELECT DISTINCT ON`, so we have to just get
# some potential duplicate (type, state_key) pairs and then only use the
# first of each kind we see.
overall_select_clause = f"""
SELECT type, state_key, event_id
FROM state_groups_state
WHERE state_group IN (
SELECT state_group FROM sgs
) {where_clause}
ORDER BY type, state_key, state_group DESC
"""
Copy link
Contributor Author

@MadLittleMods MadLittleMods May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This query is exactly the same as the Postgres version above except for the DISTINCT ON (type, state_key) part at the beginning.

Any interest in sharing the bulk of the query here? Not sure it would be more clear

Comment on lines -229 to -239
# If the number of entries in the (type,state_key)->event_id dict
# matches the number of (type,state_keys) types we were searching
# for, then we must have found them all, so no need to go walk
# further down the tree... UNLESS our types filter contained
# wildcards (i.e. Nones) in which case we have to do an exhaustive
# search
if (
max_entries_returned is not None
and len(results[group]) == max_entries_returned
):
break
Copy link
Contributor Author

@MadLittleMods MadLittleMods May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the current consolidation, we're losing this shortcut when running with SQLite (and the shortcut is probably pretty useful) 😬

And as mentioned in another comment, also conscious that some of these subtle hacks to consolidate might not be better than the big if-else distinction between Postgres and SQLite we had before 😬

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Closing as I'm not convinced this PR is better now that we've reached this point where things work. It was looking very promising before these latest changes necessitated by SQLite not supporting SELECT DISTINCT ON syntax.

It is less code especially considering comments added but wouldn't consider it easier to reason about and we lose the SQLite optimization mentioned above.

Comment on lines -155 to -162
(
SELECT DISTINCT ON (type, state_key)
type, state_key, event_id
FROM state_groups_state
INNER JOIN sgs USING (state_group)
WHERE {where_clause}
ORDER BY type, state_key, state_group DESC
)
Copy link
Contributor Author

@MadLittleMods MadLittleMods May 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One improvement that I thought I could take out of this was simplifying these conditions like the following but it seems to perform about the same (at least in my one example I use against matrix.org) -> https://explain.depesz.com/s/pWH9

Suggested change
(
SELECT DISTINCT ON (type, state_key)
type, state_key, event_id
FROM state_groups_state
INNER JOIN sgs USING (state_group)
WHERE {where_clause}
ORDER BY type, state_key, state_group DESC
)
(
SELECT type, state_key, event_id
FROM state_groups_state
INNER JOIN sgs USING (state_group)
WHERE {where_clause}
ORDER BY type, state_key, state_group DESC
LIMIT 1
)
Raw query and EXPLAIN ANALYZE
EXPLAIN ANALYZE WITH RECURSIVE sgs(state_group) AS (
    VALUES(739988088::bigint)
    UNION ALL
    SELECT prev_state_group FROM state_group_edges e, sgs s
    WHERE s.state_group = e.state_group
)
(
    SELECT type, state_key, event_id, state_group
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.history_visibility' AND state_key = '')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
)
 UNION 
(
    SELECT type, state_key, event_id, state_group
    FROM state_groups_state
    INNER JOIN sgs USING (state_group)
    WHERE (type = 'm.room.member' AND state_key = '@madlittlemods:matrix.org')
    ORDER BY type, state_key, state_group DESC
    LIMIT 1
);
                                                                                                QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=1272.13..1272.15 rows=2 width=104) (actual time=1.707..1.717 rows=2 loops=1)
   CTE sgs
     ->  Recursive Union  (cost=0.00..284.28 rows=101 width=8) (actual time=0.004..0.504 rows=59 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=1)
           ->  Nested Loop  (cost=0.57..28.23 rows=10 width=8) (actual time=0.007..0.007 rows=1 loops=59)
                 ->  WorkTable Scan on sgs s  (cost=0.00..0.20 rows=10 width=8) (actual time=0.000..0.000 rows=1 loops=59)
                 ->  Index Only Scan using state_group_edges_unique_idx on state_group_edges e  (cost=0.57..2.79 rows=1 width=16) (actual time=0.006..0.006 rows=1 loops=59)
                       Index Cond: (state_group = s.state_group)
                       Heap Fetches: 0
   ->  Sort  (cost=987.85..987.85 rows=2 width=104) (actual time=1.705..1.708 rows=2 loops=1)
         Sort Key: state_groups_state.type, state_groups_state.state_key, state_groups_state.event_id, state_groups_state.state_group
         Sort Method: quicksort  Memory: 25kB
         ->  Append  (cost=493.90..987.84 rows=2 width=104) (actual time=1.136..1.545 rows=2 loops=1)
               ->  Limit  (cost=493.90..493.90 rows=1 width=91) (actual time=1.134..1.136 rows=1 loops=1)
                     ->  Sort  (cost=493.90..493.90 rows=1 width=91) (actual time=1.133..1.134 rows=1 loops=1)
                           Sort Key: state_groups_state.state_group DESC
                           Sort Method: quicksort  Memory: 25kB
                           ->  Nested Loop  (cost=0.84..493.89 rows=1 width=91) (actual time=1.111..1.122 rows=1 loops=1)
                                 ->  CTE Scan on sgs  (cost=0.00..2.02 rows=101 width=8) (actual time=0.007..0.532 rows=59 loops=1)
                                 ->  Index Scan using state_groups_state_type_idx on state_groups_state  (cost=0.84..4.86 rows=1 width=91) (actual time=0.009..0.009 rows=0 loops=59)
                                       Index Cond: ((state_group = sgs.state_group) AND (type = 'm.room.history_visibility'::text) AND (state_key = ''::text))
               ->  Limit  (cost=493.90..493.90 rows=1 width=91) (actual time=0.405..0.406 rows=1 loops=1)
                     ->  Sort  (cost=493.90..493.90 rows=1 width=91) (actual time=0.404..0.405 rows=1 loops=1)
                           Sort Key: state_groups_state_1.state_group DESC
                           Sort Method: quicksort  Memory: 25kB
                           ->  Nested Loop  (cost=0.84..493.89 rows=1 width=91) (actual time=0.398..0.399 rows=1 loops=1)
                                 ->  CTE Scan on sgs sgs_1  (cost=0.00..2.02 rows=101 width=8) (actual time=0.000..0.013 rows=59 loops=1)
                                 ->  Index Scan using state_groups_state_type_idx on state_groups_state state_groups_state_1  (cost=0.84..4.86 rows=1 width=91) (actual time=0.006..0.006 rows=0 loops=59)
                                       Index Cond: ((state_group = sgs_1.state_group) AND (type = 'm.room.member'::text) AND (state_key = '@madlittlemods:matrix.org'::text))
 Planning Time: 7.594 ms
 Execution Time: 3.023 ms
(31 rows)

Compare to the after query performance from the previous PR, #14527 -> https://explain.depesz.com/s/Qi4A

Copy link
Contributor Author

@MadLittleMods MadLittleMods May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I already explored this in #14494 (comment) (see Other exploration section) and came to the same conclusion

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants