-
-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Unified search query syntax using the full-text search capabilities of the underlying DB #11635
Unified search query syntax using the full-text search capabilities of the underlying DB #11635
Conversation
…earch_to_tsquery-for-fts
@@ -437,7 +440,7 @@ async def search_msgs(self, room_ids, search_term, keys): | |||
"SELECT room_id, count(*) as count FROM event_search" | |||
" WHERE value MATCH ?" | |||
) | |||
count_args = [search_term] + count_args |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it was a bug to pass search_term
here rather than search_query
..?
that can be passed to database. | ||
We use this so that we can add prefix matching, which isn't something | ||
that is supported by default. | ||
that can be passed to sqllite's matchinfo(). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently a no-op, but we will probably want to add stuff here to handle certain cases e.g. #3024
There's a test failing on sqlite because it relies on some sort of prefixing (searches for sqlite does actually support stemming, so I'm going to try to enable that to fix (this relies on a db migration, so might get tricky) |
…arch_to_tsquery-for-fts
@erikjohnston any thoughts on the currently open comment threads? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise I think its good
" room_id, event_id" | ||
" FROM event_search" | ||
" WHERE vector @@ to_tsquery('english', ?)" | ||
f" WHERE vector @@ {tsquery_func}('english', ?)" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move these to multiline strings while we're here 😇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was planning to do a follow-up PR to update the entire module to multi-line strings. Would that be acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure!
Synapse 1.71.0 (2022-11-08) =========================== Please note that, as announced in the release notes for Synapse 1.69.0, legacy Prometheus metric names are now disabled by default. They will be removed altogether in Synapse 1.73.0. If not already done, server administrators should update their dashboards and alerting rules to avoid using the deprecated metric names. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.71/upgrade.html#upgrading-to-v1710) for more details. **Note:** in line with our [deprecation policy](https://matrix-org.github.io/synapse/latest/deprecation_policy.html) for platform dependencies, this will be the last release to support PostgreSQL 10, which reaches upstream end-of-life on November 10th, 2022. Future releases of Synapse will require PostgreSQL 11+. Features -------- - Support back-channel logouts from OpenID Connect providers. ([\#11414](matrix-org/synapse#11414)) - Allow use of Postgres and SQLlite full-text search operators in search queries. ([\#11635](matrix-org/synapse#11635), [\#14310](matrix-org/synapse#14310), [\#14311](matrix-org/synapse#14311)) - Implement [MSC3664](matrix-org/matrix-spec-proposals#3664), Pushrules for relations. Contributed by Nico. ([\#11804](matrix-org/synapse#11804)) - Improve aesthetics of HTML templates. Note that these changes do not retroactively apply to templates which have been [customised](https://matrix-org.github.io/synapse/latest/templates.html#templates) by server admins. ([\#13652](matrix-org/synapse#13652)) - Enable write-ahead logging for SQLite installations. Contributed by [@asymmetric](https://github.com/asymmetric). ([\#13897](matrix-org/synapse#13897)) - Show erasure status when [listing users](https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html#query-user-account) in the Admin API. ([\#14205](matrix-org/synapse#14205)) - Provide a specific error code when a `/sync` request provides a filter which doesn't represent a JSON object. ([\#14262](matrix-org/synapse#14262))
Synapse 1.71.0 (2022-11-08) =========================== Please note that, as announced in the release notes for Synapse 1.69.0, legacy Prometheus metric names are now disabled by default. They will be removed altogether in Synapse 1.73.0. If not already done, server administrators should update their dashboards and alerting rules to avoid using the deprecated metric names. See the [upgrade notes](https://matrix-org.github.io/synapse/v1.71/upgrade.html#upgrading-to-v1710) for more details. **Note:** in line with our [deprecation policy](https://matrix-org.github.io/synapse/latest/deprecation_policy.html) for platform dependencies, this will be the last release to support PostgreSQL 10, which reaches upstream end-of-life on November 10th, 2022. Future releases of Synapse will require PostgreSQL 11+. No significant changes since 1.71.0rc2. Synapse 1.71.0rc2 (2022-11-04) ============================== Improved Documentation ---------------------- - Document the changes to monthly active user metrics due to deprecation of legacy Prometheus metric names. ([\matrix-org#14358](matrix-org#14358), [\matrix-org#14360](matrix-org#14360)) Deprecations and Removals ------------------------- - Disable legacy Prometheus metric names by default. They can still be re-enabled for now, but they will be removed altogether in Synapse 1.73.0. ([\matrix-org#14353](matrix-org#14353)) Internal Changes ---------------- - Run unit tests against Python 3.11. ([\matrix-org#13812](matrix-org#13812)) Synapse 1.71.0rc1 (2022-11-01) ============================== Features -------- - Support back-channel logouts from OpenID Connect providers. ([\matrix-org#11414](matrix-org#11414)) - Allow use of Postgres and SQLlite full-text search operators in search queries. ([\matrix-org#11635](matrix-org#11635), [\matrix-org#14310](matrix-org#14310), [\matrix-org#14311](matrix-org#14311)) - Implement [MSC3664](matrix-org/matrix-spec-proposals#3664), Pushrules for relations. Contributed by Nico. ([\matrix-org#11804](matrix-org#11804)) - Improve aesthetics of HTML templates. Note that these changes do not retroactively apply to templates which have been [customised](https://matrix-org.github.io/synapse/latest/templates.html#templates) by server admins. ([\matrix-org#13652](matrix-org#13652)) - Enable write-ahead logging for SQLite installations. Contributed by [@asymmetric](https://github.com/asymmetric). ([\matrix-org#13897](matrix-org#13897)) - Show erasure status when [listing users](https://matrix-org.github.io/synapse/latest/admin_api/user_admin_api.html#query-user-account) in the Admin API. ([\matrix-org#14205](matrix-org#14205)) - Provide a specific error code when a `/sync` request provides a filter which doesn't represent a JSON object. ([\matrix-org#14262](matrix-org#14262)) Bugfixes -------- - Fix a long-standing bug where the `update_synapse_database` script could not be run with multiple databases. Contributed by @thefinn93 @ Beeper. ([\matrix-org#13422](matrix-org#13422)) - Fix a bug which prevented setting an avatar on homeservers which have an explicit port in their `server_name` and have `max_avatar_size` and/or `allowed_avatar_mimetypes` configuration. Contributed by @ashfame. ([\matrix-org#13927](matrix-org#13927)) - Check appservice user interest against the local users instead of all users in the room to align with [MSC3905](matrix-org/matrix-spec-proposals#3905). ([\matrix-org#13958](matrix-org#13958)) - Fix a long-standing bug where Synapse would accidentally include extra information in the response to [`PUT /_matrix/federation/v2/invite/{roomId}/{eventId}`](https://spec.matrix.org/v1.4/server-server-api/#put_matrixfederationv2inviteroomideventid). ([\matrix-org#14064](matrix-org#14064)) - Fix a bug introduced in Synapse 1.64.0 where presence updates could be missing from `/sync` responses. ([\matrix-org#14243](matrix-org#14243)) - Fix a bug introduced in Synapse 1.60.0 which caused an error to be logged when Synapse received a SIGHUP signal if debug logging was enabled. ([\matrix-org#14258](matrix-org#14258)) - Prevent history insertion ([MSC2716](matrix-org/matrix-spec-proposals#2716)) during an partial join ([MSC3706](matrix-org/matrix-spec-proposals#3706)). ([\matrix-org#14291](matrix-org#14291)) - Fix a bug introduced in Synapse 1.34.0 where device names would be returned via a federation user key query request when `allow_device_name_lookup_over_federation` was set to `false`. ([\matrix-org#14304](matrix-org#14304)) - Fix a bug introduced in Synapse 0.34.0 where logs could include error spam when background processes are measured as taking a negative amount of time. ([\matrix-org#14323](matrix-org#14323)) - Fix a bug introduced in Synapse 1.70.0 where clients were unable to PUT new [dehydrated devices](matrix-org/matrix-spec-proposals#2697). ([\matrix-org#14336](matrix-org#14336)) Improved Documentation ---------------------- - Explain how to disable the use of [`trusted_key_servers`](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#trusted_key_servers). ([\matrix-org#13999](matrix-org#13999)) - Add workers settings to [configuration manual](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#individual-worker-configuration). ([\matrix-org#14086](matrix-org#14086)) - Correct the name of the config option [`encryption_enabled_by_default_for_room_type`](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#encryption_enabled_by_default_for_room_type). ([\matrix-org#14110](matrix-org#14110)) - Update docstrings of `SynapseError` and `FederationError` to bettter describe what they are used for and the effects of using them are. ([\matrix-org#14191](matrix-org#14191)) Internal Changes ---------------- - Remove unused `@lru_cache` decorator. ([\matrix-org#13595](matrix-org#13595)) - Save login tokens in database and prevent login token reuse. ([\matrix-org#13844](matrix-org#13844)) - Refactor OIDC tests to better mimic an actual OIDC provider. ([\matrix-org#13910](matrix-org#13910)) - Fix type annotation causing import time error in the Complement forking launcher. ([\matrix-org#14084](matrix-org#14084)) - Refactor [MSC3030](matrix-org/matrix-spec-proposals#3030) `/timestamp_to_event` endpoint to loop over federation destinations with standard pattern and error handling. ([\matrix-org#14096](matrix-org#14096)) - Add initial power level event to batch of bulk persisted events when creating a new room. ([\matrix-org#14228](matrix-org#14228)) - Refactor `/key/` endpoints to use `RestServlet` classes. ([\matrix-org#14229](matrix-org#14229)) - Switch to using the `matrix-org/backend-meta` version of `triage-incoming` for new issues in CI. ([\matrix-org#14230](matrix-org#14230)) - Build wheels on macos 11, not 10.15. ([\matrix-org#14249](matrix-org#14249)) - Add debugging to help diagnose lost device list updates. ([\matrix-org#14268](matrix-org#14268)) - Add Rust cache to CI for `trial` runs. ([\matrix-org#14287](matrix-org#14287)) - Improve type hinting of `RawHeaders`. ([\matrix-org#14303](matrix-org#14303)) - Use Poetry 1.2.0 in the Twisted Trunk CI job. ([\matrix-org#14305](matrix-org#14305)) <details> <summary>Dependency updates</summary> Runtime: - Bump anyhow from 1.0.65 to 1.0.66. ([\matrix-org#14278](matrix-org#14278)) - Bump jinja2 from 3.0.3 to 3.1.2. ([\matrix-org#14271](matrix-org#14271)) - Bump prometheus-client from 0.14.0 to 0.15.0. ([\matrix-org#14274](matrix-org#14274)) - Bump psycopg2 from 2.9.4 to 2.9.5. ([\matrix-org#14331](matrix-org#14331)) - Bump pysaml2 from 7.1.2 to 7.2.1. ([\matrix-org#14270](matrix-org#14270)) - Bump sentry-sdk from 1.5.11 to 1.10.1. ([\matrix-org#14330](matrix-org#14330)) - Bump serde from 1.0.145 to 1.0.147. ([\matrix-org#14277](matrix-org#14277)) - Bump serde_json from 1.0.86 to 1.0.87. ([\matrix-org#14279](matrix-org#14279)) Tooling and CI: - Bump black from 22.3.0 to 22.10.0. ([\matrix-org#14328](matrix-org#14328)) - Bump flake8-bugbear from 21.3.2 to 22.9.23. ([\matrix-org#14042](matrix-org#14042)) - Bump peaceiris/actions-gh-pages from 3.8.0 to 3.9.0. ([\matrix-org#14276](matrix-org#14276)) - Bump peaceiris/actions-mdbook from 1.1.14 to 1.2.0. ([\matrix-org#14275](matrix-org#14275)) - Bump setuptools-rust from 1.5.1 to 1.5.2. ([\matrix-org#14273](matrix-org#14273)) - Bump twine from 3.8.0 to 4.0.1. ([\matrix-org#14332](matrix-org#14332)) - Bump types-opentracing from 2.4.7 to 2.4.10. ([\matrix-org#14133](matrix-org#14133)) - Bump types-requests from 2.28.11 to 2.28.11.2. ([\matrix-org#14272](matrix-org#14272)) </details>
Support a unified search query syntax which leverages more of the full-text search of each DB we support.
We now support, with the same syntax across Postgresql and Sqlite:
AND
,OR
,-
(negation) set operatorsand parenthesesThis is achieved by
websearch_to_tsquery
Multiple terms separated by a space are implicitly ANDed.
Possible Issues:
NOT
,NEAR
,*
in sqlite,&
,|
in pgsql. This runs the risk that people might discover this as accidental functionality and depend on something we don't guarantee.Before
Prior to this PR, we transformed all full-text queries into a DB full-text query by
\w
)For example, the user input
foo? bar
is transformed intofoo* & bar*
. This happens for both sqlite and postgres.This had some drawbacks -
Sqlite docs
Postgres docs
And for comparison with how Element clients support search, see the Tantivy docs
Pull Request Checklist
EventStore
toEventWorkerStore
.".code blocks
.(run the linters)
Sign-off
Signed-off-by: James Salter iteration@gmail.com