Document easy room purge benefit of using (room_id, event_id)
#13771
@@ -208,10 +208,11 @@ But hash collisions are still possible, and by treating event IDs as room
 scoped, we can reduce the possibility of a hash collision. When scoping
 `event_id` in the database schema, it should be also accompanied by `room_id`
 (`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
-`(room_id, event_id)`.
+`(room_id, event_id)`. Another benefit of scoping `event_ids` to the room is
I'm not sure I follow this reasoning. Is the point that we can do this with a single `DELETE FROM ... WHERE ...` rather than having to use a subselect or similar which joins to the events table?
@DMRobertson Seems to be the case so we don't have to do this:
synapse/synapse/storage/databases/main/purge_events.py
Lines 388 to 412 in fa2f3d8
# Now we delete tables which lack an index on room_id but have one on event_id
for table in (
    "event_auth",
    "event_edges",
    "event_json",
    "event_push_actions_staging",
    "event_relations",
    "event_to_state_groups",
    "event_auth_chains",
    "event_auth_chain_to_calculate",
    "redactions",
    "rejections",
    "state_events",
):
    logger.info("[purge] removing %s from %s", room_id, table)
    txn.execute(
        """
        DELETE FROM %s WHERE event_id IN (
          SELECT event_id FROM events WHERE room_id=?
        )
        """
        % (table,),
        (room_id,),
    )
From the chapter sync, this was one of the benefits people liked about pairing up `(room_id, event_id)`.
Added a note on why this makes purging easier ⏩
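To make the difference discussed above concrete, here is a minimal sketch contrasting the two purge styles. It uses the standard-library `sqlite3` module purely so the example is self-contained; it is not Synapse's storage layer. The table names `redactions_by_event` and `redactions_by_room`, the helper functions, and the sample room and event IDs are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE events (event_id TEXT PRIMARY KEY, room_id TEXT NOT NULL);
    -- Hypothetical table keyed on event_id only.
    CREATE TABLE redactions_by_event (event_id TEXT PRIMARY KEY, redacts TEXT);
    -- Hypothetical table keyed on the (room_id, event_id) pair.
    CREATE TABLE redactions_by_room (
        room_id TEXT NOT NULL,
        event_id TEXT NOT NULL,
        redacts TEXT,
        PRIMARY KEY (room_id, event_id)
    );
    """
)

# Made-up sample data: two events in room !a, one in room !b.
rows = [("!a:example.org", "$1"), ("!a:example.org", "$2"), ("!b:example.org", "$3")]
for room_id, event_id in rows:
    conn.execute("INSERT INTO events VALUES (?, ?)", (event_id, room_id))
    conn.execute("INSERT INTO redactions_by_event VALUES (?, ?)", (event_id, "$0"))
    conn.execute(
        "INSERT INTO redactions_by_room VALUES (?, ?, ?)", (room_id, event_id, "$0")
    )


def purge_with_subselect(conn, room_id):
    # Without room_id on the table, the room's events must be looked up first.
    conn.execute(
        "DELETE FROM redactions_by_event WHERE event_id IN ("
        " SELECT event_id FROM events WHERE room_id = ?)",
        (room_id,),
    )


def purge_by_room(conn, room_id):
    # With (room_id, event_id) as the key, a single filtered DELETE is enough.
    conn.execute("DELETE FROM redactions_by_room WHERE room_id = ?", (room_id,))


purge_with_subselect(conn, "!a:example.org")
purge_by_room(conn, "!a:example.org")

remaining_by_event = conn.execute(
    "SELECT event_id FROM redactions_by_event"
).fetchall()
remaining_by_room = conn.execute(
    "SELECT room_id, event_id FROM redactions_by_room"
).fetchall()
```

Both approaches leave only the row for the other room. The second does so without touching the `events` table, and because `room_id` is the leading column of the primary key, the delete can use that key directly.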
Closing in favor of #13915
Document easy room purge benefit of using `(room_id, event_id)`
Discussed in the backend chapter sync as mentioned by @erikjohnston,
https://docs.google.com/document/d/1kmGRzPFfg_gRY6l0sxjYkSLW6UpMFn9ELQX5CtTLWlA/edit#bookmark=id.ciuq6xs2t47
Follow-up to #13701
Dev notes
Database tables which don't have a `(room_id, event_id)` index: synapse/synapse/storage/databases/main/purge_events.py, Lines 388 to 412 in fa2f3d8
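As a rough illustration of how such tables could be spotted, the sketch below lists tables that have an index leading with `event_id` but none leading with `room_id`. It is written against SQLite's `PRAGMA index_list` and `PRAGMA index_info` only so that it runs standalone. The function name and the two demo tables are hypothetical and are not part of Synapse.

```python
import sqlite3


def leading_index_columns(conn, table):
    """Return the set of columns that appear first in any index on `table`."""
    leading = set()
    # PRAGMA index_list rows start with (seq, name, unique, ...).
    for row in conn.execute(f'PRAGMA index_list("{table}")').fetchall():
        index_name = row[1]
        # PRAGMA index_info rows are (seqno, cid, name); seqno 0 is the first column.
        for seqno, _cid, column in conn.execute(
            f'PRAGMA index_info("{index_name}")'
        ).fetchall():
            if seqno == 0:
                leading.add(column)
    return leading


def tables_indexed_by_event_id_only(conn):
    """Tables with an index leading on event_id but none leading on room_id."""
    names = [
        r[0]
        for r in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    found = []
    for name in names:
        leading = leading_index_columns(conn, name)
        if "event_id" in leading and "room_id" not in leading:
            found.append(name)
    return found


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical demo tables, not Synapse's real schema.
    CREATE TABLE event_json_demo (event_id TEXT PRIMARY KEY, room_id TEXT, json TEXT);
    CREATE TABLE scoped_demo (
        room_id TEXT NOT NULL,
        event_id TEXT NOT NULL,
        PRIMARY KEY (room_id, event_id)
    );
    """
)
result = tables_indexed_by_event_id_only(conn)
```

Here `event_json_demo` is reported because its only index leads with `event_id`, while `scoped_demo` is not because its primary key leads with `room_id`.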
Pull Request Checklist
Pull request includes a sign off
Code style is correct (run the linters)