Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Document easy room purge benefit of using (room_id, event_id) #13771

Conversation

MadLittleMods
Copy link
Contributor

@MadLittleMods MadLittleMods commented Sep 9, 2022

Document easy room purge benefit of using (room_id, event_id)

Discussed in the backend chapter sync as mentioned by @erikjohnston,
https://docs.google.com/document/d/1kmGRzPFfg_gRY6l0sxjYkSLW6UpMFn9ELQX5CtTLWlA/edit#bookmark=id.ciuq6xs2t47

Follow-up to #13701

Dev notes

Database tables which don't have a (room_id, event_id) index,

# Now we delete tables which lack an index on room_id but have one on event_id
for table in (
"event_auth",
"event_edges",
"event_json",
"event_push_actions_staging",
"event_relations",
"event_to_state_groups",
"event_auth_chains",
"event_auth_chain_to_calculate",
"redactions",
"rejections",
"state_events",
):
logger.info("[purge] removing %s from %s", room_id, table)
txn.execute(
"""
DELETE FROM %s WHERE event_id IN (
SELECT event_id FROM events WHERE room_id=?
)
"""
% (table,),
(room_id,),
)

Pull Request Checklist

  • Pull request is based on the develop branch
  • Pull request includes a changelog file. The entry should:
    • Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
    • Use markdown where necessary, mostly for code blocks.
    • End with either a period (.) or an exclamation mark (!).
    • Start with a capital letter.
    • Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
  • Pull request includes a sign off
  • Code style is correct
    (run the linters)

@MadLittleMods MadLittleMods added T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. A-Docs things relating to the documentation T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. and removed T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. labels Sep 9, 2022
@MadLittleMods MadLittleMods marked this pull request as ready for review September 9, 2022 19:54
@MadLittleMods MadLittleMods requested a review from a team as a code owner September 9, 2022 19:54
@@ -208,10 +208,11 @@ But hash collisions are still possible, and by treating event IDs as room
scoped, we can reduce the possibility of a hash collision. When scoping
`event_id` in the database schema, it should be also accompanied by `room_id`
(`PRIMARY KEY (room_id, event_id)`) and lookups should be done through the pair
`(room_id, event_id)`.
`(room_id, event_id)`. Another benefit of scoping `event_ids` to the room is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow this reasoning. Is the point that we can do this with a single DELETE FROM ... WHERE ... rather than having to use a subselect or similar which joins to the events table?

Copy link
Contributor Author

@MadLittleMods MadLittleMods Sep 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DMRobertson Seems to be the case so we don't have to do this:

# Now we delete tables which lack an index on room_id but have one on event_id
for table in (
"event_auth",
"event_edges",
"event_json",
"event_push_actions_staging",
"event_relations",
"event_to_state_groups",
"event_auth_chains",
"event_auth_chain_to_calculate",
"redactions",
"rejections",
"state_events",
):
logger.info("[purge] removing %s from %s", room_id, table)
txn.execute(
"""
DELETE FROM %s WHERE event_id IN (
SELECT event_id FROM events WHERE room_id=?
)
"""
% (table,),
(room_id,),
)

From the chapter sync, it was one of the useful benefits people liked from pairing up (room_id, event_id).

Added a note on this part of why it's easier ⏩

@MadLittleMods
Copy link
Contributor Author

Closing in favor of #13915

richvdh pushed a commit that referenced this pull request Sep 27, 2022
…13915)

* Emphasize the right reasons to use (room_id, event_id)

Follow-up to:
 - #13701
 - #13771
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Docs things relating to the documentation T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants