Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Sync can 404 with "Can't find event..." when processing some invites #7843

Closed
babolivier opened this issue Jul 14, 2020 · 2 comments · Fixed by #8110
Closed

Sync can 404 with "Can't find event..." when processing some invites #7843

babolivier opened this issue Jul 14, 2020 · 2 comments · Fixed by #8110
Assignees
Labels
A-Stuck-Invite Incoming invitations that won't go away z-bug (Deprecated Label) z-p2 (Deprecated Label)

Comments

@babolivier
Copy link
Contributor

babolivier commented Jul 14, 2020

Until #6983 landed (shipped in 1.12.0 - March 23, 2020), we weren't upserting rows to the rooms table when receiving a new invite. We're now doing that because we need to know the version of the room the invite is for.

In some places of the code, including when handling sync, we make the assumption that if we have an event or an invite then we have the version of the room it's been sent into. When the user has an invite to a room that's not in the rooms table, this causes an error to be raised ("Room X for event Y is unknown") and syncs to 404 (with "Can't find event Y").

As of last Friday there were 1600 rooms with this issue on matrix.org (select count(*) from local_current_membership left join rooms using (room_id) where membership = 'invite' and rooms.room_id is null;).

An ideal solution would be a background update that inserts the missing rooms in the rooms table, however this isn't trivial because we don't know the version of these rooms. Inserting a room with version NULL means Synapse will assume it's v1, which can be wrong, as non-v1 room versions were introduced way earlier (v2 was introduced in #4307 and shipped in 0.34.1, released on Jan 9, 2019).

We could also do a similar update just for rooms which invite predates rooms v2, but that would probably be quite low impact (89 rooms on matrix.org).

A first step towards resolution could be to ensure we don't fail in that case unless it's necessary - e.g. we don't fail the entire sync response over one bogus invite.

@richvdh
Copy link
Member

richvdh commented Jul 14, 2020

I'm fairly sure that this used to work correctly, so I'd be somewhat interested to know at what point it regressed. Where exactly does the /sync handler expect that there is a rooms entry?

#6902 is related in that even if your sync returns correctly, you may not be able to reject the invite.

@clokep clokep added the z-bug (Deprecated Label) label Jul 14, 2020
@babolivier babolivier changed the title Rooms can be missing from the rooms table for old invites Sync can 404 with "Can't find event..." when processing some invites Jul 14, 2020
@richvdh richvdh added the A-Stuck-Invite Incoming invitations that won't go away label Jul 29, 2020
@neilisfragile neilisfragile added the z-p2 (Deprecated Label) label Jul 30, 2020
@richvdh
Copy link
Member

richvdh commented Aug 18, 2020

I've looked into this a little more.

First of all, it's not a recent regression, but more of an edge-case that got forgotten.

The code that relies on an entry in rooms was added in #6874 (1.12.0rc1). When I wrote that, I did consider the case of out-of-band-memberships and added a workaround (see here). However, the workaround relied on the out_of_band_membership internal metadata flag, which itself was only added in #4405 (Synapse 0.99.0).

Hence a more accurate estimate of the number of users affected is:

matrix=> select count(*) from local_current_membership lcm left join event_json ej using (event_id) left join rooms r on r.room_id = lcm.room_id where membership = 'invite' and r.room_id is null and ej.internal_metadata::json->>'out_of_band_membership' is null;
 count 
-------
    88

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Stuck-Invite Incoming invitations that won't go away z-bug (Deprecated Label) z-p2 (Deprecated Label)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants