Make the event `txnId` mapping rely on the `device_id` instead of the `access_token_id`. #13083

sandhose · 2022-06-16T13:19:18Z

This changes the event txnId mapping to rely on the device_id instead of the access_token_id.
Note that there are access tokens created via the admin API which don't have a `device_id`. Those tokens would lose:

- the `txnId` echo in `/sync` when they are sending an event to a room
- idempotency when sending an event to a room (in case of a network failure)

Note that I decided to keep the `token_id` column in the `event_txn_id` table (and to keep writing to it) to allow rolling back, but I'm no longer reading from it.
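As a rough illustration of the behaviour described above, here is a minimal sketch of a transaction-ID mapping keyed on the device rather than the access token. `TxnIdStore` and its methods are hypothetical names and an in-memory stand-in for the `event_txn_id` table, not Synapse's actual storage code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical key shape: (room_id, user_id, device_id, txn_id)
TxnKey = Tuple[str, str, str, str]


class TxnIdStore:
    """In-memory stand-in for the event_txn_id table (illustration only)."""

    def __init__(self) -> None:
        self._event_by_key: Dict[TxnKey, str] = {}

    def record(
        self,
        room_id: str,
        user_id: str,
        device_id: Optional[str],
        txn_id: str,
        event_id: str,
    ) -> None:
        # Tokens without a device (e.g. created via the admin API) have
        # nothing to key on, so nothing is recorded for them.
        if device_id is None:
            return
        self._event_by_key[(room_id, user_id, device_id, txn_id)] = event_id

    def lookup(
        self, room_id: str, user_id: str, device_id: Optional[str], txn_id: str
    ) -> Optional[str]:
        if device_id is None:
            return None
        return self._event_by_key.get((room_id, user_id, device_id, txn_id))


store = TxnIdStore()

# A device-bound send is remembered, so a retry resolves to the same event
# even if the access token was refreshed in between.
store.record("!room:hs", "@alice:hs", "DEVICE1", "txn-1", "$event_a")
retry_event = store.lookup("!room:hs", "@alice:hs", "DEVICE1", "txn-1")

# A device-less token loses idempotency: the retry finds nothing.
store.record("!room:hs", "@alice:hs", None, "txn-2", "$event_b")
admin_retry = store.lookup("!room:hs", "@alice:hs", None, "txn-2")
```

The access token does not appear in the key at all, which is why a refreshed token on the same device still matches.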

Pull Request Checklist

Pull request is based on the develop branch
Pull request includes a changelog file. The entry should:
- Be a short description of your change which makes sense to users. "Fixed a bug that prevented receiving messages from other servers." instead of "Moved X method from EventStore to EventWorkerStore.".
- Use markdown where necessary, mostly for code blocks.
- End with either a period (.) or an exclamation mark (!).
- Start with a capital letter.
- Feel free to credit yourself, by adding a sentence "Contributed by @github_username." or "Contributed by [Your Name]." to the end of the entry.
Pull request includes a sign off
Code style is correct (run the linters)

reivilibre · 2022-06-16T14:03:00Z

Some brief thoughts.

> Note that I decided to keep the token_id (and still writing to it) in the event_txn_id table to allow rolling back, but I'm not reading on it.

I think there's some fiddling of SCHEMA_VERSION/SCHEMA_COMPAT_VERSION to be done (and adding to the comments around those) when you make a column 'written but not read'.
If you can't figure it out from the example, let me/someone know (and I will try to figure out how that works again).
https://matrix-org.github.io/synapse/latest/development/database_schema.html#synapse-schema-versions also describes this process.

Otherwise:

- handling of access tokens without device ID: I think I would rather uphold idempotency and the `/sync` feedback, even if that means we have to pretend all no-device-ID 'devices' are the same. Happy to hear comments from others.
- This should be specified. I believe you've already found that the spec contradicts this change (:sob:); whether that can be classed as a spec bug or needs an MSC, I don't know.
- What should happen when you delete a device and create a new one with the same device ID? (Or alternatively: replace a device by logging in with the same ID?)
  - I would find it surprising if that means you inherit the pool of 'consumed' transaction IDs. Am I wrong to be surprised?
  - Regardless of this, I think it would be good to have this specced, since it's an easy disagreement that homeserver implementations could have and that could also lead to subtle bugs. (Many clients use a timestamp in their ID and wouldn't care, but not all of them do this.)
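To make the two questions above concrete, here is a small sketch under stated assumptions: the `NO_DEVICE` sentinel, the `consumed` mapping and the `send` helper are all hypothetical, and the room ID is left out of the key for brevity. It shows what a purely device-ID-keyed mapping would do, not what Synapse or the spec requires.

```python
from typing import Dict, Optional, Tuple

# Hypothetical sentinel: every token without a device ID shares one pool.
NO_DEVICE = ""

# (user_id, device_key, txn_id) -> event_id
consumed: Dict[Tuple[str, str, str], str] = {}


def send(user_id: str, device_id: Optional[str], txn_id: str, new_event_id: str) -> str:
    """Return the event ID for this send, reusing an earlier one if the
    transaction ID has already been consumed for this (user, device)."""
    device_key = device_id if device_id is not None else NO_DEVICE
    return consumed.setdefault((user_id, device_key, txn_id), new_event_id)


first = send("@alice:hs", "DEVICE1", "m1", "$a")

# The device is deleted and a new login reuses the ID "DEVICE1". Nothing in
# the mapping knows about that, so the new login inherits the consumed pool.
after_relogin = send("@alice:hs", "DEVICE1", "m1", "$b")

# Two unrelated device-less tokens collide with each other in the same way.
admin_one = send("@alice:hs", None, "m1", "$c")
admin_two = send("@alice:hs", None, "m1", "$d")
```

Avoiding the inheritance would need extra state, for example clearing a device's rows when the device is deleted; whether that is desirable is exactly the open question.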

richvdh

As a general theme: please make sure you include clear comments about what is going on. People shouldn't have to have an extensive knowledge of the entire Synapse codebase to understand what individual bits of code are for.

richvdh · 2022-06-17T11:22:31Z

synapse/events/__init__.py

+    device_id: DictProperty[str] = DictProperty("device_id")
+    token_id: DictProperty[Optional[int]] = DefaultDictProperty("token_id", None)


comments please. What do these fields mean? What does it mean for them to be absent, or None?

richvdh · 2022-06-17T11:23:21Z

synapse/events/__init__.py

+    device_id: DictProperty[str] = DictProperty("device_id")
+    token_id: DictProperty[Optional[int]] = DefaultDictProperty("token_id", None)


why do we use a `DefaultDictProperty` for `token_id` when we do not do so for any of the other properties (all of which are optional)?

richvdh · 2022-06-17T11:25:54Z

synapse/events/utils.py

+    if config.device_id is not None and config.device_id == getattr(
+        e.internal_metadata, "device_id", None
+    ):


comments here would be helpful. Why do we need to getattr rather than just read e.internal_metadata.device_id directly?

(just because the code was under-commented before doesn't mean you need to maintain that! What is this block actually doing?)
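One possible answer to the `getattr` question, sketched with simplified stand-ins: the `DictProperty` and `InternalMetadata` classes and the `txn_id_to_echo` helper below are illustrative, not the real Synapse classes. The sketch assumes that reading a property whose key was never set raises `AttributeError`, which is what the `getattr(..., None)` default guards against for events that were not sent by a local device.

```python
from typing import Any, Dict, Optional


class DictProperty:
    """Simplified stand-in: exposes one key of ``instance._dict`` as an attribute."""

    def __init__(self, key: str) -> None:
        self.key = key

    def __get__(self, instance: Any, owner: Optional[type] = None) -> Any:
        if instance is None:
            return self
        try:
            return instance._dict[self.key]
        except KeyError:
            # An absent key surfaces as AttributeError, so callers can use
            # getattr(obj, name, default) to treat it as "not set".
            raise AttributeError(
                f"'{type(instance).__name__}' has no attribute '{self.key}'"
            ) from None


class InternalMetadata:
    device_id = DictProperty("device_id")
    txn_id = DictProperty("txn_id")

    def __init__(self, values: Dict[str, Any]) -> None:
        self._dict = dict(values)


def txn_id_to_echo(
    requester_device_id: Optional[str], metadata: InternalMetadata
) -> Optional[str]:
    """Return the transaction ID to echo back, but only to the device that sent the event."""
    sender_device_id = getattr(metadata, "device_id", None)
    if requester_device_id is not None and requester_device_id == sender_device_id:
        return getattr(metadata, "txn_id", None)
    return None


local_event = InternalMetadata({"device_id": "DEVICE1", "txn_id": "m1"})
remote_event = InternalMetadata({})  # e.g. received over federation: no device_id
```

With these stand-ins, `txn_id_to_echo("DEVICE1", local_event)` gives back `"m1"`, while a different device, a device-less requester, or an event without a recorded `device_id` all get `None`.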

richvdh · 2022-06-17T11:33:07Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.postgres

+
+DROP INDEX event_txn_id_txn_id;
+CREATE UNIQUE INDEX event_txn_id_txn_id ON event_txn_id (room_id, user_id, device_id, txn_id);
+-- Keep this (non-unique) index in case we're rolling back


again, what does "in case we're rolling back" mean?

richvdh · 2022-06-17T11:34:31Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.postgres

+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+


more comments please. What is this delta doing? why?

(see synapse/synapse/storage/schema/main/delta/71/01rebuild_event_edges.sql.postgres, line 16 in ba03e66:

-- We're going to stop populating event_edges.room_id and event_edges.is_state,

for example, though I now wonder why I used `--` comments rather than a `/*` one.)

richvdh · 2022-06-17T11:38:49Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.postgres

+-- Keep this (non-unique) index in case we're rolling back
+CREATE INDEX event_txn_id_txn_id_token_id ON event_txn_id (room_id, user_id, token_id, txn_id);
+
+-- Make the device_id NOT NULL and remove the NOT NULL constraint from the token_id


can we have a foreign key constraint on devices ?

richvdh · 2022-06-17T11:39:10Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.sqlite

+DROP INDEX event_txn_id_event_id;
+DROP INDEX event_txn_id_txn_id;
+DROP INDEX event_txn_id_ts;


I think this will happen automatically when you drop the table?
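A quick way to check this for SQLite is sketched below using Python's `sqlite3` module and a cut-down table. The two-column schema is an assumption made for brevity, not the real `event_txn_id` definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE event_txn_id (event_id TEXT NOT NULL, txn_id TEXT NOT NULL);
    CREATE UNIQUE INDEX event_txn_id_event_id ON event_txn_id (event_id);
    """
)


def index_names(db: sqlite3.Connection) -> list:
    """List the names of all indexes currently defined in the database."""
    rows = db.execute("SELECT name FROM sqlite_master WHERE type = 'index'").fetchall()
    return sorted(name for (name,) in rows)


before = index_names(conn)

# Dropping the table also removes the indexes that belong to it, so explicit
# DROP INDEX statements beforehand are redundant.
conn.execute("DROP TABLE event_txn_id")
after = index_names(conn)
```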

richvdh · 2022-06-17T11:39:31Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.sqlite

+CREATE UNIQUE INDEX event_txn_id_event_id ON event_txn_id2(event_id);
+CREATE UNIQUE INDEX event_txn_id_txn_id ON event_txn_id2(room_id, user_id, device_id, txn_id);
+-- Keep this index in case we're rolling back
+CREATE INDEX event_txn_id_txn_id_token_id ON event_txn_id2(room_id, user_id, token_id, txn_id);


it's normally quicker to create the index once all the inserts are done.
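The ordering suggested here could look roughly like the sketch below, run through Python's `sqlite3` module. The column types, the simplified `access_tokens` table and the join used to find each row's `device_id` are assumptions for illustration and are not taken from the actual delta file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified, assumed versions of the existing tables plus sample rows.
conn.executescript(
    """
    CREATE TABLE access_tokens (id BIGINT PRIMARY KEY, device_id TEXT);
    INSERT INTO access_tokens VALUES (1, 'DEVICE1');
    INSERT INTO access_tokens VALUES (2, NULL);

    CREATE TABLE event_txn_id (
        event_id TEXT NOT NULL, room_id TEXT NOT NULL, user_id TEXT NOT NULL,
        token_id BIGINT NOT NULL, txn_id TEXT NOT NULL
    );
    INSERT INTO event_txn_id VALUES ('$a', '!r:hs', '@alice:hs', 1, 'm1');
    INSERT INTO event_txn_id VALUES ('$b', '!r:hs', '@alice:hs', 2, 'm2');
    """
)

conn.executescript(
    """
    -- 1. Create the replacement table with no indexes yet.
    CREATE TABLE event_txn_id2 (
        event_id TEXT NOT NULL, room_id TEXT NOT NULL, user_id TEXT NOT NULL,
        device_id TEXT NOT NULL, token_id BIGINT, txn_id TEXT NOT NULL
    );

    -- 2. Bulk-copy the rows, skipping tokens that have no device.
    INSERT INTO event_txn_id2 (event_id, room_id, user_id, device_id, token_id, txn_id)
        SELECT e.event_id, e.room_id, e.user_id, t.device_id, e.token_id, e.txn_id
        FROM event_txn_id AS e
        INNER JOIN access_tokens AS t ON t.id = e.token_id
        WHERE t.device_id IS NOT NULL;

    -- 3. Swap the tables; the old table's indexes go away with it.
    DROP TABLE event_txn_id;
    ALTER TABLE event_txn_id2 RENAME TO event_txn_id;

    -- 4. Build the indexes once, after all the inserts are done.
    CREATE UNIQUE INDEX event_txn_id_event_id ON event_txn_id (event_id);
    CREATE UNIQUE INDEX event_txn_id_txn_id
        ON event_txn_id (room_id, user_id, device_id, txn_id);
    """
)

rows = conn.execute("SELECT device_id, txn_id FROM event_txn_id").fetchall()
index_count = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE type = 'index' AND tbl_name = 'event_txn_id'"
).fetchone()[0]
```

Building each index in one pass over the finished table avoids updating it on every inserted row, which is the saving the comment refers to.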

richvdh · 2022-06-17T11:40:25Z

synapse/storage/schema/main/delta/72/01event_txn_device_id.sql.sqlite

+ * limitations under the License.
+ */
+
+CREATE TABLE event_txn_id2 (


we seem to be missing the foreign key constraints here?

clokep · 2023-01-25T19:33:38Z

@sandhose Is this still needed? Looks like there's some outstanding feedback.

This adds two tests, which check the current spec behaviour of transaction IDs, which is that they are scoped to a series of access tokens, and not to the device ID. The first test highlights this behaviour by logging in with refresh tokens enabled, sending an event, using the refresh token and syncing with the new access token. On the sync, the transaction ID should be there, but currently in Synapse it is not. The second test highlights that the transaction ID is not scoped to the device ID, by logging in twice with the same device ID, sending an event with the first access token, and syncing with the second access token. In that case, the sync should not contain the transaction ID, but I think it does in HS implementations which use the device ID to scope the transaction IDs, like Conduit. Related: matrix-org/matrix-spec#1133, matrix-org/matrix-spec#1236, matrix-org/synapse#13064 and matrix-org/synapse#13083


sandhose · 2023-03-24T15:58:30Z

Replaced by #15318


sandhose requested a review from a team as a code owner June 16, 2022 13:19

sandhose mentioned this pull request Jun 16, 2022

Scope transaction IDs to the (user, device_id) instead of the access token matrix-org/matrix-spec#1133

Closed

richvdh suggested changes Jun 17, 2022


sandhose mentioned this pull request Feb 15, 2023

Test the scope of a transaction IDs matrix-org/complement#613

Closed

DMRobertson added the X-Awaiting-Changes label (a contributed PR which needs changes and re-review before it can be merged) Feb 16, 2023

hughns mentioned this pull request Feb 24, 2023

MSC3970: Scope transaction IDs to devices matrix-org/matrix-spec-proposals#3970

Merged

Make the event txnId mapping rely on the device_id instead of the…

bb4b546

… `access_token_id`. Signed-off-by: Quentin Gliech <quenting@element.io>

sandhose force-pushed the quenting/no-more-token-id/message-txn branch from ba03e66 to bb4b546 Compare February 27, 2023 17:02

sandhose mentioned this pull request Mar 24, 2023

Experimental support for MSC3970: per-device transaction IDs #15318

Merged

sandhose closed this Mar 24, 2023
