Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Faster joins: improve handling of race where we can fail to persist a client event with partial state after _sync_partial_state_room clears the partial state flag for a room #13173

Closed
Tracked by #14030
squahtx opened this issue Jul 4, 2022 · 2 comments · Fixed by #14665
Labels
A-Federated-Join joins over federation generally suck S-Minor Blocks non-critical functionality, workarounds exist. T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.

Comments

@squahtx
Copy link
Contributor

squahtx commented Jul 4, 2022

Follow on from #12988.

Currently, when we encounter the race in #12988, we send a 429 Too Many Requests back to clients with the expectation that a retry happens.

@reivilibre notes that some clients may require retries to be manually initiated, which is not good UX.

We can improve things by doing some of the retries internally in synapse.


A handy plot of code paths which create client events:
graphviz
The 429 gets raised in EventCreationHandler.handle_new_client_event. Event contexts are computed in orange methods.

@squahtx squahtx added the S-Minor Blocks non-critical functionality, workarounds exist. label Jul 4, 2022
@squahtx squahtx added this to the Faster joins (further work) milestone Jul 4, 2022
@squahtx squahtx added A-Federated-Join joins over federation generally suck T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. labels Jul 12, 2022
@squahtx
Copy link
Contributor Author

squahtx commented Jul 20, 2022

@reivilibre spotted an occurrence of the race during a complement run: https://github.com/matrix-org/synapse/runs/7397381295?check_suite_focus=true#step:4:2049
This one looks like it's on the _local_membership_update path

@reivilibre
Copy link
Contributor

If we want to just make Complement retry the request, this will do it: matrix-org/complement#414

I don't know if that's something we actually want to do or not?

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Federated-Join joins over federation generally suck S-Minor Blocks non-critical functionality, workarounds exist. T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants