Horizon reaper can run on multiple horizon nodes at the same time #5339

Closed
tamirms opened this issue Jun 10, 2024 · 4 comments · Fixed by #5331


@tamirms
Contributor

tamirms commented Jun 10, 2024

SDF uses a Horizon architecture with multiple ingesting nodes and multiple request-serving nodes. If reaping is enabled on more than one node, nothing prevents two Horizon nodes from reaping at the same time. Redundant reaping processes are wasteful because they repeat the same delete statements, which puts unnecessary extra load on the Horizon DB.

As a workaround, we can enable reaping on only 1 Horizon node. However, the ideal solution would be to allow multiple Horizon nodes to coordinate so that only 1 node can reap at a time. We already do something similar for ingestion (only 1 Horizon node can ingest at a time).
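To make the "only 1 node can reap at a time" idea concrete, here is a minimal sketch of one way such coordination could look. It is not the change that fixed this issue: `reapLock`, `reaper`, and `maybeReap` are hypothetical names, and an in-process mutex stands in for a lock that would really have to live in the shared Horizon DB.

```go
package main

import "sync"

// reapLock is a hypothetical stand-in for a lock shared through the Horizon DB.
// It is an in-process mutex here so the sketch runs without a database.
type reapLock struct {
	mu sync.Mutex
}

// reaper models one Horizon node that has reaping enabled.
type reaper struct {
	lock *reapLock
	runs int // number of reap cycles this node actually performed
}

// maybeReap runs one reap cycle unless another node already holds the lock.
// It reports whether this node did the reaping.
func (r *reaper) maybeReap(reap func()) bool {
	if !r.lock.mu.TryLock() {
		return false // another node is reaping; skip this cycle
	}
	defer r.lock.mu.Unlock()
	reap() // placeholder for the real delete statements
	r.runs++
	return true
}
```

With this shape, a node that loses the race does not wait; it skips the cycle and tries again on its next scheduled run, so the delete statements are never issued twice concurrently.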

@JakeUrban
Contributor

Hey Tamir, do you mind explaining how we support multiple ingestion nodes? What's the point of this, too, if only one node can ingest at a time?

@tamirms
Contributor Author

tamirms commented Jun 11, 2024

@JakeUrban the point of having multiple ingesting nodes is redundancy in case one of the ingesting nodes fails or crashes.

The coordination works like this: when ingesting a new ledger, all the ingesting nodes race to acquire a lock in the Postgres DB. Only one of the ingesting nodes is able to acquire the lock, and that node is responsible for ingesting the new ledger. Once the ledger is ingested, the lock is released. At that point the other nodes who attempted to acquire the lock realize the ledger has already been ingested and so they release the lock as well, then wait until the next ledger is emitted by the network.

@JakeUrban
Contributor

> the other nodes who attempted to acquire the lock realize the ledger has already been ingested and so they release the lock as well

Got it, how do nodes realize the ledger was ingested? Do they make a DB query to confirm?

@tamirms
Contributor Author

tamirms commented Jun 11, 2024

@JakeUrban yeah, they query the latest ledger in the DB; if it's greater than or equal to the ledger they're about to ingest, that means the ledger has already been ingested.
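The ingestion coordination described in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not Horizon's actual code: `ledgerStore`, `tryIngest`, and `raceToIngest` are hypothetical names, and an in-memory mutex plus a counter stand in for the Postgres lock and the latest-ledger query.

```go
package main

import "sync"

// ledgerStore is a hypothetical stand-in for the Horizon Postgres DB: mu plays
// the role of the DB lock and latest is the latest ingested ledger sequence.
type ledgerStore struct {
	mu     sync.Mutex
	latest uint32
}

// tryIngest is what each ingesting node does when a new ledger arrives.
// It returns true only for the node that actually ingested the ledger.
func (s *ledgerStore) tryIngest(seq uint32) bool {
	s.mu.Lock()         // all nodes race for the lock; one wins, the rest block
	defer s.mu.Unlock() // released whether or not this node ingested

	if s.latest >= seq {
		return false // latest ledger >= candidate: already ingested elsewhere
	}
	s.latest = seq // placeholder for the real ingestion work
	return true
}

// raceToIngest simulates several ingesting nodes seeing ledger seq at once and
// returns how many of them ended up ingesting it.
func raceToIngest(s *ledgerStore, nodes int, seq uint32) int {
	var wg sync.WaitGroup
	var countMu sync.Mutex
	ingested := 0
	for i := 0; i < nodes; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			if s.tryIngest(seq) {
				countMu.Lock()
				ingested++
				countMu.Unlock()
			}
		}()
	}
	wg.Wait()
	return ingested
}
```

Because the "latest ledger" check happens while the lock is held, exactly one node ingests a given ledger no matter how many nodes race for it; the rest see the updated value, release the lock, and wait for the next ledger.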

Projects
Status: Done
4 participants
@tamirms @mollykarcher @JakeUrban and others