Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(catalog): consistent ordering of catalog operations #25690

Merged
merged 3 commits into from
Dec 20, 2024

Conversation

jacksonrnewhouse
Copy link
Contributor

Closes #25667

In looking at addressing #25667 I found a class of race conditions related to inconsistencies between the ordering of catalog operations against the in-memory Catalog versus how they were written to the WAL. For example, if write calls A and B are both writing to table foo, one might create the table while the other adds some field. If the order of the WAL calls is different than the order of the original invocations the system will fail on restoration, as it'll receive a field addition for a table it hasn't yet created.

The new approach is for the WalOps to be take an OrderedCatalogBatch, which will have been produced by calling apply_catalog_batch. We then sort the WalOps by a newly defined order, putting the catalog ops first, sorted by catalog_sequence_number.

Following on the work in #25642, I've removed the dedicated methods add_meta_cache(), remove_meta_cache(), add_last_cache(), delete_last_cache(), instead relying on the apply_catalog_batch() method. @hiltontj, can you take a look to make sure this was done safely?

@pauldix
Copy link
Member

pauldix commented Dec 19, 2024

PR and commit subject should be fix: consistent ordering of catalog operations with the space after the :

@jacksonrnewhouse jacksonrnewhouse changed the title bugfix:consistent ordering of catalog operations fix(catalog): consistent ordering of catalog operations Dec 19, 2024
Copy link
Member

@pauldix pauldix left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, but I'd like @hiltontj to have a look too.

Copy link
Contributor

@hiltontj hiltontj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me @jacksonrnewhouse - removing the create/delete methods on the catalog for the last/meta caches looks a lot cleaner and also looks sound to me.

👍 on having all changes go through a single point (apply_catalog_batches).

I left a couple comments on things I noticed but they are not blockers for this PR.

influxdb3_write/src/persister.rs Outdated Show resolved Hide resolved
influxdb3_write/src/write_buffer/mod.rs Outdated Show resolved Hide resolved
@jacksonrnewhouse jacksonrnewhouse merged commit 0db71b6 into main Dec 20, 2024
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

v3 Race condition on schema creation
3 participants