fix(catalog): consistent ordering of catalog operations #25690
Merged
+367
−223
Closes #25667
In looking at addressing #25667 I found a class of race conditions caused by inconsistencies between the order in which catalog operations are applied to the in-memory Catalog and the order in which they are written to the WAL. For example, if write calls A and B are both writing to table `foo`, one might create the table while the other adds a field to it. If the order of the WAL calls differs from the order of the original invocations, the system will fail on restoration, as it'll receive a field addition for a table it hasn't yet created.

The new approach is for the WalOps to take an `OrderedCatalogBatch`, which will have been produced by calling `apply_catalog_batch`. We then sort the WalOps by a newly defined order, putting the catalog ops first, sorted by catalog_sequence_number.

Following on the work in #25642, I've removed the dedicated methods `add_meta_cache()`, `remove_meta_cache()`, `add_last_cache()`, and `delete_last_cache()`, relying instead on the apply_catalog_batch() method. @hiltontj, can you take a look to make sure this was done safely?
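
For reviewers who want the ordering rule in one place, here is a minimal sketch of the sort described above. It is not the code in this PR: the `WalOp` enum, its fields, and the `wal_op_order` and `sort_wal_ops` helpers are simplified, hypothetical stand-ins, and the only behavior taken from the description is "catalog ops first, sorted by catalog_sequence_number".

```rust
use std::cmp::Ordering;

// Hypothetical, simplified stand-in for the real WAL op type.
// Only the catalog sequence number matters for the ordering.
#[derive(Debug, Clone, PartialEq, Eq)]
enum WalOp {
    // A catalog change, stamped with the sequence number it received
    // when it was applied to the in-memory catalog.
    Catalog { catalog_sequence_number: u64, description: String },
    // A data write (assumed here to carry no catalog sequence number).
    Write { table: String },
}

// Catalog ops sort before writes; catalog ops are ordered among
// themselves by catalog_sequence_number.
fn wal_op_order(a: &WalOp, b: &WalOp) -> Ordering {
    match (a, b) {
        (
            WalOp::Catalog { catalog_sequence_number: x, .. },
            WalOp::Catalog { catalog_sequence_number: y, .. },
        ) => x.cmp(y),
        (WalOp::Catalog { .. }, WalOp::Write { .. }) => Ordering::Less,
        (WalOp::Write { .. }, WalOp::Catalog { .. }) => Ordering::Greater,
        (WalOp::Write { .. }, WalOp::Write { .. }) => Ordering::Equal,
    }
}

// `sort_by` is a stable sort, so writes keep their relative order.
fn sort_wal_ops(ops: &mut [WalOp]) {
    ops.sort_by(wal_op_order);
}
```

With this rule, a "create table foo" op with sequence number 1 is always replayed before an "add field to foo" op with sequence number 2, regardless of which write call reached the WAL first.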