DataShard: read iterators may have a broken external reference precharging #7769

snaury · 2024-08-14T09:27:22Z

When reviewing code for #7674 I noticed that LastProcessedKey value when precharging missing references may be incorrect. The problem is that @SammyVimes changed the LastProcessedKeyErased to LastProcessedKeyErasedOrMissing, which seemed ok at first glance. In reality this flag is not serialized in responses and may cause rows to be missing in the result.

Previously LastProcessedKeyErased was used as a hack/optimization when resuming internally, to force erased sub-ranges to overlap and have a side-effect of stitching them together. The result remains correct even when query is restarted externally with the last reported continuation token's key. When the query is restarted externally this flag is lost and effectively becomes false, which is totally fine because erased rows are skipped and the last erased row was already accounted for. Resuming with this flag set to false is correct (in a sense that the same row is not visited twice), though may produce sub-optimal caching behavior.

Note however that when we start "precharing" missing references we are positioned on the result row, which is not part of the result yet. When LastProcessedKey is set to that row's key and we already have some rows processed, we would produce a result that marks this row as "processed", when in reality it is not. After a temporary network failure the read actor will restart the query externally from this last known key, which would cause it to skip the row. This row, however, has never been part of the result, and now it never will.

Since iterator cannot travel in time and the last known key is already updated, we need to always page fault when encountering missing references. The LastProcessedKeyErased flag can only be used as an erased range stitching hack.

The text was updated successfully, but these errors were encountered:

snaury added bug Something isn't working area/datashard Issues related to datashard tablets (relational table partitions) labels Aug 14, 2024

snaury self-assigned this Aug 14, 2024

snaury mentioned this issue Aug 14, 2024

[[s3 perf]]: make precharge work with ext-blobs #6486

Closed

snaury mentioned this issue Sep 2, 2024

Fix external reference precharging in datashard read iterators #8592

Merged

snaury closed this as completed in #8592 Sep 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

DataShard: read iterators may have a broken external reference precharging #7769

DataShard: read iterators may have a broken external reference precharging #7769

snaury commented Aug 14, 2024

DataShard: read iterators may have a broken external reference precharging #7769

DataShard: read iterators may have a broken external reference precharging #7769

Comments

snaury commented Aug 14, 2024