Dense reader fails early when tile offsets are too large. #5310
Conversation
This change allows us to accurately compute the memory usage for tile offsets before they are loaded. We can then throw an error before doing any work if the tile offsets are too large for the memory budget. This will enable us to fail faster on REST when performing large queries and to retry quickly. Note that the memory usage for tile offsets is computed per fragment, which will ease the implementation of partial tile offsets loading for the dense reader.

[sc-55116]

---

TYPE: IMPROVEMENT
DESC: Dense reader fails early when tile offsets are too large.
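The shape of that early check, as a minimal standalone sketch. The function and parameter names below are illustrative, not the actual ReaderBase members:

```cpp
#include <cstdint>
#include <numeric>
#include <stdexcept>
#include <vector>

// Sketch only: sum the per-fragment tile offset sizes and throw before any
// offsets are loaded if they cannot fit in the remaining budget.
void check_tile_offsets_budget(
    const std::vector<uint64_t>& per_fragment_offset_sizes,
    uint64_t memory_budget,
    uint64_t memory_in_use) {
  const uint64_t total = std::accumulate(
      per_fragment_offset_sizes.begin(),
      per_fragment_offset_sizes.end(),
      static_cast<uint64_t>(0));
  if (total > memory_budget - memory_in_use) {
    throw std::runtime_error(
        "Cannot load tile offsets: the memory budget is too small");
  }
}
```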
@@ -145,6 +147,134 @@ bool ReaderBase::skip_field(
/*       PROTECTED METHODS        */
/* ****************************** */

std::vector<uint64_t> ReaderBase::tile_offset_sizes() {
Note to reviewers: This was moved almost as-is from SparseIndexReaderBase. I have only added two "if (!dense) {" checks.
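For readers not familiar with the sparse readers, a rough sketch of what a per-fragment computation of this shape looks like. The FragmentInfo structure and the field accounting are illustrative assumptions, not the actual fragment metadata API:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified fragment description for illustration only.
struct FragmentInfo {
  uint64_t tile_num;      // Number of tiles in the fragment.
  uint64_t fixed_fields;  // Fields with fixed-size tile offsets.
  uint64_t var_fields;    // Fields that also carry var-size tile offsets.
  bool dense;             // Dense fragments have no coordinate tiles.
};

// One size entry per fragment, so that partial loading can later budget the
// offsets fragment by fragment.
std::vector<uint64_t> tile_offset_sizes(
    const std::vector<FragmentInfo>& fragments) {
  std::vector<uint64_t> ret;
  ret.reserve(fragments.size());
  for (const auto& f : fragments) {
    // Every tile offset is a uint64_t; var-size fields carry an extra
    // offsets vector on top of the fixed one.
    uint64_t per_tile =
        (f.fixed_fields + 2 * f.var_fields) * sizeof(uint64_t);
    if (!f.dense) {
      // Sparse fragments also store offsets for coordinate tiles; the real
      // code handles this per dimension, folded into one term here.
      per_tile += sizeof(uint64_t);
    }
    ret.push_back(f.tile_num * per_tile);
  }
  return ret;
}
```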
    per_frag_tile_offsets_usage_.end(),
    static_cast<uint64_t>(0));
if (total_tile_offsets_sizes >
    memory_budget_ - array_memory_tracker_->get_memory_usage()) {
In the other readers we compare against array_memory_tracker_->get_memory_available(). Is this case somehow different, so that we need to do the calculation here?
Indeed, the get_memory_usage() function does not subtract the tile offsets from the new "counter" memory tracking, and the old tracking system doesn't account for them either. However, array_memory_tracker_->get_memory_available() does.
This is desired. There are some memory structures allocated for tile offsets at this time and we don't want them to be included.
I thought the only structures allocated for tile offsets memory were the counters.
Other readers have tile offsets loaded at the time they are ready to process tiles.
"I thought the only structure allocated for tile offsets memory is the counters"
Yes they are, but get_memory_available() will subtract that counter from the available memory. Since our calculation include everything, we should not remove the tile offsets counter from the available memory.
Why don't we want to subtract it from the available memory? Isn't that memory taken by some of our structures, so it cannot be used for loading offsets? Why are we tracking those structures then?
If that's just some confusing design we would rather not address as part of this ticket, let's make sure we address it in a follow-up.
We don't subtract it from the available memory because it's already included in total_tile_offsets_sizes; we don't want to count it twice. There are already stories to reconcile everything and get rid of the legacy memory tracker, which confuses everything. But before we can do that, we need to complete partial tile offsets loading for all readers, as well as partial rtree loading.
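To make the double-counting point concrete, here is a small self-contained illustration with made-up numbers; the real readers go through the MemoryTracker API, which this sketch does not model:

```cpp
#include <cstdint>
#include <iostream>

int main() {
  // Hypothetical numbers, chosen so that double counting flips the result.
  const uint64_t memory_budget = 1000;
  const uint64_t other_usage = 250;         // Memory used by everything else.
  const uint64_t offsets_counter = 100;     // Offset structures already allocated.
  const uint64_t total_offsets_size = 700;  // Full footprint, includes the 100.

  // Check used in this PR: compare against budget minus usage, since the
  // total footprint already accounts for the allocated offset structures.
  std::cout << "without double counting: "
            << (total_offsets_size > memory_budget - other_usage ? "fail" : "pass")
            << "\n";  // pass: 700 <= 750

  // An "available" figure that also subtracts the offsets counter counts the
  // already-allocated structures twice and rejects a query that fits.
  std::cout << "with double counting:    "
            << (total_offsets_size > memory_budget - other_usage - offsets_counter
                    ? "fail"
                    : "pass")
            << "\n";  // fail: 700 > 650
  return 0;
}
```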
// Try to read.
int data_r[NUM_CELLS] = {0};
uint64_t data_r_size = sizeof(data_r);
Can you add some more comments to make the test clearer? I don't understand the calculations here.
Also, can we add a less trivial test where we write more than one fragment?
What computation? This just computes the size of the user buffer.
I meant the calculations in the test. It wasn't evident to untrained eyes why a budget of 600 previously was not enough. Thank you for adding those comments.
One last question: does tile_upper_memory_limit_ play a role here? Could it be completely omitted so that we test what's actually the case, i.e. what happens when we only set total_budget and leave that one at its default?
It could be removed from there.
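A hedged sketch of what restricting only the total budget could look like with the TileDB C++ API; the config key name and the value of 600 are assumptions based on the discussion above, and the actual test may use the C API with different settings:

```cpp
#include <tiledb/tiledb>

int main() {
  // Assumed config key name; only the total budget is restricted while the
  // tile upper memory limit is left at its default.
  tiledb::Config cfg;
  cfg.set("sm.mem.total_budget", "600");
  tiledb::Context ctx(cfg);

  // A dense read submitted through this context would then be expected to
  // fail early, before any tile offsets are loaded, with a budget error.
  (void)ctx;
  return 0;
}
```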
Let's also run a nightly build with the new tests. I have seen them fail when dealing with memory budgets.
/backport to release-2.26
Started backporting to release-2.26: https://github.com/TileDB-Inc/TileDB/actions/runs/10884706119