Set minimum buffer size of BytesPool to maxChunk size #1581
Conversation
Signed-off-by: Jens Hausherr <jens.hausherr@xing.com>
Force-pushed from fca27cf to 92dfdb4 (Compare)
I think this makes sense, thanks!
This is what we end up with:
Had: Sizes: [200000 400000 800000 1600000 3200000 6400000 12800000 25600000] overall: 8
Now: Sizes: [16000 32000 64000 128000 256000 512000 1024000 2048000 4096000 8192000 16384000 32768000] overall: 12
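The before/after size classes above follow from the pool's growth rule: start at the minimum buffer size and multiply by the growth factor until the maximum buffer size would be exceeded. A minimal sketch of that rule, assuming a factor of 2 and a ~50 MB maximum buffer size (both inferred from the lists above, not taken from the actual store configuration):

```go
package main

import "fmt"

// bucketSizes sketches how a bucketed bytes pool derives its size classes:
// start at minSize and keep multiplying by factor while staying <= maxSize.
// Parameter names are illustrative, not the exact Thanos pool API.
func bucketSizes(minSize, maxSize, factor int) []int {
	var sizes []int
	for s := minSize; s <= maxSize; s *= factor {
		sizes = append(sizes, s)
	}
	return sizes
}

func main() {
	// Before: 200 kB minimum buffer -> 8 size classes (the "Had" list above).
	fmt.Println(bucketSizes(200_000, 50_000_000, 2))
	// After: 16 kB minimum buffer (the max chunk size) -> 12 size classes (the "Now" list).
	fmt.Println(bucketSizes(16_000, 50_000_000, 2))
}
```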
The more sizes we have, the more it defeats the purpose of sync.Pool indeed. But for queries that use lots of single chunks (especially for downsampled data), having smaller pools might make sense. Any chance for some benchmarks before & after? (: In terms of memory usage for the same requests?
Otherwise LGTM 👍
The memory usage is only positively affected, as described in the linked issue. The main issue is that the individual chunk files are fetched in parallel in 16 kB parts when the chunk is read. So this one would be fetched in ~1,244 concurrent requests; each downloader requests 16 kB from the byte pool, which in the previous version caused it to grab 248.75 MB from the byte pool (because each 16 kB request was allocated the smallest buffer size: 200 kB). With the changed version it is only 19,904 kB (smallest buffer: 16 kB).
The issue appears primarily when querying, as far as I can tell, because the byte chunks are allocated in Line 1272 in 4c7ecd3
and only returned when the reader is closed in Line 1887 in 4c7ecd3.
For some of our index chunk files of 500+ MB, this amounted to a required pool size of 6.25 GB to successfully read that chunk file, which blew our max BytesPool size of 4 GB, resulting in the failure described in the linked issue. With the change it 'only' allocates the chunk size (plus a few bytes if the chunk is not evenly divisible by 16 kB) for reading the 500 MB chunk, which is a lot better and fixes the issue for us for now.
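To make the arithmetic above concrete: with the old 200 kB minimum, every 16 kB part request pins a full 200 kB buffer, so the pooled memory scales with the number of concurrent parts rather than with the chunk file size. A rough sketch under that assumption (function and constant names are illustrative, not Thanos code; results are approximate):

```go
package main

import "fmt"

const partSize = 16_000 // each chunk file is fetched in 16 kB parts

// pooledBytes sketches the pool-side cost of reading a chunk file of fileSize
// bytes when every 16 kB part request is served from a buffer of minBuf bytes
// (the smallest size class that fits the request).
func pooledBytes(fileSize, minBuf int) int {
	parts := (fileSize + partSize - 1) / partSize // ceiling division: number of part fetches
	return parts * minBuf
}

func main() {
	// ~19.9 MB chunk file from the comment above (1,244 parts of 16 kB).
	fmt.Println(pooledBytes(1_244*partSize, 200_000)) // old 200 kB minimum: ~249 MB reserved from the pool
	fmt.Println(pooledBytes(1_244*partSize, 16_000))  // new 16 kB minimum: ~19.9 MB

	// 500 MB index chunk file: 31,250 parts * 200 kB = 6.25 GB, exceeding a 4 GB pool limit.
	fmt.Println(pooledBytes(500_000_000, 200_000))
	fmt.Println(pooledBytes(500_000_000, 16_000)) // with the change: ~500 MB
}
```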
OK, I think there are a few misunderstandings, but overall this fix makes sense (:
Partitioning makes this really unlikely; however, you can still exhaust the pool with multiple specific queries at the same time.
Yes - which is done once the requests finish so it should be fine (?)
That makes sense, but I guess increasing the limit (and memory) will work.
This means to me that your queries are hitting small chunks a lot, as I mentioned before.
Signed-off-by: Jens Hausherr <jens.hausherr@xing.com>
Signed-off-by: Giedrius Statkevičius <giedriuswork@gmail.com>
This PR fixes #1580
It fixes the immediate issue, but it could be worth discussing whether pool.BytesPool is necessary at all or could be replaced by simpler, direct usage of sync.Pool.
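For reference, the simpler alternative mentioned above could look roughly like this: a single sync.Pool of fixed-size part buffers instead of bucketed size classes. This is only a hypothetical sketch (package and function names are invented), and it notably drops the overall size accounting that pool.BytesPool provides:

```go
package chunkbuf

import "sync"

// partBufPool is a single pool of 16 kB part buffers. Unlike the bucketed
// BytesPool there are no size classes and no cap on total pooled memory.
var partBufPool = sync.Pool{
	New: func() interface{} {
		b := make([]byte, 0, 16_000) // capacity of one 16 kB fetch part
		return &b
	},
}

// getBuf returns a reusable buffer for one part fetch.
func getBuf() *[]byte {
	return partBufPool.Get().(*[]byte)
}

// putBuf resets the buffer and returns it to the pool.
func putBuf(b *[]byte) {
	*b = (*b)[:0]
	partBufPool.Put(b)
}
```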
Changes
Verification
The store no longer exhausts its byte pool size on larger buckets.