rgw/sfs: linear performance degradation with the number of objects in buckets #203
Comments
This is likely one of the culprits for the performance numbers being so crappy right now. One path forward may simply be caching objects in memory instead of refreshing all the time, invalidating only when objects are changed (or deleted, depending on whether versioning is enabled). The cache can also cap the number of objects in memory and use a basic LRU policy to evict an entry when a missing object needs to be promoted into a full cache. We may not even have to refresh when objects are added or changed; we may just need to add a new object or update an existing one. There's a lot of room to be smarter here.
I think defining a maximum number of in-memory items per bucket is the preferable approach (maybe something configurable).
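The caching idea from the two comments above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the sfs implementation: `ObjectMeta`, `BucketObjectCache`, and the `max_items` limit are hypothetical names, and the real object metadata will look different. It shows a per-bucket cache with a configurable cap, in-place updates instead of a full refresh, LRU eviction when the cache is full, and single-entry invalidation on delete.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for whatever per-object metadata sfs keeps.
struct ObjectMeta {
  std::string etag;
  std::size_t size = 0;
};

// Bounded LRU cache of object metadata for a single bucket (assumed design).
class BucketObjectCache {
 public:
  explicit BucketObjectCache(std::size_t max_items) : max_items_(max_items) {
    assert(max_items_ > 0);
  }

  // Insert or update in place; no full refresh of the bucket is needed.
  void put(const std::string& name, ObjectMeta meta) {
    auto it = index_.find(name);
    if (it != index_.end()) {
      it->second->second = std::move(meta);
      lru_.splice(lru_.begin(), lru_, it->second);  // mark most recently used
      return;
    }
    if (lru_.size() == max_items_) {  // full: evict the least recently used
      index_.erase(lru_.back().first);
      lru_.pop_back();
    }
    lru_.emplace_front(name, std::move(meta));
    index_[name] = lru_.begin();
  }

  // Returns the cached entry and promotes it, or nullopt on a miss.
  std::optional<ObjectMeta> get(const std::string& name) {
    auto it = index_.find(name);
    if (it == index_.end()) return std::nullopt;
    lru_.splice(lru_.begin(), lru_, it->second);
    return it->second->second;
  }

  // Invalidate a single entry, e.g. when the object is deleted.
  void invalidate(const std::string& name) {
    auto it = index_.find(name);
    if (it == index_.end()) return;
    lru_.erase(it->second);
    index_.erase(it);
  }

  std::size_t size() const { return lru_.size(); }

 private:
  using Entry = std::pair<std::string, ObjectMeta>;
  std::size_t max_items_;
  std::list<Entry> lru_;  // front = most recently used
  std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

On a miss, the caller would load the object from the backing store and `put` it, which is where the eviction kicks in if the bucket's cache is already at its limit.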
Before we start working on implementing performance improvements, we want to set up a call and plan this properly.
https://aquarist-labs.github.io/s3gw-perf-reports/reports/release_comprehensive_v0.14.0.html
Discussion has moved on since December (last comment). Summary: aquarist-labs/ceph#124 refactored and removed the object cache. The plan is to rewrite list bucket as SQL queries. Right now it basically gets everything from the database and filters the results. I'm working on it.
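To make the difference between "get all and filter" and a filtered query concrete, here is a small sketch. It uses a sorted `std::map` as a stand-in for an indexed table and does not show the actual SQL or the sfs schema; `ObjectIndex`, `list_by_full_scan`, and `list_by_range` are hypothetical names. The range version corresponds roughly to a query that filters on the object name, orders by it, and limits the result count, so the work is bounded by the result size rather than by the bucket size.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for an indexed objects table: object name -> size (assumption).
using ObjectIndex = std::map<std::string, std::size_t>;

// "Get all and filter": visits every object in the bucket, then truncates.
std::vector<std::string> list_by_full_scan(const ObjectIndex& objects,
                                           const std::string& prefix,
                                           std::size_t max_keys) {
  std::vector<std::string> out;
  for (const auto& kv : objects) {
    if (kv.first.compare(0, prefix.size(), prefix) == 0)
      out.push_back(kv.first);
  }
  if (out.size() > max_keys) out.resize(max_keys);
  return out;
}

// Range query: seek to the first key >= prefix, then stop at the first
// non-matching key or once max_keys results have been collected.
std::vector<std::string> list_by_range(const ObjectIndex& objects,
                                       const std::string& prefix,
                                       std::size_t max_keys) {
  std::vector<std::string> out;
  for (auto it = objects.lower_bound(prefix);
       it != objects.end() && out.size() < max_keys; ++it) {
    if (it->first.compare(0, prefix.size(), prefix) != 0) break;
    out.push_back(it->first);
  }
  return out;
}
```

Both functions return the same names in the same order; only the number of entries visited differs.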
@irq0 is this issue still valid?
It's not solved, but enough changed to warrant a new issue. Closing in favor of #509
`Bucket::_refresh_objects()` is a very frequently called function in `sfs`, and it iterates over all the objects held by a bucket. This exposes `sfs` to an inherent linear complexity, directly proportional to the number of objects held by buckets.

I ran a test filling a bucket with 10k objects, and the latency increases significantly.

Every time `list_buckets` is called, there is a latency on the order of seconds.

We should discuss how to avoid the `_refresh_objects` call in the `sfs::Bucket` class.

More generally, `_refresh_objects` is an inherently slow, linear operation and we should see if we can improve this design. This also applies to the `_refresh_buckets` function (it is less exposed, but the problem is the same).
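As one possible direction for avoiding a full refresh on every call, the sketch below reloads the in-memory object map only when the backing store reports a change. Everything here is an assumption made for illustration: `FakeStore`, its `generation` counter, and `LazyBucketView` are hypothetical and do not correspond to existing sfs classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical backing store that bumps a generation counter on every write.
struct FakeStore {
  std::map<std::string, std::size_t> objects;
  std::uint64_t generation = 0;

  void put(const std::string& name, std::size_t size) {
    objects[name] = size;
    ++generation;
  }
};

// Bucket view that reloads its object map only when the store has changed.
class LazyBucketView {
 public:
  explicit LazyBucketView(const FakeStore& store) : store_(store) {}

  std::size_t object_count() {
    refresh_if_stale();
    return cached_.size();
  }

  std::size_t full_refreshes() const { return full_refreshes_; }

 private:
  void refresh_if_stale() {
    // Cheap O(1) check on the common path where nothing has changed.
    if (loaded_ && seen_generation_ == store_.generation) return;
    cached_ = store_.objects;  // O(n) copy, only after a change
    seen_generation_ = store_.generation;
    loaded_ = true;
    ++full_refreshes_;
  }

  const FakeStore& store_;
  std::map<std::string, std::size_t> cached_;
  std::uint64_t seen_generation_ = 0;
  bool loaded_ = false;
  std::size_t full_refreshes_ = 0;
};
```

With this shape, repeated reads of an unchanged bucket cost a single integer comparison each, and the linear reload is paid once per change rather than once per call.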