too many folders in root of bucket #7

Open
bunnybones1 opened this issue Nov 30, 2019 · 4 comments

Comments

@bunnybones1

During a quick inspection with Cyberduck, the bucket root took a long time to list (with warnings):
[Screenshot: Screen Shot 2019-11-30 at 12 37 06 AM]
It might be prudent to chop our prefixes into 2-character chunks, inspired by how git organizes objects:
[Screenshot: Screen Shot 2019-11-30 at 12 40 17 AM]
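
For illustration, the git-style layout suggested here could look roughly like the sketch below. It is only a sketch: the function name `git_style_key`, the use of SHA-1 over the file name, and the hex encoding are all assumptions made for the example, not what this project actually does.

```python
import hashlib


def git_style_key(name: str, chunk: int = 2, depth: int = 1) -> str:
    """Build an object key whose leading directories are short hash chunks.

    Hypothetical helper: hashes `name` with SHA-1 (an assumption) and splits
    the hex digest the way git lays out .git/objects, e.g. "ab/cdef...".
    """
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()
    # The first `depth` chunks of `chunk` characters become directory levels.
    parts = [digest[i * chunk:(i + 1) * chunk] for i in range(depth)]
    # The rest of the digest is the object name inside the last directory.
    parts.append(digest[depth * chunk:])
    return "/".join(parts)


if __name__ == "__main__":
    print(git_style_key("photo.jpg"))           # one 2-char directory level
    print(git_style_key("photo.jpg", depth=2))  # two nested 2-char levels
```

With hex digests and one 2-character level, the bucket root would hold at most 256 folders instead of one folder per object prefix.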

@pkieltyka
Member

alternatively, instead of making it a prefix hash, we could make it a suffix hash on the filename/ext...

cc: @dencoded @c2h5oh

@c2h5oh
Member

c2h5oh commented Dec 6, 2019

@pkieltyka

  • Suffix is a no-go. S3 separates files into shards based on the file path prefix and is not smart about it: it splits into equal character ranges, not equal file counts.
    Each shard supports up to 100 IOPS.

  • Using 1-2 chars of the file hash as the top-level directory does work, but remember it's base64 and not hex, so 1 char = 64 directories, 2 chars = 4096 directories.
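
The directory counts quoted above are just alphabet size raised to the number of prefix characters. A rough sketch of that arithmetic, plus one way a base64 prefix might be cut from a hash, follows. The names `fanout` and `base64_prefixed_key`, the choice of SHA-256, and the URL-safe alphabet are assumptions for the example; the URL-safe variant is used because standard base64 contains `/`, which would introduce extra path separators.

```python
import base64
import hashlib


def fanout(alphabet_size: int, chars: int) -> int:
    """Number of distinct top-level directories for a prefix of `chars` symbols."""
    return alphabet_size ** chars


def base64_prefixed_key(name: str, chars: int = 2) -> str:
    """Hypothetical helper: use the first `chars` base64 symbols as the directory.

    Assumes SHA-256 of the name and URL-safe base64 ("-" and "_" instead of
    "+" and "/"), with "=" padding stripped.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    encoded = base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")
    return f"{encoded[:chars]}/{encoded[chars:]}"


if __name__ == "__main__":
    print(fanout(64, 1), fanout(64, 2))  # base64: 1 char vs 2 chars
    print(fanout(16, 2))                 # hex, for comparison
    print(base64_prefixed_key("photo.jpg"))
```

Note that base64 is case-sensitive, so tools that list keys on a case-insensitive file system may show fewer distinct folders than the bucket really has.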

@dencoded
Contributor

dencoded commented Dec 6, 2019

btw, they claimed to have fixed this issue with path prefix/shard capacity, but people are still complaining, so I wouldn't trust 100% that they fixed it

@c2h5oh
Member

c2h5oh commented Dec 9, 2019

@dencoded when? I know that 14 months ago that was still the case (I was doing a large S3 migration with almost 16M bucket objects).


4 participants