Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-work bucket naming rules. #1279

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions S3/Config.py
Original file line number Diff line number Diff line change
Expand Up @@ -133,6 +133,8 @@ class Config(object):
force = False
server_side_encryption = False
enable = None
# Used to allow colons in bucket names for Ceph compatibility
bucket_name_quirks = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like so much "quirks" and to restrict that to colons and Ceph.

Maybe you can call that something like: "bucket_name_extended", "extended_bucket_name", "bucket_name_relaxed", "bucket_name_lenient".

or invert the condition like:
"bucket_name_strict = True",

And in the comment say that it relax rules for bucket name for S3 compatible services that supports path style requests.

get_continue = False
put_continue = False
upload_id = u""
Expand Down
24 changes: 14 additions & 10 deletions S3/S3.py
Original file line number Diff line number Diff line change
Expand Up @@ -407,17 +407,21 @@ def bucket_create(self, bucket, bucket_location = None, extra_headers = None):
headers.update(extra_headers)

body = ""
if bucket_location and bucket_location.strip().upper() != "US" and bucket_location.strip().lower() != "us-east-1":
bucket_location = bucket_location.strip()
if bucket_location.upper() == "EU":
bucket_location = bucket_location.upper()
body = "<CreateBucketConfiguration><LocationConstraint>"
body += bucket_location
body += "</LocationConstraint></CreateBucketConfiguration>"
debug("bucket_location: " + body)
check_bucket_name(bucket, dns_strict = True)
if self.config.bucket_name_quirks:
# We are explicitly not AWS
check_bucket_name(bucket, dns_strict = False, name_quirks = True)
else:
check_bucket_name(bucket, dns_strict = False)
if bucket_location:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CreateBucketConfiguration should be set in all cases, that we use "name quirks" or not.

For the moment, keep the existing code but remove the "check_bucket_name" of this block.
(In the future, I will change the EU/US thing to avoid some specific issues, but it can be done after).

After the previous block, add a new block of code:

if self.config.bucket_name_quirks:
    checkbucket(dns_strict = False)
else:
   check_bucket(dns_strict= True)

# We follow AWS rules
bucket_location = bucket_location.strip()
if bucket_location.upper() == "EU":
bucket_location = bucket_location.upper()
body = "<CreateBucketConfiguration><LocationConstraint>"
body += bucket_location
body += "</LocationConstraint></CreateBucketConfiguration>"
debug("bucket_location: " + body)
check_bucket_name(bucket, dns_strict = True, name_quirks = False)

if self.config.acl_public:
headers["x-amz-acl"] = "public-read"

Expand Down
56 changes: 40 additions & 16 deletions S3/Utils.py
Original file line number Diff line number Diff line change
Expand Up @@ -234,8 +234,30 @@ def time_to_epoch(t):
raise S3.Exceptions.ParameterError('Unable to convert %r to an epoch time. Pass an epoch time. Try `date -d \'now + 1 year\' +%%s` (shell) or time.mktime (Python).' % t)


def check_bucket_name(bucket, dns_strict=True):
if dns_strict:
def check_bucket_name(bucket, dns_strict=True, name_quirks=False):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that us-east-1 also doesn't support the previous set of extended chars, I think that it is ok to not add your "name_quirks" and just add the ":" inside the char list checked when not in "dns_strict" mode.

btw, can you escape the "-" inside the regexp, to be able to have it before the additional characters, and also avoid errors in the future?

"""Check that the bucket name is valid for our situation.

Check against the rules from: https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucketnamingrules.html

dns_strict: True means follow above rules exactly. False allows
for relaxed naming conventions that existed in use-east-1 prior to
March 1 2018.

name_quirks: If true allow compatibility with services implimenting
the S3 API but are not fully bound by the S3 rules.

"""

# name_quirks has priority over dns_strict. So instead of a 4-way
# comparison we can get away with a 3 way one.
max_length = 255
if name_quirks:
invalid = re.search("([^A-Za-z0-9\._:-])", bucket, re.UNICODE)
if invalid:
raise S3.Exceptions.ParameterError("Bucket name '%s' contains disallowed character '%s'. The only supported ones are: us-ascii letters (a-z, A-Z), digits (0-9), dot (.), hyphen (-), colon (:), and underscore (_)." % (bucket, invalid.groups()[0]))
elif dns_strict:
# This is the default
max_length = 63
invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
if invalid:
raise S3.Exceptions.ParameterError("Bucket name '%s' contains disallowed character '%s'. The only supported ones are: lowercase us-ascii letters (a-z), digits (0-9), dot (.) and hyphen (-)." % (bucket, invalid.groups()[0]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you can profit of your PR to change the following in this file:
import S3.Exceptions into from S3.Exceptions import ParameterError

And so simplify all of its usage here.

Expand All @@ -244,27 +266,29 @@ def check_bucket_name(bucket, dns_strict=True):
if invalid:
raise S3.Exceptions.ParameterError("Bucket name '%s' contains disallowed character '%s'. The only supported ones are: us-ascii letters (a-z, A-Z), digits (0-9), dot (.), hyphen (-) and underscore (_)." % (bucket, invalid.groups()[0]))

# The above block pre-filters some things. But the lower stuff has to
# be more permissive to allow for what's allowed in the least restrictive
# filters above.

if len(bucket) < 3:
raise S3.Exceptions.ParameterError("Bucket name '%s' is too short (min 3 characters)" % bucket)
if len(bucket) > 255:
raise S3.Exceptions.ParameterError("Bucket name '%s' is too long (max 255 characters)" % bucket)
if dns_strict:
if len(bucket) > 63:
raise S3.Exceptions.ParameterError("Bucket name '%s' is too long (max 63 characters)" % bucket)
if re.search("-\.", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must not contain sequence '-.' for DNS compatibility" % bucket)
if re.search("\.\.", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must not contain sequence '..' for DNS compatibility" % bucket)
if not re.search("^[0-9a-z]", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must start with a letter or a digit" % bucket)
if not re.search("[0-9a-z]$", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must end with a letter or a digit" % bucket)
if len(bucket) > max_length:
raise S3.Exceptions.ParameterError("Bucket name '%s' is too long (max %d characters)" % (bucket, max_length))
if re.search("-\.", bucket, re.UNICODE):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can profit of the rework to simplify this to:
if '-.' in bucket

raise S3.Exceptions.ParameterError("Bucket name '%s' must not contain sequence '-.' for DNS compatibility" % bucket)
if re.search("\.\.", bucket, re.UNICODE):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same a previous one, you could probably simplify that in:
if '..' in bucket

And maybe merge the 2 tests in fact!

raise S3.Exceptions.ParameterError("Bucket name '%s' must not contain sequence '..' for DNS compatibility" % bucket)
if not re.search("^[0-9a-zA-Z]", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must start with a letter or a digit" % bucket)
if not re.search("[0-9a-zA-Z]$", bucket, re.UNICODE):
raise S3.Exceptions.ParameterError("Bucket name '%s' must end with a letter or a digit" % bucket)
return True
__all__.append("check_bucket_name")


def check_bucket_name_dns_conformity(bucket):
try:
return check_bucket_name(bucket, dns_strict = True)
return check_bucket_name(bucket, dns_strict = True, name_quirks = False)
except S3.Exceptions.ParameterError:
return False
__all__.append("check_bucket_name_dns_conformity")
Expand Down