Proposal: Suggest uppercase #3

Open
mv-i22 opened this issue Feb 19, 2018 · 7 comments

Comments

mv-i22 commented Feb 19, 2018

I just saw that some ULID implementations provide uppercase-only ULIDs, while others (like the PHP implementation by @robinvdvleuten) provide lowercase ULIDs. The specification does not yet impose uppercase or lowercase but states that ULID is "case-insensitive". This is a great feature.

Nevertheless, I'd like to propose suggesting Uppercase ULID as "the right way". Mainly for two reasons:

  • Most of the libraries provide uppercase ULIDs. The existing libraries will probably become the "standard solution" for their language (given that they are well designed, tested and documented).
  • Having different solutions in different languages could lead to problems in multi-language environments. Consider a database that is accessed by several projects written in different languages. Or consider regex validation, which is often written to match the output of one library and may then reject ULIDs produced by other libraries or languages.
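The validation concern in the second bullet can be illustrated with a case-insensitive pattern. This is a minimal sketch in Python, assuming the 26-character Crockford Base32 form with a first character no higher than 7; the names `ULID_PATTERN` and `is_valid_ulid` are made up for this example and do not come from any existing library.

```python
import re

# Crockford Base32 alphabet: digits plus letters, without I, L, O and U.
# The first character is limited to 0-7 so the value fits in 128 bits.
# re.IGNORECASE makes the check accept upper-, lower- and mixed-case input.
ULID_PATTERN = re.compile(r"[0-7][0-9A-HJKMNP-TV-Z]{25}", re.IGNORECASE)


def is_valid_ulid(text: str) -> bool:
    """Return True if text looks like a ULID, regardless of letter case."""
    return ULID_PATTERN.fullmatch(text) is not None


upper_ok = is_valid_ulid("01ARZ3NDEKTSV4RRFFQ69G5FAV")
lower_ok = is_valid_ulid("01arz3ndektsv4rrffq69g5fav")
too_large = is_valid_ulid("8ZZZZZZZZZZZZZZZZZZZZZZZZZ")
```

A pattern written without the case-insensitive flag would accept only one of the first two inputs, which is exactly the cross-library mismatch described above.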

I understand that this is debatable, as being flexible in your setup is a strength. But I also think that having either uppercase or lowercase as the proposed (or imposed) way to implement ULIDs will help the spec to spread, because there is less potential for conflicts.

What do you think of this?

openthc commented Jul 18, 2018

Bullet point 5 of ULID spec reads, in part: "Uses Crockford's base32".
Crockford Spec reads, in part: "When decoding, upper and lower case letters are accepted, and i and l will be treated as 1 and o will be treated as 0. When encoding, only upper case letters are used."
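The asymmetry in the Crockford text quoted above (lenient on decode, uppercase only on encode) can be sketched as follows. This is an illustration only, not the code of any particular ULID library; the function names `canonicalize`, `decode` and `encode` are assumptions made for this example.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

# Decoding aliases from the quoted text: I and L read as 1, O reads as 0.
_ALIASES = str.maketrans("ILO", "110")


def canonicalize(text: str) -> str:
    """Fold case and map the look-alike letters to their digits."""
    return text.upper().translate(_ALIASES)


def decode(text: str) -> int:
    """Decode a symbol string to an integer, accepting either case."""
    value = 0
    for ch in canonicalize(text):
        # str.index raises ValueError for symbols outside the alphabet (e.g. "U").
        value = value * 32 + ALPHABET.index(ch)
    return value


def encode(value: int, length: int) -> str:
    """Encode an integer as exactly `length` uppercase symbols."""
    chars = []
    for _ in range(length):
        value, remainder = divmod(value, 32)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


round_trip = encode(decode("01arz3ndektsv4rrffq69g5fav"), 26)
```

Under this reading, any accepted spelling of a ULID re-encodes to a single uppercase form.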

@nelsonjchen

I think the spec should explicitly note that.

RobThree commented May 27, 2019

> When decoding [...] i and l will be treated as 1 and o will be treated as 0

I'm also pretty sure very few implementations will respect this. And although the spec explicitly mentions Crockford's Base32 (which allows for hyphens (-) anywhere in the string), AFAIK most implementations don't allow for these. So either we're not (100%) using Crockford's Base32 but some 'derivative' OR these things should be called out more explicitly in the spec.

I have implemented both (accepting i, l, I, L, o and O, and allowing hyphens) in my .NET implementation.
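The difference between "full Crockford" parsing and a stricter derivative can be made concrete with a small sketch. This is not the .NET implementation mentioned above; `to_canonical` and its `lenient` flag are hypothetical names chosen for this example.

```python
ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"
ALIASES = str.maketrans("ILO", "110")


def to_canonical(text: str, *, lenient: bool = True) -> str:
    """Return the 26-character uppercase form of a ULID string.

    lenient=True  : ignore hyphens and accept I/L/O as 1/1/0 (Crockford decoding).
    lenient=False : case-insensitive only; hyphens and I/L/O are rejected.
    """
    text = text.upper()
    if lenient:
        text = text.replace("-", "").translate(ALIASES)
    if (
        len(text) != 26
        or text[0] not in "01234567"
        or any(ch not in ALPHABET for ch in text)
    ):
        raise ValueError(f"not a valid ULID: {text!r}")
    return text


hyphenated = to_canonical("01arz3nd-ektsv4rr-ffq69g5fav")
aliased = to_canonical("O1ARZ3NDEKTSV4RRFFQ69G5FAV")
```

With `lenient=False`, both inputs above raise `ValueError`, which is roughly how an implementation that skips these Crockford rules would behave.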

wenerme commented Oct 15, 2021

Case-insensitivity applies to the codec only, not to all the other places a ULID ends up (like a DB primary key or a Redis key). Please specify the case (or at least state which case is preferred), otherwise the DB may not find the "same" ULID.

BTW, I prefer lowercase: in web environments most things are case-insensitive, so lowercase makes more sense, like PostgreSQL's gen_random_uuid, which outputs lowercase.
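The lookup problem can be reproduced in a few lines. This sketch uses an in-memory SQLite database purely as a stand-in for any store that compares text keys byte for byte; the table `items` and the helper `find` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY, name TEXT)")

# One service stores the ULID in uppercase.
conn.execute(
    "INSERT INTO items VALUES (?, ?)",
    ("01ARZ3NDEKTSV4RRFFQ69G5FAV", "example"),
)


def find(ulid: str):
    """Look up a row by its ULID text key; returns None if no row matches."""
    return conn.execute(
        "SELECT name FROM items WHERE id = ?", (ulid,)
    ).fetchone()


# Another service holds the same ULID in lowercase.
miss = find("01arz3ndektsv4rrffq69g5fav")
# Normalizing to the stored case before the query finds the row again.
hit = find("01arz3ndektsv4rrffq69g5fav".upper())
```

Here `miss` is `None` while `hit` returns the row, because SQLite's default text comparison is case-sensitive.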

@RobThree

I don't see why the spec would have to define or enforce something as simple as upper/lowercase. If you have a specific use case where you require either one, then call .ToUpper() or strtolower() or whatever your language provides on the ULID before inserting it or searching for it. As you say, most use cases will be case-insensitive; for the cases where case matters, enforce it.

wenerme commented Oct 15, 2021

From https://datatracker.ietf.org/doc/html/rfc4122#section-3:

> The hexadecimal values "a" through "f" are output as lower case characters and are case insensitive on input.

UUID specified the output case here; the underlying codec is not ulid, the output is.

Without consistency on case, we cannot just call gen_now_uuid(); we would always have to use it like to_lower(gen_now_uuid()) or to_upper(gen_now_uuid()).

RobThree commented Oct 15, 2021

> UUID specified the output case here

What they do is up to them, isn't it?

> the underlying codec is not ulid, the output is.

I'm not sure I understand what you mean here. You mean the underlying encoding, I guess? GUIDs are case-sensitive in most languages too, AFAIK. I don't see why we would enforce either lower or upper case; where it matters, it's usually trivial to make the ULID upper- or lowercase. I can see that agreeing on a canonical notation would be beneficial, but the benefits are minor to next to none IMHO. So my reasoning is to leave it up to whoever uses it and their use case. There's no real technical reason to enforce either notation IMHO.

> Without consistency on case, we cannot just call gen_now_uuid(); we would always have to use it like to_lower(gen_now_uuid()) or to_upper(gen_now_uuid()).

If it really matters then why not create a wrapper/proxy/adapter/derived class that handles the upper- or lowercasing for your specific use case? Shouldn't be more than a few lines of code in most languages.
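Such a wrapper could look like the sketch below. `generate_ulid` is only a stand-in for whatever library function produces ULIDs (it builds a 48-bit millisecond timestamp plus 80 random bits and does not guarantee monotonic ordering), and `with_case` is a hypothetical helper name.

```python
import secrets
import time
from typing import Callable

ALPHABET = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def generate_ulid() -> str:
    """Stand-in generator: 48-bit ms timestamp followed by 80 random bits."""
    value = ((time.time_ns() // 1_000_000) << 80) | secrets.randbits(80)
    chars = []
    for _ in range(26):
        value, remainder = divmod(value, 32)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def with_case(generate: Callable[[], str], *, lower: bool) -> Callable[[], str]:
    """Wrap any ULID generator so its output always has one fixed case."""
    def wrapped() -> str:
        text = generate()
        return text.lower() if lower else text.upper()
    return wrapped


# The application picks its case once and uses new_id() everywhere.
new_id = with_case(generate_ulid, lower=True)
sample = new_id()
```

The case decision then lives in one place per application instead of at every call site.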

I know, you could argue that it costs extra CPU cycles to uppercase an entire lowercase string or vice versa, and that this is wasteful if you can just output the correct case directly. So let's say we choose lowercase as the 'canonical form': then, of the cases where casing does matter, 50% will still have to run it through uppercasing methods; and if we choose uppercase, the other 50% will have to do the same...
