Alternative proposal for Hashspace ID Values #143

Closed
kyzer-davis opened this issue Sep 6, 2023 · 38 comments
Labels
wontfix This will not be worked on

Comments

@kyzer-davis
Collaborator

kyzer-davis commented Sep 6, 2023

From Paul Wouters

Why are hashspace IDs chosen to look like random uuids? Eg why not encode
"SHA2_224" (hex 534841325F323234) as 53484132-5F32-3234-0000-000000000000
or 00000000-0000-0000-5348-41325F323234 (plus or minus variant/version bits)
so that it becomes far more clear this is not an ordinary random uuid?
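Paul's suggestion can be sketched in a few lines of Python. This is only an illustration of the encoding he describes: the helper name `label_to_uuid` is hypothetical, and the version/variant bits are deliberately left untouched ("plus or minus variant/version bits").

```python
import uuid


def label_to_uuid(label: str, align: str = "left") -> uuid.UUID:
    """Place the ASCII bytes of `label` into 128 bits, zero-padded.

    Hypothetical helper: version/variant bits are NOT set here.
    """
    raw = label.encode("ascii")
    if len(raw) > 16:
        raise ValueError("label does not fit into 128 bits")
    if align == "left":
        padded = raw.ljust(16, b"\x00")   # label first, zeros after
    else:
        padded = raw.rjust(16, b"\x00")   # zeros first, label last
    return uuid.UUID(bytes=padded)


left = label_to_uuid("SHA2_224", "left")
right = label_to_uuid("SHA2_224", "right")
print(left)   # 53484132-5f32-3234-0000-000000000000
print(right)  # 00000000-0000-0000-5348-41325f323234
```

Labels longer than 16 bytes cannot be encoded this way at all, and setting the version/variant bits would overwrite part of the label.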

Current, All random

SHA2_224     = "59031ca3-fbdb-47fb-9f6c-0f30e2e83145"
SHA2_256     = "3fb32780-953c-4464-9cfd-e85dbbe9843d"
SHA2_384     = "e6800581-f333-484b-8778-601ff2b58da8"
SHA2_512     = "0fde22f2-e7ba-4fd1-9753-9c2ea88fa3f9"
SHA2_512_224 = "003c2038-c4fe-4b95-a672-0c26c1b79542"
SHA2_512_256 = "9475ad00-3769-4c07-9642-5e7383732306"
SHA3_224     = "9768761f-ac5a-419e-a180-7ca239e8025a"
SHA3_256     = "2034d66b-4047-4553-8f80-70e593176877"
SHA3_384     = "872fb339-2636-4bdd-bda6-b6dc2a82b1b3"
SHA3_512     = "a4920a5d-a8a6-426c-8d14-a6cafbe64c7b"
SHAKE_128    = "7ea218f6-629a-425f-9f88-7439d63296bb"
SHAKE_256    = "2e7fc6a4-2919-4edc-b0ba-7d7062ce4f0a"
@kyzer-davis
Collaborator Author

kyzer-davis commented Sep 6, 2023

Alternative from me:
Pick a random UUID to start, increment the lowest bits by 1 for each.

The existing namespace IDs follow this logic with a specific UUIDv1:
6ba7b810-9dad-11d1-80b4-00c04fd430c8 is the first and 6ba7b811-9dad-11d1-80b4-00c04fd430c8 is the next; the last digit of the first group (6ba7b810, 6ba7b811, ...) increments for each namespace, covering 0-4 in that position.

Increment by UUID in least significant position.

  • The first part is a UUIDv4 that was generated randomly and then frozen to 59031ca3-fbdb-47fb-9f6c
SHA2_224     = "59031ca3-fbdb-47fb-9f6c-000000000000"
SHA2_256     = "59031ca3-fbdb-47fb-9f6c-000000000001"
SHA2_384     = "59031ca3-fbdb-47fb-9f6c-000000000002"
SHA2_512     = "59031ca3-fbdb-47fb-9f6c-000000000003"
SHA2_512_224 = "59031ca3-fbdb-47fb-9f6c-000000000004"
SHA2_512_256 = "59031ca3-fbdb-47fb-9f6c-000000000005"
SHA3_224     = "59031ca3-fbdb-47fb-9f6c-000000000006"
SHA3_256     = "59031ca3-fbdb-47fb-9f6c-000000000007"
SHA3_384     = "59031ca3-fbdb-47fb-9f6c-000000000008"
SHA3_512     = "59031ca3-fbdb-47fb-9f6c-000000000009"
SHAKE_128    = "59031ca3-fbdb-47fb-9f6c-00000000000A"
SHAKE_256    = "59031ca3-fbdb-47fb-9f6c-00000000000B"
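The list above can be reproduced mechanically. The following is a minimal sketch of the increment scheme, assuming the frozen prefix and the label order shown in this comment; the variable names are illustrative only.

```python
import uuid

# Frozen UUIDv4 prefix with the low 48 bits zeroed, as in the list above.
BASE = uuid.UUID("59031ca3-fbdb-47fb-9f6c-000000000000")

LABELS = [
    "SHA2_224", "SHA2_256", "SHA2_384", "SHA2_512",
    "SHA2_512_224", "SHA2_512_256",
    "SHA3_224", "SHA3_256", "SHA3_384", "SHA3_512",
    "SHAKE_128", "SHAKE_256",
]

# Add the list position to the 128-bit integer value of the base UUID.
hashspace_ids = {
    label: uuid.UUID(int=BASE.int + i) for i, label in enumerate(LABELS)
}

for label, value in hashspace_ids.items():
    print(f"{label:<12} = {value}")
```

Because only the low bits change, every generated value keeps the version and variant nibbles of the base UUID. Note that Python prints the hex digits in lowercase.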

@kyzer-davis kyzer-davis changed the title Alternative proposal for Hashspace IDs Alternative proposal for Hashspace ID Values Sep 6, 2023
@kyzer-davis
Collaborator Author

kyzer-davis commented Sep 6, 2023

Paul's proposal, TEXT to HEX, is tough because the current hashspace ID labels are at minimum 16 characters and at most 24 characters long after encoding in hex. (See the end.)
The Version/Variant bits also need to be set, which leaves no ideal position to slot the encoded labels into:

  • X, the start of the UUID 12 chars
  • Y, the end of the UUID 15 chars
  • Z, the middle of the UUID 3 chars
xxxxxxxx-xxxx-Mzzz-Nyyy-yyyyyyyyyyyy

Text to Hex

SHA2_224     = 534841325f323234
SHA2_256     = 534841325f323536
SHA2_384     = 534841325f333834
SHA2_512     = 534841325f353132
SHA2_512_224 = 534841325f3531325f323234
SHA2_512_256 = 534841325f3531325f323536
SHA3_224     = 534841335f323234
SHA3_256     = 534841335f323536
SHA3_384     = 534841335f333834
SHA3_512     = 534841335f353132
SHAKE_128    = 5348414b455f313238
SHAKE_256    = 5348414b455f323536

Edit: I could change the labels (e.g. remove the underscores), but that does not scale nicely unless they can all be 12-15 chars after the encoding; otherwise one must navigate around the Ver/Var hex.
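The length problem can be checked quickly. This sketch assumes the slot sizes given in the layout above (12 hex characters before the version nibble, 15 after the variant nibble); the constant names are made up for the example.

```python
LABELS = [
    "SHA2_224", "SHA2_256", "SHA2_384", "SHA2_512",
    "SHA2_512_224", "SHA2_512_256",
    "SHA3_224", "SHA3_256", "SHA3_384", "SHA3_512",
    "SHAKE_128", "SHAKE_256",
]

LEADING_SLOT = 12   # hex chars before the version nibble (X)
TRAILING_SLOT = 15  # hex chars after the variant nibble (Y)

# Each ASCII character becomes two hex characters.
hex_lengths = {label: len(label.encode("ascii").hex()) for label in LABELS}

# Does a label fit into the larger contiguous slot without crossing M or N?
fits = {label: n <= TRAILING_SLOT for label, n in hex_lengths.items()}

for label in LABELS:
    print(f"{label:<12} {hex_lengths[label]:>2} hex chars, fits: {fits[label]}")
```

With the current labels, none of the encoded names fits into a single contiguous slot, which is the scaling issue described above.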

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

For my online service I have done the following approach:

HashSpaceUuid<AlgoName> := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", AlgoName).

which results in the following UUIDs:

-- Algorithms from this draft (revision 00-11)
-- The payload of the UUIDv5 is the algorithm name from PHP hash_algos(), as well as "shake128" and "shake256"
SHA224                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha224")      = "59031ca3-fbdb-47fb-9f6c-0f30e2e83145".
SHA256                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha256")      = "3fb32780-953c-4464-9cfd-e85dbbe9843d".
SHA384                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha384")      = "e6800581-f333-484b-8778-601ff2b58da8".
SHA512                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512")      = "0fde22f2-e7ba-4fd1-9753-9c2ea88fa3f9".
SHA512/224             := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512/224")  = "003c2038-c4fe-4b95-a672-0c26c1b79542".
SHA512/256             := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha512/256")  = "9475ad00-3769-4c07-9642-5e7383732306".
SHA3/224               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-224")    = "9768761f-ac5a-419e-a180-7ca239e8025a".
SHA3/256               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-256")    = "2034d66b-4047-4553-8f80-70e593176877".
SHA3/384               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-384")    = "872fb339-2636-4bdd-bda6-b6dc2a82b1b3".
SHA3/512               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "sha3-512")    = "a4920a5d-a8a6-426c-8d14-a6cafbe64c7b".
SHAKE128               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "shake128")    = "7ea218f6-629a-425f-9f88-7439d63296bb".
SHAKE256               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "shake256")    = "2e7fc6a4-2919-4edc-b0ba-7d7062ce4f0a".

-- Other algorithms
-- Excluded are algorithms whose output is too short to fit into a UUIDv8
-- The payload of the UUIDv5 is the algorithm name from PHP hash_algos()
GOST                   := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost")        = "be782e40-b9e8-59c4-8500-31a6cfb91a75".
GOST-CryptoProParamSet := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost-crypto") = "9c1d4a70-75ec-5c6a-84e2-09b400fe8f21".
HAVAL-3-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,3")  = "176e81e1-9fc8-50f3-b569-08f264e5ae58".
HAVAL-3-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,3")  = "8d160752-d034-54e0-ac73-930ec60580c2".
HAVAL-3-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,3")  = "1f53cfc9-a36c-5a36-b27c-6dc88074ca38".
HAVAL-3-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,3")  = "56ef61fc-16de-55f4-bd3f-f44856d3d436".
HAVAL-3-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,3")  = "66111477-b9e1-54cc-a38f-b73b228964cc".
HAVAL-4-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,4")  = "55d554f5-7c2e-5e08-a8e3-cd6ecbac1e32".
HAVAL-4-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,4")  = "5d3d8b32-d57d-54b4-9a01-342e8fa9df5b".
HAVAL-4-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,4")  = "c7c66b4d-4299-5489-aa29-991b7bd4aa52".
HAVAL-4-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,4")  = "5c86a1f5-6b47-576e-a900-087897bf83a7".
HAVAL-4-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,4")  = "5851bbb5-56a3-55b3-8775-371739d251ca".
HAVAL-5-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval128,5")  = "9d40aac4-d8e5-5846-ad91-dc3429294d0d".
HAVAL-5-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval160,5")  = "764e5acb-88c4-5b24-b3bd-cca0941de88a".
HAVAL-5-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval192,5")  = "4455573f-5ff9-5cac-aadc-7ecf5c0c7ad1".
HAVAL-5-224            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval224,5")  = "0336bbe3-f703-5184-b52d-4e9d8163ddcd".
HAVAL-5-256            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "haval256,5")  = "5f4a8511-9e92-5d62-b94d-0b910a1e3d9a".
MD2                    := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "md2")         = "6ca7dd19-4755-5c6a-8b3f-3056ef6bebf6".
MD4                    := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "md4")         = "15329616-0af7-535a-b4b6-41c4eba21457".
Murmur3c               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "murmur3c")    = "b941a86c-9e70-5044-9496-da00eec9b934".
Murmur3f               := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "murmur3f")    = "4a2262be-0dec-587f-843a-eb50c707d779".
RIPEMD128              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd128")   = "efd0677a-e9f4-5337-8764-51be1c353d4a".
RIPEMD160              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd160")   = "b54b1a0a-ce07-5d4b-9d03-96d57da2bf29".
RIPEMD256              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd256")   = "e288aa2a-5260-5aaf-825a-f40ce0514d19".
RIPEMD320              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "ripemd320")   = "2919713b-ae42-58a3-916a-039989f07300".
SNEFRU                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "snefru")      = "d38f5891-c553-5d58-88de-199cbf48291e".
SNEFRU256              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "snefru256")   = "30b628e4-4587-5f06-ae1b-be9d0cba1187".
Tiger-3-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger128,3")  = "6f5ba86a-a362-50f1-bc6a-62787ee998b8".
Tiger-4-128            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger128,4")  = "13ff3c12-da5d-5437-89d9-a76e44abd0c8".
Tiger-3-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger160,3")  = "50d3d8af-6a6c-5ea3-bfad-450424668dee".
Tiger-4-160            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger160,4")  = "31d39089-28e0-584b-bf72-7cf33c2caeea".
Tiger-3-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger192,3")  = "63cfdad3-a720-55e2-83a4-b8c762ad4012".
Tiger-4-192            := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "tiger192,4")  = "5b07ff46-f679-5d3c-97e1-10b800be9246".
WHIRLPOOL              := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "whirlpool")   = "74fd261c-1f13-5015-81cc-fdc9a7354ae5".
XXH128                 := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "xxh128")      = "1a66c377-af3d-5f24-b9fb-d6c067d8b588".

What do you think about it?
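For readers who want to try this approach, here is a minimal sketch using Python's standard `uuid` module. The function name `hashspace_uuid` is hypothetical; the namespace UUID and the lowercase PHP-style names are taken from the comment above. The sketch does not assert any of the listed values, since those would need to be checked independently.

```python
import uuid

# Namespace UUID quoted in the comment above.
HASHSPACE_NS = uuid.UUID("1ee317e2-1853-64b2-8fe9-3c4a92df8582")


def hashspace_uuid(algo_name: str) -> uuid.UUID:
    """UUIDv5 (SHA-1 based) of the algorithm name within HASHSPACE_NS."""
    return uuid.uuid5(HASHSPACE_NS, algo_name)


for name in ("sha224", "sha3-256", "haval128,3"):
    print(f"{name:<12} -> {hashspace_uuid(name)}")
```

The result is deterministic, so anyone who knows the namespace UUID and the exact name string can regenerate the same hashspace ID.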

@fabiolimace

fabiolimace commented Sep 6, 2023

I know that, not being a hash function expert, I shouldn't even question the advantage and simplicity of incrementing a base UUID. However, in terms of avalanche effect, which is better: changing only 1 bit or ~~changing 128 bits~~ changing more bits?

I also know that it is not a requirement that an ID produced with a hashspace has a very high probability of not clashing with IDs produced with a different hashspace. SHA-x algorithms already guarantee that changing 1 bit in the input produces a drastically different output, so this probability must be extremely low to be taken into account. However, if it is possible to maximize this effect by changing as many input bits as possible, wouldn't that be more desirable?

--

EDIT: I crossed out the "changing 128 bits" phrase because changing 128 bits means inverting all bits, which is the result of an XOR operation. It seems more appropriate to change more than 1 bit, but not all.
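The avalanche point can be illustrated with a small experiment. This is a sketch under a loud assumption: the hash input is simply the 16 hashspace ID bytes followed by a name, which is not necessarily the exact construction in the draft. The two IDs differ in a single bit, as in the increment proposal.

```python
import hashlib
import uuid


def bit_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two hashspace IDs from the increment proposal; they differ in one bit.
id_a = uuid.UUID("59031ca3-fbdb-47fb-9f6c-000000000000")
id_b = uuid.UUID("59031ca3-fbdb-47fb-9f6c-000000000001")

name = b"example.com"  # arbitrary example input

# ASSUMPTION: input = hashspace ID bytes || name (illustrative only).
digest_a = hashlib.sha256(id_a.bytes + name).digest()
digest_b = hashlib.sha256(id_b.bytes + name).digest()

input_distance = bit_distance(id_a.bytes, id_b.bytes)
output_distance = bit_distance(digest_a, digest_b)

print("input bits changed: ", input_distance)
print("output bits changed:", output_distance, "of", len(digest_a) * 8)
```

For a well-behaved hash function, roughly half of the output bits are expected to flip regardless of how many input bits were changed.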

@fabiolimace

fabiolimace commented Sep 6, 2023

> For my online service I have done the following approach:
>
> HashSpaceUuid<AlgoName> := UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", AlgoName).

Sounds logical to me and easy to describe as you only need to define the namespace UUID that will be used with the UUIDv5 function to generate the hashspaces for each algorithm name.

However, it is necessary to use the "canonical name" of each algorithm, which implies text encoding (UTF-8, ASCII etc), case sensitivity (uppercase, lowercase), use (or not) of "non-word" characters (dash, space), etc.
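A short sketch of why the canonical name matters: each spelling of the same algorithm yields a different UUIDv5. The namespace UUID is the one quoted above; the list of spellings is just an example. Python's `uuid.uuid5` encodes a `str` name as UTF-8 before hashing.

```python
import uuid

NS = uuid.UUID("1ee317e2-1853-64b2-8fe9-3c4a92df8582")

# Four plausible spellings of the same algorithm name.
spellings = ["SHA256", "SHA-256", "sha256", "sha-256"]

ids = {s: uuid.uuid5(NS, s) for s in spellings}

for spelling, value in ids.items():
    print(f"{spelling:<8} -> {value}")

print("distinct IDs:", len(set(ids.values())))
```

Unless the exact byte sequence of the name is fixed by the specification, two implementations can easily end up with incompatible hashspace IDs.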

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

Yes, the canonical name is indeed my concern, too. (I wonder, is there a RFC that defines naming schemes for algorithms?)

Proposal 1

I have the following idea:

  • Appendix B could describe the UUIDv5 mechanism and list SHA2, SHA3, and SHAKE as examples (and hence define their names; you have to decide whether you prefer uppercase or lowercase, and dashes or hyphens).
  • The naming of other algorithms (like GOST or HAVAL) would be out of scope. However, people can follow our naming scheme as a guide. For example, if we define the name to be "SHA3", then it is likely that people will choose "HAVAL" instead of "Haval" or "haval". If we choose the name "sha3", then it is likely that people will choose "haval" instead of "HAVAL" or "Haval".
  • If you like, you could add a note that the UUIDv5 mechanism is OPTIONAL, which means people are not forced to use it. They can still use UUIDv1 or UUIDv4 for other algorithms if they prefer it that way. (This might make sense in case a hash algorithm is already defined by a UUID)

Here is a proposal (I have calculated the UUIDv5, but please double-check them):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based UUIDs.

   The following UUIDs were created by using a UUIDv5 with namespace ID
   "1ee317e2-1853-64b2-8fe9-3c4a92df8582" and the algorithm name in
   upper-case and with underscores as data.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace UUID.

   SHA2_224     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_224")     = "5385c476-6ffc-578a-908f-91b5cd2eac03"
   SHA2_256     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_256")     = "f660b1c5-f2c9-5f3a-981f-8652227fc329"
   SHA2_384     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_384")     = "43794fb1-7e34-558f-a8a5-5b4b8f8470d5"
   SHA2_512     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512")     = "250cb2ab-c480-5f24-83fd-16ea8b0b9e36"
   SHA2_512_224 = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512_224") = "70bade2b-c68a-5894-b31d-7c3581b6c647"
   SHA2_512_256 = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA2_512_256") = "a05efbcf-0a2a-5aab-9c62-8c94d05e0760"
   SHA3_224     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_224")     = "2862da96-f3c7-586a-8cc3-b1f424cdf040"
   SHA3_256     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_256")     = "72727812-3cea-56bd-a57f-ed3445acca4f"
   SHA3_384     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_384")     = "279978a1-86d1-56e6-bce2-019f5eaa3437"
   SHA3_512     = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA3_512")     = "33e6927a-d382-5dbd-b415-402610340bcd"
   SHAKE_128    = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHAKE_128")    = "8835c536-6ab4-55bc-be61-7029cdcbd1db"
   SHAKE_256    = UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHAKE_256")    = "311a1f8e-0a71-554a-bcdb-436c4e9f55e8"

@fabiolimace

fabiolimace commented Sep 6, 2023

> Here is a proposal (I have calculated the UUIDv5, but please double-check them):

I prefer to keep the previous defined UUIDv4-based hashspaces, but I think this UUIDv5 mechanism is a better way to define pseudo-random or "random-looking" hashspaces which can be easily reproduced to define new hashspaces for cryptographic hash functions that could not be included in the document.

I just don't know which are the current canonical names for the SHA-2 family. For example, Wikipedia and Java use SHA-256 (with a dash), but not SHA2_256 (with a 2 and an underline).

P.S.: can we use this document as a reference?: https://csrc.nist.gov/files/pubs/fips/180-4/upd1/final/docs/fips180-4-draft-aug2014.pdf

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

Proposal 2

There is another method which does not rely on canonical names (or even English language) at all.

A lot of hash algorithms are identified by OIDs. Some of them are located in this arc:
http://oid-info.com/get/2.16.840.1.101.3.4.2

We could use a UUIDv5 with namespace OID (6ba7b812-9dad-11d1-80b4-00c04fd430c8)
and the hash algorithm OID as payload.

Here is my proposal:

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based UUIDs.

   The following UUIDs were created by using a UUIDv5 with the OID namespace ID
   ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the OID identifying the
   hash algorithm.  This mechanism of generating a hashspace ID is OPTIONAL.
   Any UUID can be used as a hashspace UUID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")  = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")  = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")  = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")  = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")  = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")  = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")  = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")  = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")  = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10") = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11") = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12") = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

(Edit after my initial post: Changed the algorithm names as defined by FIPS180-4 and FIPS202)
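The OID-based derivation is easy to reproduce with Python's standard library, which ships the OID namespace as `uuid.NAMESPACE_OID`. The sketch below recomputes two entries and compares them with the values listed in the proposal; it reports whether they match rather than assuming that they do. The dictionary name `LISTED` is illustrative.

```python
import uuid

# (OID, value listed in the proposal above) for two of the entries.
LISTED = {
    "SHA-224": ("2.16.840.1.101.3.4.2.4",
                "e0f20710-25d9-54ab-8325-ccf2d456ad0b"),
    "SHA-256": ("2.16.840.1.101.3.4.2.1",
                "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"),
}

# UUIDv5 with the OID namespace and the dotted OID string as the name.
computed = {
    label: uuid.uuid5(uuid.NAMESPACE_OID, oid)
    for label, (oid, _) in LISTED.items()
}

for label, (oid, listed) in LISTED.items():
    matches = str(computed[label]) == listed
    print(f"{label:<8} {computed[label]}  matches listed value: {matches}")
```

The name is the dotted-decimal OID without a leading dot and without any "urn:oid:" prefix, which is an assumption carried over from the notation in the proposal.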

(Note to self:) Here is a list of Algorithms/OIDs I have found:

GOST = 1.2.643.2.2.30.0
GOST-CryptoProParamSet = 1.2.643.2.2.30.1
GOST3410-2012-256 = 1.2.643.7.1.1.3.2
GOST3410-2012-512 = 1.2.643.7.1.1.3.3
HAVAL-3-128 = 1.3.6.1.4.1.18105.2.1.1.1
HAVAL-3-160 = 1.3.6.1.4.1.18105.2.1.1.2
HAVAL-3-192 = 1.3.6.1.4.1.18105.2.1.1.3
HAVAL-3-224 = 1.3.6.1.4.1.18105.2.1.1.4
HAVAL-3-256 = 1.3.6.1.4.1.18105.2.1.1.5
HAVAL-4-128 = 1.3.6.1.4.1.18105.2.1.1.6
HAVAL-4-160 = 1.3.6.1.4.1.18105.2.1.1.7
HAVAL-4-192 = 1.3.6.1.4.1.18105.2.1.1.8
HAVAL-4-224 = 1.3.6.1.4.1.18105.2.1.1.9
HAVAL-4-256 = 1.3.6.1.4.1.18105.2.1.1.10
HAVAL-5-128 = 1.3.6.1.4.1.18105.2.1.1.11
HAVAL-5-160 = 1.3.6.1.4.1.18105.2.1.1.12
HAVAL-5-192 = 1.3.6.1.4.1.18105.2.1.1.13
HAVAL-5-224 = 1.3.6.1.4.1.18105.2.1.1.14
HAVAL-5-256 = 1.3.6.1.4.1.18105.2.1.1.15
ISO/IEC 10118-2 "Hash Function 1" = 1.0.10118.2.0.1
ISO/IEC 10118-2 "Hash Function 2" = 1.0.10118.2.0.2
ISO/IEC 10118-2 "Hash Function 3" = 1.0.10118.2.0.3
ISO/IEC 10118-2 "Hash Function 4" = 1.0.10118.2.0.4
MD2 = 1.2.840.113549.2.2
MD4 = 1.2.840.113549.2.4
MD5  = 1.2.840.113549.2.5  (use UUIDv3)
MURMUR3C = ???
MURMUR3F = ???
Modular Arithmetic Secure Hash 1 (MASH-1) algorithm = 1.0.10118.4.0.65
Modular Arithmetic Secure Hash 2 (MASH-2) algorithm = 1.0.10118.4.0.66
RIPEMD128 = 1.3.36.3.2.2 or 1.0.10118.3.0.50
RIPEMD160 = 1.3.36.3.2.1 or 1.0.10118.3.0.49
RIPEMD256 = 1.3.36.3.2.3
RIPEMD320 = ???
SHA-224 = 2.16.840.1.101.3.4.2.4
SHA-256 = 2.16.840.1.101.3.4.2.1
SHA-384 = 2.16.840.1.101.3.4.2.2
SHA-512 = 2.16.840.1.101.3.4.2.3
SHA-512/224 = 2.16.840.1.101.3.4.2.5
SHA-512/256 = 2.16.840.1.101.3.4.2.6
SHA0 = 1.3.14.3.2.18
SHA1 = 1.3.14.3.2.26  (use UUIDv5)
SHA3-224 = 2.16.840.1.101.3.4.2.7
SHA3-256 = 2.16.840.1.101.3.4.2.8
SHA3-384 = 2.16.840.1.101.3.4.2.9
SHA3-512 = 2.16.840.1.101.3.4.2.10
SHAKE-128 = 2.16.840.1.101.3.4.2.11
SHAKE-256 = 2.16.840.1.101.3.4.2.12
SM3 (ISO/IEC 10118-3) = 1.0.10118.3.0.65
SNEFRU = ???
SNEFRU256 = ???
Streebog 256 = 1.0.10118.3.0.60
Streebog 512 = 1.0.10118.3.0.59
TIGER-3-128 = ???
TIGER-3-160 = ???
TIGER-3-192 = ??? (1.3.6.1.4.1.11591.12.2 specifies 192 bits, but rounds are unknown)
TIGER-4-128 = ???
TIGER-4-160 = ???
TIGER-4-192 = ???
WHIRLPOOL = 1.0.10118.3.0.55
XXH128 = ???

@fabiolimace

fabiolimace commented Sep 6, 2023

@danielmarschall

I think it's way better.

Why not use the URN notation in lowercase mode only, e.g. urn:oid:2.16.840.1.101.3.4.2.4?

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

@fabiolimace Are you confused about my notation UUIDv5(OID 2.16.840.1.101.3.4.2.6)? With this notation I meant taking the OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and using the OID "2.16.840.1.101.3.4.2.6" as payload.

@fabiolimace

fabiolimace commented Sep 6, 2023

Sorry I meant the string "urn:oid:2.16.840.1.101.3.4.2.4" as the name input for the UUIDv5 function.

This:

   SHA2_224     = UUIDv5(urn:oid:2.16.840.1.101.3.4.2.4)  = 85eed581-369c-5931-a7fe-0d8158e83871

Not this:

   SHA2_224     = UUIDv5(OID 2.16.840.1.101.3.4.2.4)  = e0f20710-25d9-54ab-8325-ccf2d456ad0b

But I'm not sure if it's important.
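Whether it is important can be judged from the following sketch, which shows that the choice of name string (and of namespace) changes the result. The comment above does not state which namespace was used with the URN form, so both combinations below are assumptions, and the sketch does not claim that either reproduces the value quoted above.

```python
import uuid

OID = "2.16.840.1.101.3.4.2.4"   # SHA-224
URN = "urn:oid:" + OID

candidates = {
    # Plain dotted OID in the OID namespace.
    "NS_OID + plain OID": uuid.uuid5(uuid.NAMESPACE_OID, OID),
    # ASSUMPTION: URN string hashed in the OID namespace.
    "NS_OID + URN": uuid.uuid5(uuid.NAMESPACE_OID, URN),
    # ASSUMPTION: URN string hashed in the URL namespace.
    "NS_URL + URN": uuid.uuid5(uuid.NAMESPACE_URL, URN),
}

for description, value in candidates.items():
    print(f"{description:<20} -> {value}")
```

All three inputs produce different UUIDs, so a specification would have to pin down both the namespace and the exact name format.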

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

UUIDv5 requires two parameters: Namespace ID and Payload. So the notation UUIDv5(urn:oid:2.16.840.1.101.3.4.2.4) is incomplete.

The full notation of UUIDv5(OID 2.16.840.1.101.3.4.2.4) would be UUIDv5("6ba7b812-9dad-11d1-80b4-00c04fd430c8", "2.16.840.1.101.3.4.2.4"), but then the line becomes too long.

Edit: I have changed my proposal to UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4"). This should make it clearer what I mean.

@fabiolimace

fabiolimace commented Sep 6, 2023

> The full notation of UUIDv5(OID 2.16.840.1.101.3.4.2.4) would be UUIDv5("6ba7b812-9dad-11d1-80b4-00c04fd430c8", "2.16.840.1.101.3.4.2.4"), but then the line becomes too long.

Yes, I noticed that the namespace parameter was implicit.

@fabiolimace

fabiolimace commented Sep 6, 2023

> The following UUIDs were created by using a UUIDv5 with the OID namespace ID
> ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the OID identifying the
> hash algorithm.
> SHA2_224 = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4") = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"

I completely agree now. ❤️

--

P.S.
It also breaks my implementation of UUIDv8 using SHA-256. 😢

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

Yes, it breaks some implementations, including mine. But after all, Internet Drafts are supposed to change. :-)
I really hope my proposal gets accepted, because I think it is perfect to use OIDs. They are unambiguous and so everyone can define their own hashspace IDs.

@kyzer-davis
Collaborator Author

@danielmarschall, Your proposal of UUIDv5(NS_OID, "Hash_OID_NO_LEADING_DOT") works for the NIST ones that have OIDs.
Are we to assume every cryptographic hashing function will have an OID?

Checking against your list earlier:

MD2               = "1.2.840.113549.2.2"
MD4               = "1.2.840.113549.2.4"
MD5               = "1.2.840.113549.2.5" (But probably just use v3)
TIGER/192         = "1.3.6.1.4.1.11591.12.2"
RIPEMD160         = "1.0.10118.3.0.49" or "1.3.36.3.2.1"
RIPEMD128         = "1.0.10118.3.0.50" or "1.3.36.3.2.2"
RIPEMD256         = "1.3.36.3.2.3"
WHIRLPOOL         = "1.0.10118.3.0.55"?
GOST3410-2012-256 = "1.2.643.7.1.1.3.2"
GOST3410-2012-512 = "1.2.643.7.1.1.3.3"
HAVAL-3-128       = "1.3.6.1.4.1.18105.2.1.1.1"
HAVAL-3-160       = "1.3.6.1.4.1.18105.2.1.1.2"
HAVAL-3-192       = "1.3.6.1.4.1.18105.2.1.1.3"
HAVAL-3-224       = "1.3.6.1.4.1.18105.2.1.1.4"
HAVAL-3-256       = "1.3.6.1.4.1.18105.2.1.1.5"
HAVAL-4-128       = "1.3.6.1.4.1.18105.2.1.1.6"
HAVAL-4-160       = "1.3.6.1.4.1.18105.2.1.1.7"
HAVAL-4-192       = "1.3.6.1.4.1.18105.2.1.1.8"
HAVAL-4-224       = "1.3.6.1.4.1.18105.2.1.1.9"
HAVAL-4-256       = "1.3.6.1.4.1.18105.2.1.1.10"
HAVAL-5-128       = "1.3.6.1.4.1.18105.2.1.1.11"
HAVAL-5-160       = "1.3.6.1.4.1.18105.2.1.1.12"
HAVAL-5-192       = "1.3.6.1.4.1.18105.2.1.1.13"
HAVAL-5-224       = "1.3.6.1.4.1.18105.2.1.1.14"
HAVAL-5-256       = "1.3.6.1.4.1.18105.2.1.1.15"
SNEFRU            = ???

RIPEMD may have two and SNEFRU does not have one that I can find?
How would we handle something like that?


@fabiolimace

> I just don't know which are the current canonical names for the SHA-2 family. For example, Wikipedia and Java use SHA-256 (with a dash), but not SHA2_256 (with a 2 and an underline).

I can change them to the names in the NIST documents easily enough. I added the "2" so they were somewhat in line with SHA3 from a formatting perspective, and I swapped the "/" char for an underscore. Underscores were used because they matched the underscores used in the namespace items.

But I am not partial. I can change them to the following as defined by FIPS180-4 and FIPS202

SHA-224      = "...whatever we choose..."
SHA-256      = "...whatever we choose..."
SHA-384      = "...whatever we choose..."
SHA-512      = "...whatever we choose..."
SHA-512/224  = "...whatever we choose..."
SHA-512/256  = "...whatever we choose..."
SHA3-224     = "...whatever we choose..."
SHA3-256     = "...whatever we choose..."
SHA3-384     = "...whatever we choose..."
SHA3-512     = "...whatever we choose..."
SHAKE128     = "...whatever we choose..."
SHAKE256     = "...whatever we choose..."

@danielmarschall
Contributor

danielmarschall commented Sep 6, 2023

The new names according to FIPS180-4 and FIPS202 look good to me.
I think "SHA-512/256" reads much better than "SHA2_512_256".


About algorithms with multiple OIDs, I would try to find the "official" ones.
But I know that task can be hard and the result might be ambiguous.

About algorithms without a known OID, I think this could be out of scope.
Since the mechanism is optional, people would need to define their own UUIDs,
e.g. UUIDv1 or UUIDv4 for these hash algorithms.

I am not sure if my proposal 1 (that used algorithm names in a custom namespace,
e.g. UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "SHA-512/224") ) would be better.
@kyzer-davis What is your opinion on my proposal 1?
Many people and even implementations like PHP use hash algorithm names like "GOST",
but there are so many GOST algorithms, that we do not know what is implemented,
so the risk is that someone does this: UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST")
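The "optional mechanism with a fallback" idea from this comment could look roughly like the sketch below. Everything here is hypothetical: the table `KNOWN_OIDS`, the function `hashspace_id`, and the policy of requiring an explicitly supplied UUID for algorithms without a known OID are illustrations, not part of any draft text.

```python
import uuid
from typing import Optional

# Hypothetical table; OIDs taken from the lists earlier in this thread.
KNOWN_OIDS = {
    "SHA-256": "2.16.840.1.101.3.4.2.1",
    "SHA-512/224": "2.16.840.1.101.3.4.2.5",
}


def hashspace_id(algorithm: str,
                 fallback: Optional[uuid.UUID] = None) -> uuid.UUID:
    """Derive the hashspace ID from the OID, or use a supplied UUID."""
    oid = KNOWN_OIDS.get(algorithm)
    if oid is not None:
        return uuid.uuid5(uuid.NAMESPACE_OID, oid)
    if fallback is None:
        # No OID known and no self-defined UUID given: refuse to guess.
        raise KeyError(f"no OID known for {algorithm!r}; supply a UUID")
    return fallback


# An algorithm without a known OID needs a self-defined UUID (e.g. UUIDv4).
custom = uuid.uuid4()
print(hashspace_id("SHA-256"))
print(hashspace_id("SNEFRU", fallback=custom))
```

Refusing to derive an ID from a free-form name such as "GOST" avoids the ambiguity described above, at the cost of requiring someone to publish the self-defined UUID.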

@fabiolimace

fabiolimace commented Sep 7, 2023

I found a list of OIDs here (extracted from github.com/openssl):
https://version.cs.vt.edu/techstaff/linux-audit/-/blob/861ecd5cf6005a1bb1a16d840713f56c425f1039/ansible2/ansible/module_utils/crypto.py#L405

I also found OIDs for some GOST (GOvernment STandard, RU) digests in this document: [RFC-9215: Using GOST R 34.10-2012 and GOST R 34.11-2012 Algorithms with the Internet X.509 Public Key Infrastructure](https://www.rfc-editor.org/rfc/rfc9215.html#section-3).

The ASN.1 OID used to identify the GOST R 34.11-2012 hash function with a 256-bit hash code is:

id-tc26-gost3411-12-256 OBJECT IDENTIFIER ::= { iso(1) member-body(2) ru(643) rosstandart(7) tc26(1) algorithms(1) digest(2) gost3411-12-256(2)}

The ASN.1 OID used to identify the GOST R 34.11-2012 hash function with a 512-bit hash code is:

id-tc26-gost3411-12-512 OBJECT IDENTIFIER ::= { iso(1) member-body(2) ru(643) rosstandart(7) tc26(1) algorithms(1) digest(2) gost3411-12-512(3)}

Links:

EDIT: GOST OIDs are already in kyzer's list.

@danielmarschall
Contributor

danielmarschall commented Sep 7, 2023

It might be a bit off-topic, but I am very confused about the implementations in PHP.

- There is the algorithm "gost" which is "GOST R 34.11-94" (OID = 1.2.643.2.2.9); I have verified it with test vectors. Wikipedia and other sources imply that the hash algorithm name "GOST" describes the algorithm "GOST R 34.11-94". So, would UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST") be unambiguous to the majority of users?

- And there is the "gost-crypto" hash algorithm, where I do not understand what it does and how it can be identified (neither by algorithm name nor by OID). My software solution gave it the hashspace ID UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "gost-crypto"), but I guess this is nonsense and ambiguous.

So, algorithm names are tricky...

(Edit: Found the solution)

  • PHP algorithm name "gost" is "GOST R 34.11-94" with "Test parameter set" (OID 1.2.643.2.2.30.0), defined in RFC 4357, section 11.2.

    Hashspace ID with my proposal 1: UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST") ?????
    Hashspace ID with my proposal 2: UUIDv5(NS_OID, "1.2.643.2.2.30.0")

  • PHP algorithm name "gost-crypto" is also "GOST R 34.11-94", but with "CryptoPro parameter set" (OID 1.2.643.2.2.30.1), defined in RFC 4357, section 11.2.

    Hashspace ID with my proposal 1: UUIDv5("1ee317e2-1853-64b2-8fe9-3c4a92df8582", "GOST-CryptoProParamSet") ?????
    Hashspace ID with my proposal 2: UUIDv5(NS_OID, "1.2.643.2.2.30.1")

  • However, the algorithm "GOST R 34.11-94" seems to be identified by the OIDs "1.2.643.2.2.9" and "1.2.643.2.2.20" too. This is confusing because these OIDs do not define which parameter set is used. So, with my proposal 2, the OID must not only identify the algorithm itself but also its parameters, variants, etc.

@fabiolimace

From reading Wikipedia, md_gost94 appears to be obsolete like MD5 or SHA-1, and md_gost12_256/md_gost12_512 the counterparts of SHA-2 or SHA-3.

GOST = government standard R 34.11-94
Streebog = government standard R 34.11-2012

Is that correct?

@kyzer-davis
Collaborator Author

kyzer-davis commented Sep 7, 2023

@danielmarschall, personally I like proposal 2 with the OIDs because they are "well formatted"; that is, they are just a set of "numbers and dots".

Proposal 1 has the challenge that SHA256, sha256, sha-256, SHA-256 all produce different hashes and proposal 2 removes that.

Proposal 2 has the challenges I listed but the points may be moot as many of the items we are discussing are algos nobody will likely ever use...

kydavis@ubuntu-web-server:~$ echo -n "SHA256" | sha256sum
b3abe5d8c69b38733ad57ea75e83bcae42bbbbac75e3a5445862ed2f8a2cd677  -

kydavis@ubuntu-web-server:~$ echo -n "SHA-256" | sha256sum
bbd07c4fc02c99b97124febf42c7b63b5011c0df28d409fbb486b5a9d2e615ea  -

kydavis@ubuntu-web-server:~$ echo -n "sha256" | sha256sum
5d5b09f6dcb2d53a5fffc60c4ac0d55fabdf556069d6631545f42aa6e3500f2e  -

kydavis@ubuntu-web-server:~$ echo -n "sha-256" | sha256sum
3128f8ac2988e171a53782b144b98a5c2ee723489c8b220cece002916fbc71e2  -

@danielmarschall
Contributor

danielmarschall commented Sep 7, 2023

> points may be moot as many of the items we are discussing are algos nobody will likely ever use...

@kyzer-davis Are you referring to the small discussion(s) about HAVAL and GOST and my long OID list above? Don't worry, they were just part of my personal evaluation process to find out whether proposal 1 or proposal 2 is better with regard to the non-NIST algorithms. You mentioned missing and ambiguous OIDs, so I was wondering if this is a serious issue or not. I don't propose that GOST, HAVAL, Tiger, ... get added to the RFC.

To avoid confusion in this large thread, here is my proposed text (Proposal 2):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")  = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")  = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")  = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")  = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")  = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")  = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")  = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")  = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")  = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10") = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11") = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12") = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Since the lines are too long for RFC, here is a variant with line breaks:
(Unfortunately, the line breaks are very ugly)

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.4")
                = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.1")
                = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.2")
                = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512      = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.3")
                = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.5")
                = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256  = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.6")
                = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.7")
                = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.8")
                = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.9")
                = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.10")
                = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.11")
                = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256     = UUIDv5(NS_OID, "2.16.840.1.101.3.4.2.12")
                = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Another format that does not use the UUIDv5() pseudo-method and NS_OID constant (line breaks are still ugly):

Appendix B.  Some Hashspace IDs

   This appendix lists some hashspace IDs for use with UUIDv8 name-based
   UUIDs.

   The following UUIDs were created by using a UUIDv5 with the
   OID namespace ID ("6ba7b812-9dad-11d1-80b4-00c04fd430c8") and the
   OID identifying the hash algorithm.  This mechanism of generating a
   hashspace ID is OPTIONAL.  Any UUID can be used as a hashspace ID.

   SHA-224 (2.16.840.1.101.3.4.2.4)
                = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256 (2.16.840.1.101.3.4.2.1)
                = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384 (2.16.840.1.101.3.4.2.2)
                = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512 (2.16.840.1.101.3.4.2.3)
                = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224 (2.16.840.1.101.3.4.2.5)
                = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256 (2.16.840.1.101.3.4.2.6)
                = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224 (2.16.840.1.101.3.4.2.7)
                = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256 (2.16.840.1.101.3.4.2.8)
                = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384 (2.16.840.1.101.3.4.2.9)
                = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512 (2.16.840.1.101.3.4.2.10)
                = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128 (2.16.840.1.101.3.4.2.11)
                = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256 (2.16.840.1.101.3.4.2.12)
                = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

@kyzer-davis If you agree, can you please add one of these to a pull request? Thank you very much!

@fabiolimace

fabiolimace commented Sep 8, 2023

Another way to demonstrate the hashspaces is to show a predefined list followed by the pseudocode used to generate the list. Presented that way, I find it (almost) impossible to have doubts about how the list was generated. Separating the list from the steps to generate it takes less "cognitive effort", in my opinion.

Predefined list of hashspaces:

   SHA-224     = "e0f20710-25d9-54ab-8325-ccf2d456ad0b"
   SHA-256     = "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6"
   SHA-384     = "728cae51-ddd4-5401-b52c-5775cd8913d8"
   SHA-512     = "a5dd0c9d-04b0-5f15-9332-c0ff97053dda"
   SHA-512/224 = "3b72d097-ee4d-54e0-a3ab-cbc072d6d159"
   SHA-512/256 = "af4f3a3f-167e-53b5-8817-66d5201cb05a"
   SHA3-224    = "4b5be759-1e10-56ff-9187-1a34f5773d9f"
   SHA3-256    = "2eea60a9-2f2b-5d32-93b2-dc9688165cfa"
   SHA3-384    = "f49d66f8-0755-5500-b747-ef5f7a18e9bc"
   SHA3-512    = "e3936772-aa32-5463-870b-0a05a6bb2dd3"
   SHAKE128    = "bd2da541-a66d-52a7-8cf7-3773e204c114"
   SHAKE256    = "a1a5a6ea-b4fd-5f93-91e3-a25de6121457"

Pseudocode to derive hashspaces from message digest OIDs:

   # array of message digest OIDs
   OID["SHA-224"]     = "2.16.840.1.101.3.4.2.4"
   OID["SHA-256"]     = "2.16.840.1.101.3.4.2.1"
   OID["SHA-384"]     = "2.16.840.1.101.3.4.2.2"
   OID["SHA-512"]     = "2.16.840.1.101.3.4.2.3"
   OID["SHA-512/224"] = "2.16.840.1.101.3.4.2.5"
   OID["SHA-512/256"] = "2.16.840.1.101.3.4.2.6"
   OID["SHA3-224"]    = "2.16.840.1.101.3.4.2.7"
   OID["SHA3-256"]    = "2.16.840.1.101.3.4.2.8"
   OID["SHA3-384"]    = "2.16.840.1.101.3.4.2.9"
   OID["SHA3-512"]    = "2.16.840.1.101.3.4.2.10"
   OID["SHAKE128"]    = "2.16.840.1.101.3.4.2.11"
   OID["SHAKE256"]    = "2.16.840.1.101.3.4.2.12"
   
   # function to derive hashspaces from message digest OIDs
   function hashspace(algo) { return UUIDv5(NAMESPACE_OID, OID[algo]) }

Note: the pseudocode is based on AWK syntax. Implementers can simply copy the pseudocode and change it to suit the target language syntax. If I were the implementer, I would appreciate it.
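
As one possible translation, here is a minimal Python sketch of the pseudocode above. It assumes the name passed to UUIDv5 is the dotted OID string exactly as written in the list, and it relies on `uuid.NAMESPACE_OID` from the standard library being the OID namespace ID quoted earlier. The `LISTED` table and the `mismatches()` helper are not part of the proposal; they are only there so that readers can compare the computed values against the predefined list themselves.

```python
import uuid

# Message digest OIDs, copied from the pseudocode above.
OID = {
    "SHA-224": "2.16.840.1.101.3.4.2.4",
    "SHA-256": "2.16.840.1.101.3.4.2.1",
    "SHA-384": "2.16.840.1.101.3.4.2.2",
    "SHA-512": "2.16.840.1.101.3.4.2.3",
    "SHA-512/224": "2.16.840.1.101.3.4.2.5",
    "SHA-512/256": "2.16.840.1.101.3.4.2.6",
    "SHA3-224": "2.16.840.1.101.3.4.2.7",
    "SHA3-256": "2.16.840.1.101.3.4.2.8",
    "SHA3-384": "2.16.840.1.101.3.4.2.9",
    "SHA3-512": "2.16.840.1.101.3.4.2.10",
    "SHAKE128": "2.16.840.1.101.3.4.2.11",
    "SHAKE256": "2.16.840.1.101.3.4.2.12",
}

# Predefined list from this thread, kept only for comparison.
LISTED = {
    "SHA-224": "e0f20710-25d9-54ab-8325-ccf2d456ad0b",
    "SHA-256": "29f7f1c6-6258-5be3-b9f0-2adc24eb96c6",
    "SHA-384": "728cae51-ddd4-5401-b52c-5775cd8913d8",
    "SHA-512": "a5dd0c9d-04b0-5f15-9332-c0ff97053dda",
    "SHA-512/224": "3b72d097-ee4d-54e0-a3ab-cbc072d6d159",
    "SHA-512/256": "af4f3a3f-167e-53b5-8817-66d5201cb05a",
    "SHA3-224": "4b5be759-1e10-56ff-9187-1a34f5773d9f",
    "SHA3-256": "2eea60a9-2f2b-5d32-93b2-dc9688165cfa",
    "SHA3-384": "f49d66f8-0755-5500-b747-ef5f7a18e9bc",
    "SHA3-512": "e3936772-aa32-5463-870b-0a05a6bb2dd3",
    "SHAKE128": "bd2da541-a66d-52a7-8cf7-3773e204c114",
    "SHAKE256": "a1a5a6ea-b4fd-5f93-91e3-a25de6121457",
}


def hashspace(algo):
    """Derive a hashspace ID from the algorithm's message digest OID."""
    return uuid.uuid5(uuid.NAMESPACE_OID, OID[algo])


def mismatches():
    """Return the algorithm names whose computed ID differs from LISTED."""
    return [algo for algo in OID if str(hashspace(algo)) != LISTED[algo]]


if __name__ == "__main__":
    for algo in OID:
        print(f"{algo:12} = {hashspace(algo)}")
    print("differences from the predefined list:", mismatches())
```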

@kyzer-davis
Collaborator Author

Got it @fabiolimace and @danielmarschall.
I will go with the OID method for obtaining the Hashspace ID, aka "Proposal 2".

Don't worry about formatting, I will get that figured out. Could end up as some ascii, some table, etc.

PR will likely happen next week.

Finally, depending on how the discussion over in #144 shakes out, one could possibly add a new hashspace ID to the IANA registry without needing a full-on spec to do so. It just needs to be defined the way we describe in this doc and then added to that table.
Name, OID, ID, Doc for Hash Algo would be the columns in my mind.
This would help when some next-gen crypto comes out and somebody wants to define the hashspace for it. Much easier via an email template than a full-on RFC. (Same goes for some legacy algo if somebody wanted to use it: update the registry and now anybody can leverage it.)
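
To make the four columns concrete, a registry row could be modelled roughly like this. It is purely a sketch: the class, field, and function names are hypothetical, no such registry format has been agreed, and the ID column is filled in with the OID + UUIDv5 derivation discussed above.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class HashspaceEntry:
    """One hypothetical registry row: Name, OID, ID, Doc for Hash Algo."""

    name: str
    oid: str
    hashspace_id: uuid.UUID
    document: str


def make_entry(name, oid, document):
    """Build a row, deriving the ID column from the OID via UUIDv5."""
    return HashspaceEntry(
        name=name,
        oid=oid,
        hashspace_id=uuid.uuid5(uuid.NAMESPACE_OID, oid),
        document=document,
    )


# Example row; SHA-256 is specified in FIPS 180-4.
entry = make_entry("SHA-256", "2.16.840.1.101.3.4.2.1", "FIPS 180-4")

if __name__ == "__main__":
    print(entry)
```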

@LiosK
Contributor

LiosK commented Sep 14, 2023

I'm somewhat concerned about this OID + UUIDv5 approach because:

  • With this approach, any future hash function will automatically get its hash space ID when it gets an OID. Technically, such a new hash space ID will have to get ratified through a formal process, but this approach does create expectations that such a new ID will be ratified. In this way future spec authors might lose control over the hash space ID definitions.
  • This approach creates a use case of v5 from now on, whereas v5 and SHA1 are in no way recommended for future uses.

I think v4 based IDs are simpler and safer.

@fabiolimace

fabiolimace commented Sep 15, 2023

I think v4 based IDs are simpler and safer.

This is a question I've been trying to answer myself for a while: how good is a 160- or 256-bit truncated hash compared to a 128-bit random number?

I've tried a few times, but I always fail miserably because I don't have the statistical knowledge to give an answer.

I always end up, in my naive attempts, trusting in the principle of Saint Thomas: seeing is believing. However, I can't see any difference with my eyes.

However, I believe that hash-based UUIDs are still very useful for associating a binary or textual value with a relatively short ID in a permanent and (almost) univocal way.

EDIT: I crossed out the text because I realized I misunderstood the sentence. Please ignore. (but the question still remains)
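
For what it is worth, one common way to frame the remaining question is the birthday approximation. The sketch below assumes an idealized model in which the 122 free bits of a UUID (128 minus 4 version bits and 2 variant bits) are uniformly random, whether they come from a random number generator or from a truncated hash. Under that assumption the two give the same estimate; the model says nothing about weaknesses of a specific hash function such as SHA-1. The function name and parameters are made up for the example.

```python
import math


def collision_probability(n: int, free_bits: int = 122) -> float:
    """Birthday approximation: 1 - exp(-n(n-1) / 2^(free_bits + 1)).

    n is the number of IDs generated; free_bits is the number of
    uniformly random bits in each ID (an assumption of this model).
    """
    exponent = -(n * (n - 1)) / 2 ** (free_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    for n in (10**6, 10**9, 10**12, 2**61):
        print(f"n = {n:.3e}  p = {collision_probability(n):.3e}")
```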

@LiosK
Contributor

LiosK commented Sep 15, 2023

The crossed out question is a different topic but I think is a very good question, which neither do I have an answer to. Please take a look at several posts relating to FIPS stuff following my original post about the hash space approach.

@danielmarschall
Contributor

danielmarschall commented Sep 15, 2023

I'm somewhat concerned about this OID + UUIDv5 approach because:

  • With this approach, any future hash function will automatically get its hash space ID when it gets an OID. Technically, such a new hash space ID will have to get ratified through a formal process, but this approach does create expectations that such a new ID will be ratified. In this way future spec authors might lose control over the hash space ID definitions.
  • This approach creates a use case of v5 from now on, whereas v5 and SHA1 are in no way recommended for future uses.

I think v4 based IDs are simpler and safer.

I can understand your concern that UUIDv5 is using a deprecated hash algorithm.

But I think it is very useful that the hash space is not just random, but connected with the algorithm.
Let's imagine the case when someone wants to use a Non-NIST hash algorithm, e.g. HAVAL-3-128.

Imagine IANA does not have that hash listed. By using random UUIDv4, someone needs to choose/generate a hash space ID, and IANA needs to add it. Maybe IANA even insists that an RFC is written that defines the hash space ID. But do you think every developer who wants to use a non-NIST hash will contact IANA or even write an RFC?

A lot of algorithms have OIDs. This is important for some technologies like X.509. Having the hash space (optionally) derived from the OID means that two developers can hash using HAVAL-3-128, and since HAVAL-3-128 has OID "1.3.6.1.4.1.18105.2.1.1.1", both implementations output the same UUID. Without writing an RFC, without contacting IANA.
(And yes, I know that some hash algorithms have an ambiguous OID or no OID at all. But my research showed that the majority of algorithms have exactly one OID.)
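
To illustrate the "both implementations output the same UUID" point, here is a small Python sketch with two independent derivations: one uses the standard library's `uuid.uuid5`, the other builds the UUIDv5 by hand from SHA-1. It assumes the name is the dotted OID string encoded as UTF-8; `uuid5_by_hand` and the variable names are made up for the example.

```python
import hashlib
import uuid

# OID for HAVAL-3-128 as quoted in the comment above.
HAVAL_3_128_OID = "1.3.6.1.4.1.18105.2.1.1.1"


def uuid5_by_hand(namespace: uuid.UUID, name: str) -> uuid.UUID:
    """Build a UUIDv5 manually: SHA-1 over namespace bytes + name."""
    digest = hashlib.sha1(namespace.bytes + name.encode("utf-8")).digest()
    raw = bytearray(digest[:16])      # keep the first 128 bits
    raw[6] = (raw[6] & 0x0F) | 0x50   # set the version field to 5
    raw[8] = (raw[8] & 0x3F) | 0x80   # set the variant bits to 10
    return uuid.UUID(bytes=bytes(raw))


# "Developer A" uses the library, "developer B" the manual construction.
from_library = uuid.uuid5(uuid.NAMESPACE_OID, HAVAL_3_128_OID)
from_hand = uuid5_by_hand(uuid.NAMESPACE_OID, HAVAL_3_128_OID)

if __name__ == "__main__":
    print(from_library)
    print(from_hand)
    print("same hashspace ID:", from_library == from_hand)
```

Both developers only need to agree on the OID string and the namespace ID.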

@LiosK
Contributor

LiosK commented Sep 15, 2023

In my opinion, such a new hash function must be registered through a formal process (by a separate RFC or IANA registry, I don't know) unless the new UUID RFC specifies the algorithm to derive a hash space ID in a normative manner. Otherwise, the de facto hash space ID crafted by future implementers will be left in an uncertain state. So far, the name-based v8 is just an example of v8 implementation techniques, and we will have no time to put this in the normative section. With this in mind, we shouldn't create any expectations related to the future hash space IDs. UUIDv4-based hash space IDs do require a formal process to ratify new hash functions, and accordingly give full control over the UUID specification to the future spec authors to recommend one hashing algorithm and discourage another.

@cbandy

cbandy commented Sep 16, 2023

My naive and scattered thoughts:

  • v4 random can potentially collide with any other v4 value generated in the past or future.
  • Implicitly, the scheme will work because people use the values in a specific context: RFC 4122 hashspaces.
  • However, the list of meaningful values is trapped in a single RFC. After it is published, how can people in the future, outside of the RFC, know they aren't generating a v4 value that someone else has already attributed meaning to? The topic of a registry has already been covered.
  • v5 values have a context that lives beyond/outside a v4 list in the RFC. The downside is that uniqueness must come from elsewhere: OIDs in the above examples.
    • That uniqueness is, again, a list somewhere. The topic of a registry has already been covered.
    • The deterministic nature of v5 also means that two different inputs could (theoretically) collide in the future without recourse.

  • I keep using the word "future." Things from the past are facts and do not change. The risk of collision is in the future.
  • We have UUID schemes that include time. In those, values from "the past" are separate and discernable from values of "the future."

Would it be better to identify these hashspaces using v7?

@LiosK
Contributor

LiosK commented Sep 16, 2023

v7 works, but so does v4, I think.

EDIT: v4 is better because of its randomness. Hash space IDs are passed to another hash function so should be very different from each other.

@chorman0773

I saw the OID proposal above, and I'd like to second that.

This would also allow 3rd parties to define new Hashspace UUIDs, if they have an OID they can control (and hand out sub-OIDs from), which they can get from IANA. It would also allow users of v9 to substitute a v5 UUID in out-of-band transport with simply the OID for the algorithm itself. The main risk of doing this without a centralized registry, in my opinion, is that one algorithm might end up with 2 different OIDs in different contexts. If this route is taken, there should be guidance to avoid anti-collisions.

@LiosK
Contributor

LiosK commented Sep 18, 2023

Since it's v8, any third party can generate a UUID and use it in their application as a hashspace ID for any hash function. Perhaps, we should expand the following statement in Section 6.5 to clarify that any user-defined UUID value may be used as a hashspace ID within an application context. This point is not sufficiently clear in the current draft, despite #132.

These MAY leverage newer hashing protocols such as ... or even protocols that have not been defined yet.

Within an implementation, the implementer can do whatever they want, but a standard has to focus on the coordination of such implementations. What if SHA-4 has a parameter that is not expressed in the OID? What if a widespread implementation applies SHA-5 differently than expected? These circumstances may put future interoperability at risk under the OID-based hashspace scheme. Plus, observing such a situation, future RFC authors might even avoid ratifying an OID-based hashspace ID, because officially specifying the meaning of a widely used hashspace ID can break the existing implementations.

@kyzer-davis
Collaborator Author

Getting caught up on these longer threads after being out unexpectedly.
I see there have been lots of discussions...
As such, I will hold off on changes for the moment. We can aim for this as a topic on the interim call the chairs have requested.

@mcr
Contributor

mcr commented Sep 22, 2023

If you do not have a datatracker.ietf.org login, please get one, as you'll need it for the virtual interim. That's the only barrier to participation. Slides uploaded to datatracker would also be appreciated.

fabiolimace added a commit to f4b6a3/uuid-creator that referenced this issue Sep 23, 2023
The `GUID.v8()` method is no longer supported due to recent sudden
changes in the UUIDv8 discussions. It will be removed when the new RFC
is finally published.

See the latest discussions about UUIDv8:

* ietf-wg-uuidrev/rfc4122bis#143
* ietf-wg-uuidrev/rfc4122bis#144
* ietf-wg-uuidrev/rfc4122bis#147
@kyzer-davis
Collaborator Author

@mcr "Slides uploaded to datatracker would also be appreciated." yeah, I will get some to the chairs this week!

@mcr
Contributor

mcr commented Sep 26, 2023

@mcr "Slides uploaded to datatracker would also be appreciated." yeah, I will get some to the chairs this week!

if @danielmarschall or others still feel they want v9, then they also need to explain the proposal in a slide or two.

@kyzer-davis
Collaborator Author

As per #147, Hashspace IDs were removed, which would resolve this discuss item if merged.
Please see the proposal in #147 and leave feedback on that topic there.

@kyzer-davis kyzer-davis added the wontfix This will not be worked on label Oct 5, 2023