add sha2-256-trunc2 multihash: 0x1012 #170
table.csv
Outdated
@@ -103,6 +103,7 @@ http, multiaddr, 0x01e0, | |||
json, serialization, 0x0200, JSON (UTF-8-encoded) | |||
messagepack, serialization, 0x0201, MessagePack | |||
libp2p-peer-record, libp2p, 0x0301, libp2p peer record type | |||
sha2-256-trunc2, multihash, 0x1012, SHA2-256 truncated with trailing 2 bits replaced with zeros - used for proving trees as in Filecoin |
'trailing 2 bits' is slightly ambiguous. This specifically means the two most significant bits of the last byte. Or, said differently, the two most significant bits when the value is treated as a little-endian unsigned integer. The latter also explains the semantic requirement: we are 'truncating' in order to ensure conversion to a field element results in a value representable in 254 bits.
I'm expanding only in case that helps choose clearer descriptive text.
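To make the 254-bit point concrete, here is a minimal Python sketch. The helper name `sha2_256_trunc2` is hypothetical (it is not an existing API), and the snippet assumes the digest is read as a little-endian unsigned integer as described above:

```python
import hashlib


def sha2_256_trunc2(data: bytes) -> bytes:
    """Hypothetical helper: SHA2-256 with the two most significant
    bits of the last byte replaced with zeros."""
    digest = bytearray(hashlib.sha256(data).digest())
    digest[-1] &= 0x3F  # keep the low 6 bits of the last byte
    return bytes(digest)


full = hashlib.sha256(b"example").digest()
masked = sha2_256_trunc2(b"example")

# The output is still 32 bytes; only the last byte can differ.
assert len(masked) == 32
assert masked[:-1] == full[:-1]

# Read as a little-endian unsigned integer, the last byte is the most
# significant one, so the value now fits in 254 bits.
as_le_int = int.from_bytes(masked, "little")
assert as_le_int < 2**254
assert as_le_int == int.from_bytes(full, "little") & (2**254 - 1)
```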
ah yes, that's pretty critical to get right, endianness mistakes are cause for much pain, will amend
Replaced with this, although it's rather wordy, but it's precise:
SHA2-256 truncated with 2 most significant bits of the last byte (interpreted as a little-endian uint) replaced with zeros - used for proving trees as in Filecoin
A bit late to the party, but the current iteration still strikes me as confusing. How about:
0x1012, SHA2-256 with bit 249 and 250 (7th and 8th bit counted from the end) both forced to 0. Used for proving trees as in Filecoin
( if I got this wrong - even more indication that the current description is confusing )
Bit counting and "from the end" might be a bit ambiguous here. I find it way too tempting to think of bits laid out sequentially like that, but it's not usually how we deal with them when they're in byte form. What you want an implementation to do is `& 0b00111111` (`& 0x3f`). Does that look like "7th and 8th bit counted from the end"? It looks more like the "2 most significant bits" to me, but it also needs the LE qualifier because in BE they might programmatically look more like "from the end".
🤷
All this to say, I'm not sure your wording clears it up any more than the current wording. Maybe it needs `& 0x3f` in there for good measure.
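For illustration, a small Python sketch of why the LE qualifier matters. The function name `cleared_bits` is made up for this example. It applies the same `& 0x3f` to the last byte and reports which integer bit positions that mask can affect under each byte order:

```python
def cleared_bits(digest: bytes, byteorder: str) -> int:
    """Hypothetical helper: return an integer whose set bits mark the
    positions changed by masking the last byte with 0x3f, when the
    digest is read as an unsigned integer in the given byte order."""
    masked = digest[:-1] + bytes([digest[-1] & 0x3F])
    return int.from_bytes(digest, byteorder) ^ int.from_bytes(masked, byteorder)


# Use an all-ones 32-byte value so both masked bits are visibly cleared.
all_ones = bytes([0xFF] * 32)

# Little-endian: the last byte is the most significant, so the mask
# clears integer bits 254 and 255 (the "2 most significant bits").
assert cleared_bits(all_ones, "little") == 0b11 << 254

# Big-endian: the last byte is the least significant, so the same mask
# clears integer bits 6 and 7, which reads more like "from the end".
assert cleared_bits(all_ones, "big") == 0b11 << 6
```

The byte-level operation is identical in both cases; only the integer interpretation of which bits were cleared changes.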
I went down the bit-counting path because the discussion of BE/LE confuses things for me, since these are concepts strictly for ordering within integer types. That is, endianness is well defined/understood for 16/32/64 bit integers. A 256+ bit hash is almost always* a bytestream, so speaking of endianness is.. odd. I suppose just saying:
0x1012, SHA2-256 with the last byte having two of its bits cleared via `& 0b0011_1111`. Used for proving trees as in Filecoin
*when you do base-256 conversions e.g. to btc58 then you are working with an abstract uint256, but that doesn't apply here
Much better now! No further questions ;)
Anyone want to weigh in on the name? One alternative might be to flip it around to be
I'm not familiar with crypto lingo, so "trunc" might be the spot-on word. Though when I think of truncation, I think of something being cut off, hence shortened. The hash is, but the value stored is not. Would perhaps something like
I don't want to bikeshed this too much, so feel free to ignore :)
Yeah, I'm not totally comfortable with the word "truncated" either, as it doesn't seem to quite describe what's going on. In terms of practical usage I think it is truncated; they're only using the 254 bits, I think. But in terms of what this multihash would be describing, it's the truncated + 2 zeros.
have gone with |
Thanks to @ribasushi for pinging on the clarity of the note, which was:
Thanks to some input from @davidad I've gone with:
Any objections to that? |
SHA2-256 with the trailing 2 bits zeroed out. Primary current use is Filecoin. Ref: #161
* sha2-256-trunc254-padded
* poseidon-bls12_381-a2-fc1

Ref: multiformats/multicodec#161
Ref: multiformats/multicodec#171
Ref: multiformats/multicodec#170
SHA2-256 with the trailing 2 bits zeroed out. Primary current use is Filecoin.
It's only "truncated" in the sense that the output is truncated, but there are the same number of output bits.
`0x1012` was chosen to mirror the `0x12` sha2-256 entry and it's close to some other, less common, hash functions.
Current implementations:
Ref: #161