-
Notifications
You must be signed in to change notification settings - Fork 171
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenSSL.memcmp?
for constant time/timing safe comparison
#269
Conversation
long len1 = RSTRING_LEN(str1); | ||
long len2 = RSTRING_LEN(str2); | ||
|
||
if(len1 != len2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a crytographic function that's trying to be timing invariant, is this acceptable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to do something to ensure the strings are of equal length, otherwise the refute OpenSSL.memcmp?("aaa", "aaaa")
test fails since CRYPTO_memcmp compares up to len1
bytes.
Rack, Active Support up to 5.1 (rails/rails#24510), .NET Core and the snippets in https://codahale.com/a-lesson-in-timing-attacks/ do the same thing and exit early. I think it's not terrible, but calling it out in the docs would be the least we can do to make it better.
Active Support 5.2+ replaced the default behaviour by always hashing with SHA-256 first and raising on length inequality in a separate method. I can add something similar if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@picatz do you have any input on this?
Here is an example which shows that such an implementation leaks the string length:
#!/usr/bin/env ruby
require 'benchmark'
def compare(a, b)
return false unless a.bytesize == b.bytesize
l = a.unpack "C#{a.bytesize}"
res = 0
b.each_byte { |byte| res |= byte ^ l.shift }
res == 0
end
def find_length_of_secret(secret)
samples = 100.times.collect do |i|
test = "\0" * i
[i, Benchmark.realtime do
1000.times do
compare(test, secret)
end
end]
end
best_guess = samples.sort_by{|sample| sample.last}.last
return best_guess.first
end
pp find_length_of_secret("foobar" * 2)
The example you give from ActiveSupport throws an exception when strings are not the same length. Provided we clearly mark the arguments (user input) and (secret), we can run the algorithm up to the length of secret in every case, that will minimise leaking any details about the length of the secret string... the entire runtime of the algorithm might be used to deduce the string length. For example:
compare(user_input, "twenty characters!!!") -> 20ns
compare(user_input, secret) -> 40ns
Maybe user can figure out secret is 40 characters long.
But, given how much noise and other overheads we expect, maybe it's not a concern. Even though I can think of several ways to make this harder, I'm not sure it's feasible to fix - e.g. adding a small random amount of noise to the function timing (sleep rand
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe throwing an exception if the string lengths are different isn't such a stupid idea?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After thinking about this a bit more, I think it's the most safe option to pick yeah. The OpenSSL variant might be harder to attack than the Ruby one due to the lower speed variance as seen in the benchmark. Or the length of the secret might not be much of a secret in itself (e.g. 32 bytes/SHA-256). But then again you might never know how people use this API and how that can be abused 🤷♂
I'll work on this change and some additional documentation tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe do CRYPTO_memcmp('a', 'b', 1)
(or another example that always returns false) if the lenghts are not the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In ruby psuedo code you could do something like this:
def ossl_crypto_memcmp(str1, str2)
len1 = str1.length
len2 = str2.length
# always generate incorrect answers even when str1 and str2 are the same length so it doesn't effect the time.
incorrect1 = 'a' * len1
incorrect2 = 'b' * len1
if len1 != len2
CRYPTO_memcmp(incorrect1 , incorrect2, len1)
else
CRYPTO_memcmp(str1, str2, len1)
end
end
But reading the documentation for CRYPTO_memcmp it can still leak the length of the other string. Using this would require users to always make sure the strings to compare are always the same length.
The CRYPTO_memcmp function compares the len bytes pointed to by a and b for equality. It takes an amount of time dependent on len, but independent of the contents of the memory regions pointed to by a and b.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a good discussion about the keeping the length secret https://security.stackexchange.com/a/92238
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks very good.
return Qfalse; | ||
|
||
switch (CRYPTO_memcmp(p1, p2, len1)) { | ||
case 0: return Qtrue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to prevent false positives in "aaa"
vs. "aaaa"
-like comparisons, You can return (len1==len2) ? Qtrue : Qfalse
here instead of just Qtrue
. That way the if
statement above can be eliminated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean using minimum length for comparison and then doing final length check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. And I now think I was wrong.
Basically, when you do a timing-safe comparison, one side is "good" and the other side is "bad", like SHA-2 versus user inputs. It seems C APIs for timing-safe comparison are expecting the length argument to be equal to that of "good" ones, which is normally known before actual comparison.
For us, because this API is going to take only 2 arguments, I think we have to assume either side being "good" and take its length, like OpenSSL.memcmp?(good, bad)
or vice versa. This is fragile. I can think of several ways to reroute:
- Change the API so that we can take third argument:
OpenSSL.memcmp?(good, bad, length=good.length)
. This way it is up to the user to properly pass expected length. Not friendly, though. - If we are going with the current API: allocate buffers internally, align length of both sides, fill them with the arguments + zero padded, and do the actual comparison. This might introduce another problem (attacker can feed 1TB junk password).
- Like @bdewater wrote other implementations just early return so it might be okay as-is.
- Not sure about raising exceptions, at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the risk of knowing the secret length?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- An attacker can find users with the shortest/weakest passwords making them an easier attack target.
- An attacker doesn't have to try secrets with a smaller secret length. This is can be around a 1-2% advantage for random characters (1/number of characters).
See also: https://security.stackexchange.com/a/92238
A better question might be: what's a valid case for comparing different length strings? Shouldn't this method always compare hashes anyway?
Changed to raise exception on length mismatch and added documentation. |
I'm not sure why the AppVeyor build is cancelled, other than that I believe this is ready. Let me know what you think :) |
Co-authored-by: arrtchiu <arrtchiu@gmail.com>
Okay, so the only thing I'm wondering about is the naming right for Ruby.
I guess it's reflecting the underlying OpenSSL method. But maybe something more "obvious" could be better. Like "OpenSSL.secure_compare(s1, s2)" I do like Thoughts? |
@ioquatix there's been a healthy discussion over at https://bugs.ruby-lang.org/issues/10098 but no clear consensus emerged for a name. The Rails-centric |
I am going to merge this as is, but I will probably rename the function before the next release. |
I just need some time to review the existing notes.. and I don't want this PR to get forgotten. |
Thanks for merging! |
I am reviewing this code.
Isn't it possible for Thoughts? |
Just FYI, I want to avoid making recommendations in the documentation that might be wrong or give the wrong idea about how to do something securely. |
I made a minor change to clarify the risk of hashing the secret. |
From that commit:
I strongly agree with wanting to not make wrong recommendations 🙂 but this feels wrong to me since no options are given to address this issue mentioned, and could make people decide not to hashing which I believe to be a worse outcome if you compare the two. To my understanding SHA-256 works on 64 byte message blocks, anything that fits within that block size always takes the same amount of time to compute. So at best you'd be able to guess the multiples of 64 bytes the original input was, which is useless info in a world where strong secrets fit into that size. Compare that to not using SHA-256 on user input at all, where you would be able to guess the secret's length exactly. Benchmarking supports this: require 'benchmark/ips'
require 'openssl'
STRING1 = 'A' * 16
STRING2 = 'A' * 17
STRING3 = 'A' * 32
def compare_with_hash(a, b)
hashed_a = OpenSSL::Digest::SHA256.digest(a)
hashed_b = OpenSSL::Digest::SHA256.digest(b)
begin
OpenSSL.memcmp?(hashed_a, hashed_b) && a == b
rescue ArgumentError
false
end
end
def compare_direct(a, b)
begin
OpenSSL.memcmp?(a, b)
rescue ArgumentError
false
end
end
Benchmark.ips do |x|
x.report('readme example, same string lengths') { compare_with_hash(STRING1, STRING1) }
x.report('readme example, small difference length') { compare_with_hash(STRING1, STRING2) }
x.report('readme example, double length') { compare_with_hash(STRING1, STRING3) }
x.report('direct comparison, same string lengths') { compare_direct(STRING1, STRING1) }
x.report('direct comparison, small difference length') { compare_direct(STRING1, STRING2) }
x.report('direct comparison, double length') { compare_direct(STRING1, STRING3) }
x.compare!
end Results:
I also found https://paragonie.com/blog/2015/11/preventing-timing-attacks-on-string-comparison-with-double-hmac-strategy where @paragonie-scott writes the following about timing SHA-256:
|
Fixes https://bugs.ruby-lang.org/issues/10098 - at the 2019-02-07 developer meeting it was punted back to OpenSSL to implement. I went with @shyouhei's suggestion to mirror
String#casecmp?
naming.One way or the other, I think having a function like this built in to the standard library instead of several application implementations is a good thing. E.g. .NET Core has it too.
Some benchmarking on my Macbook Pro 15" 2018 shows this is also faster than Ruby implementations: