x/crypto/pbkdf2: Optimize PBKDF2 #19941
Comments
In general yes. The problem is that I don't know whether it's worth implementing HMAC again within the pbkdf2 package just for performance reasons, and I guess it's not. If performance is an issue for you, you could use a faster hash function, for example switch from SHA-2 to BLAKE2 (gives a 2-3x speed-up on x86/amd64). Further, PBKDF2 is vulnerable to GPU or FPGA/ASIC password cracking attacks, so it's recommended to use a memory-hard function like scrypt, or maybe you can wait for argon2. See #19896.
@bradfitz I think this can be closed (unless there's a good argument against closing?!)
@aead, sounds good. Will do. Thanks for the reply.
I don't agree with you closing this issue. The issue is still not fixed, and just because the solution might be a bit ugly doesn't mean that the issue no longer exists.
@Commenter123 Every change has both benefits and drawbacks. You are focused on the benefits. @aead is saying that the drawbacks outweigh the benefits. This is not something we are going to put our limited resources into. If this is important to you, I encourage you to write the change yourself. Then we can evaluate it with more information.
Change https://golang.org/cl/27458 mentions this issue: |
Speed up repeated HMAC operations with the same key by not recomputing the first block of the inner and outer hashes in Reset and Sum, saving two block computations each time. This is a significant win for applications which hash many small messages with the same key. In x/crypto/pbkdf2 for example, this optimization cuts the number of block computations in half, speeding it up by 25%-40% depending on the hash function.

The hash function needs to implement binary.Marshaler and binary.Unmarshaler for this optimization to work, so that we can save and restore its internal state. All hash functions in the standard library are marshalable (CL 66710) but if the hash isn't marshalable, we fall back on the old behaviour.

Marshaling the hashes does add a couple unavoidable new allocations, but this only has to be done once, so the cost is amortized over repeated uses. To minimize impact to applications which don't (or can't) reuse hmac objects, marshaling is performed in Reset (rather than in New), since calling Reset seems like a good indication that the caller intends to reuse the hmac object later.

I had to add a boolean field to the hmac state to remember if we've marshaled the hashes or not. This is paid for by removing the size and blocksize fields, which were basically unused except for some initialization work in New, and to fulfill the Size and Blocksize methods. Size and Blocksize can just be forwarded to the underlying hash, so there doesn't really seem to be any reason to waste space caching their values.
crypto/hmac benchmarks:

name                    old time/op    new time/op    delta
HMAC_Reset/SHA1/1K-2      4.06µs ± 0%    3.77µs ± 0%   -7.29%  (p=0.000 n=8+10)
HMAC_Reset/SHA1/32-2      1.08µs ± 0%    0.78µs ± 1%  -27.67%  (p=0.000 n=10+10)
HMAC_Reset/SHA256/1K-2    10.3µs ± 0%     9.4µs ± 0%   -9.03%  (p=0.000 n=10+10)
HMAC_Reset/SHA256/32-2    2.32µs ± 0%    1.42µs ± 0%  -38.87%  (p=0.000 n=10+10)
HMAC_Reset/SHA512/1K-2    8.22µs ± 0%    7.04µs ± 0%  -14.32%  (p=0.000 n=9+9)
HMAC_Reset/SHA512/32-2    3.08µs ± 0%    1.89µs ± 0%  -38.54%  (p=0.000 n=10+9)
HMAC_New/SHA1/1K-2        4.86µs ± 1%    4.93µs ± 1%   +1.30%  (p=0.000 n=10+9)
HMAC_New/SHA1/32-2        1.91µs ± 1%    1.95µs ± 1%   +1.84%  (p=0.000 n=10+9)
HMAC_New/SHA256/1K-2      11.2µs ± 1%    11.2µs ± 0%     ~     (p=1.000 n=9+10)
HMAC_New/SHA256/32-2      3.22µs ± 2%    3.19µs ± 2%   -1.07%  (p=0.018 n=9+10)
HMAC_New/SHA512/1K-2      9.54µs ± 0%    9.66µs ± 1%   +1.31%  (p=0.000 n=9+10)
HMAC_New/SHA512/32-2      4.37µs ± 1%    4.46µs ± 1%   +1.97%  (p=0.000 n=10+9)

name                    old speed      new speed      delta
HMAC_Reset/SHA1/1K-2     252MB/s ± 0%   272MB/s ± 0%   +7.86%  (p=0.000 n=8+10)
HMAC_Reset/SHA1/32-2    29.7MB/s ± 0%  41.1MB/s ± 1%  +38.26%  (p=0.000 n=10+10)
HMAC_Reset/SHA256/1K-2  99.1MB/s ± 0% 108.9MB/s ± 0%   +9.93%  (p=0.000 n=10+10)
HMAC_Reset/SHA256/32-2  13.8MB/s ± 0%  22.6MB/s ± 0%  +63.57%  (p=0.000 n=10+10)
HMAC_Reset/SHA512/1K-2   125MB/s ± 0%   145MB/s ± 0%  +16.71%  (p=0.000 n=9+9)
HMAC_Reset/SHA512/32-2  10.4MB/s ± 0%  16.9MB/s ± 0%  +62.69%  (p=0.000 n=10+9)
HMAC_New/SHA1/1K-2       211MB/s ± 1%   208MB/s ± 1%   -1.29%  (p=0.000 n=10+9)
HMAC_New/SHA1/32-2      16.7MB/s ± 1%  16.4MB/s ± 1%   -1.81%  (p=0.000 n=10+9)
HMAC_New/SHA256/1K-2    91.3MB/s ± 1%  91.5MB/s ± 0%     ~     (p=0.950 n=9+10)
HMAC_New/SHA256/32-2    9.94MB/s ± 2% 10.04MB/s ± 2%   +1.09%  (p=0.021 n=9+10)
HMAC_New/SHA512/1K-2     107MB/s ± 0%   106MB/s ± 1%   -1.29%  (p=0.000 n=9+10)
HMAC_New/SHA512/32-2    7.32MB/s ± 1%  7.18MB/s ± 1%   -1.89%  (p=0.000 n=10+9)

name                    old alloc/op   new alloc/op   delta
HMAC_Reset/SHA1/1K-2      0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA1/32-2      0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA256/1K-2    0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA256/32-2    0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA512/1K-2    0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA512/32-2    0.00B ±NaN%    0.00B ±NaN%     ~     (all samples are equal)
HMAC_New/SHA1/1K-2         448B ± 0%      448B ± 0%      ~     (all samples are equal)
HMAC_New/SHA1/32-2         448B ± 0%      448B ± 0%      ~     (all samples are equal)
HMAC_New/SHA256/1K-2       480B ± 0%      480B ± 0%      ~     (all samples are equal)
HMAC_New/SHA256/32-2       480B ± 0%      480B ± 0%      ~     (all samples are equal)
HMAC_New/SHA512/1K-2       800B ± 0%      800B ± 0%      ~     (all samples are equal)
HMAC_New/SHA512/32-2       800B ± 0%      800B ± 0%      ~     (all samples are equal)

name                    old allocs/op  new allocs/op  delta
HMAC_Reset/SHA1/1K-2       0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA1/32-2       0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA256/1K-2     0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA256/32-2     0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA512/1K-2     0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_Reset/SHA512/32-2     0.00 ±NaN%     0.00 ±NaN%     ~     (all samples are equal)
HMAC_New/SHA1/1K-2         5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)
HMAC_New/SHA1/32-2         5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)
HMAC_New/SHA256/1K-2       5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)
HMAC_New/SHA256/32-2       5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)
HMAC_New/SHA512/1K-2       5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)
HMAC_New/SHA512/32-2       5.00 ± 0%      5.00 ± 0%      ~     (all samples are equal)

x/crypto/pbkdf2 benchmarks:

name          old time/op    new time/op    delta
HMACSHA1-2      4.63ms ± 0%    3.40ms ± 0%  -26.58%  (p=0.000 n=10+9)
HMACSHA256-2    9.75ms ± 0%    5.98ms ± 0%  -38.62%  (p=0.000 n=9+10)

name          old alloc/op   new alloc/op   delta
HMACSHA1-2        516B ± 0%      708B ± 0%  +37.21%  (p=0.000 n=10+10)
HMACSHA256-2      549B ± 0%      772B ± 0%  +40.62%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
HMACSHA1-2        8.00 ± 0%     10.00 ± 0%  +25.00%  (p=0.000 n=10+10)
HMACSHA256-2      8.00 ± 0%     10.00 ± 0%  +25.00%  (p=0.000 n=10+10)

Fixes #19941

Change-Id: I7077a6f875be68d3da05f7b3664e18514861886f
Reviewed-on: https://go-review.googlesource.com/c/go/+/27458
Run-TryBot: Emmanuel Odeke <emm.odeke@gmail.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Filippo Valsorda <filippo@golang.org>
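For readers who want to see the idea behind the change above in isolation, here is a minimal sketch of caching the padded-key hash states by marshaling them. It is not the code from the CL: the names cachedHMAC, marshalableHash, newCachedHMAC and matchesStdlib are hypothetical and exist only for illustration. It assumes the hash implements encoding.BinaryMarshaler and encoding.BinaryUnmarshaler, which is where those interfaces live in the standard library, and it checks its result against crypto/hmac.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding"
	"errors"
	"fmt"
	"hash"
)

// marshalableHash is a hypothetical helper interface for this sketch:
// a hash whose internal state can be saved and restored.
type marshalableHash interface {
	hash.Hash
	encoding.BinaryMarshaler
	encoding.BinaryUnmarshaler
}

// cachedHMAC (hypothetical, not the crypto/hmac type) stores the hash states
// reached after absorbing the key XOR ipad and key XOR opad blocks.
type cachedHMAC struct {
	inner, outer           marshalableHash
	innerState, outerState []byte
}

func newCachedHMAC(newHash func() hash.Hash, key []byte) (*cachedHMAC, error) {
	inner, ok1 := newHash().(marshalableHash)
	outer, ok2 := newHash().(marshalableHash)
	if !ok1 || !ok2 {
		return nil, errors.New("hash state is not marshalable")
	}
	blockSize := inner.BlockSize()
	if len(key) > blockSize {
		// Keys longer than one block are hashed first, as HMAC requires.
		outer.Write(key)
		key = outer.Sum(nil)
		outer.Reset()
	}
	ipad := make([]byte, blockSize)
	opad := make([]byte, blockSize)
	copy(ipad, key)
	copy(opad, key)
	for i := range ipad {
		ipad[i] ^= 0x36
		opad[i] ^= 0x5c
	}
	// Absorb the padded key once and remember the resulting states.
	inner.Write(ipad)
	outer.Write(opad)
	innerState, err := inner.MarshalBinary()
	if err != nil {
		return nil, err
	}
	outerState, err := outer.MarshalBinary()
	if err != nil {
		return nil, err
	}
	return &cachedHMAC{inner, outer, innerState, outerState}, nil
}

// mac restores the saved states instead of re-hashing the padded key blocks.
func (c *cachedHMAC) mac(msg []byte) ([]byte, error) {
	if err := c.inner.UnmarshalBinary(c.innerState); err != nil {
		return nil, err
	}
	c.inner.Write(msg)
	digest := c.inner.Sum(nil)
	if err := c.outer.UnmarshalBinary(c.outerState); err != nil {
		return nil, err
	}
	c.outer.Write(digest)
	return c.outer.Sum(nil), nil
}

// matchesStdlib reports whether the sketch agrees with crypto/hmac (SHA-256).
func matchesStdlib(key, msg []byte) bool {
	c, err := newCachedHMAC(sha256.New, key)
	if err != nil {
		return false
	}
	got, err := c.mac(msg)
	if err != nil {
		return false
	}
	ref := hmac.New(sha256.New, key)
	ref.Write(msg)
	return hmac.Equal(got, ref.Sum(nil))
}

func main() {
	fmt.Println(matchesStdlib([]byte("secret key"), []byte("a small message")))
}
```

Callers of crypto/hmac do not need anything like this themselves; per the commit message, reusing one hmac object and calling Reset between messages is what triggers the optimization.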
The speaker in this YouTube video mentions that Go does not compute PBKDF2 efficiently: https://www.youtube.com/watch?v=k_szwKBuNBw&t=418
The relevant part runs from 6:58 to 9:00.
I don't understand the internals, but please check whether this is still unoptimized, i.e. whether the cost for i iterations could be reduced from 4i to 2+2i.
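To make the 4i versus 2+2i figures concrete, here is a rough sketch, not the x/crypto/pbkdf2 source. It assumes the counts refer to hash block computations per output block, with every HMAC input short enough to fit in one hash block. A naive HMAC call then costs 4 block computations (padded key plus message for the inner hash, padded key plus digest for the outer hash), while an HMAC that keeps the two padded-key states costs 2 per call plus 2 once. The function names pbkdf2Key, pbkdf2SHA1Hex and blockCounts are made up for this example; the loop reuses a single hmac object through Reset, which is the usage pattern that benefits.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha1"
	"encoding/binary"
	"encoding/hex"
	"fmt"
	"hash"
)

// pbkdf2Key is an illustrative PBKDF2 that reuses one HMAC object via Reset.
func pbkdf2Key(password, salt []byte, iter, keyLen int, newHash func() hash.Hash) []byte {
	prf := hmac.New(newHash, password)
	hashLen := prf.Size()
	numBlocks := (keyLen + hashLen - 1) / hashLen
	dk := make([]byte, 0, numBlocks*hashLen)
	u := make([]byte, hashLen)
	var counter [4]byte
	for block := 1; block <= numBlocks; block++ {
		// U_1 = PRF(password, salt || INT_32_BE(block))
		prf.Reset()
		prf.Write(salt)
		binary.BigEndian.PutUint32(counter[:], uint32(block))
		prf.Write(counter[:])
		dk = prf.Sum(dk)
		t := dk[len(dk)-hashLen:]
		copy(u, t)
		// U_n = PRF(password, U_(n-1)); T = U_1 XOR U_2 XOR ... XOR U_iter
		for n := 2; n <= iter; n++ {
			prf.Reset()
			prf.Write(u)
			u = prf.Sum(u[:0])
			for i := range u {
				t[i] ^= u[i]
			}
		}
	}
	return dk[:keyLen]
}

// pbkdf2SHA1Hex is a small convenience wrapper for PBKDF2-HMAC-SHA1.
func pbkdf2SHA1Hex(password, salt string, iter, keyLen int) string {
	return hex.EncodeToString(pbkdf2Key([]byte(password), []byte(salt), iter, keyLen, sha1.New))
}

// blockCounts returns the assumed number of hash block computations per
// output block: 4 per HMAC call without cached key states, versus 2 per call
// plus 2 one-time computations with cached key states.
func blockCounts(iter int) (naive, cached int) {
	return 4 * iter, 2 + 2*iter
}

func main() {
	fmt.Println(pbkdf2SHA1Hex("password", "salt", 2, 20))
	naive, cached := blockCounts(4096)
	fmt.Printf("iter=4096: naive=%d cached=%d\n", naive, cached)
}
```

For large iteration counts the ratio approaches one half, which is consistent with the "cuts the number of block computations in half" statement in the commit message.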