Use the custom implementation of multipliedFullWidth on arm64_32
Previously we were falling back on the generic implementation for 64b integers, which resulted in the following codegen:

    00000008    asr   x8, x0, #32
    0000000c    asr   x9, x0, #63
    00000010    cmp   x0, #0x0
    00000014    cinv  w10, w0, lt
    00000018    eor   w9, w10, w9
    0000001c    asr   x10, x1, #32
    00000020    asr   x11, x1, #63
    00000024    cmp   x1, #0x0
    00000028    cinv  w12, w1, lt
    0000002c    eor   w11, w12, w11
    00000030    umull x12, w11, w9
    00000034    mul   x11, x11, x8
    00000038    add   x11, x11, x12, lsr #32
    0000003c    asr   x12, x11, #63
    00000040    cmp   x11, #0x0
    00000044    cinv  w13, w11, lt
    00000048    eor   w12, w13, w12
    0000004c    madd  x9, x9, x10, x12
    00000050    mul   x8, x10, x8
    00000054    add   x8, x8, x11, asr #32
    00000058    add   x0, x8, x9, asr #32
    0000005c    ret

Instead, we should use the 64b implementation when targeting arm64_32, which allows us to generate:

    00000008    smulh x0, x1, x0
    0000000c    ret

Unsurprisingly, this is considerably faster.
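For illustration, a minimal sketch of the kind of caller that benefits. The helper name highHalf is hypothetical and is not part of this change; it only assumes the standard library's Int64.multipliedFullWidth(by:), which returns the 128-bit product as a signed high half and an unsigned low half. Whether a given build lowers it to a single smulh depends on the target and optimization settings.

```swift
// Hypothetical helper (not from this commit): returns the upper 64 bits
// of the signed 128-bit product a * b.
func highHalf(_ a: Int64, _ b: Int64) -> Int64 {
    return a.multipliedFullWidth(by: b).high
}

// 2^62 * -4 = -2^64, which is all ones in the high half and zero in the low half.
let a: Int64 = 0x4000_0000_0000_0000
let b: Int64 = -4
let product = a.multipliedFullWidth(by: b)
print(product.high, product.low)

// The low half always matches the wrapping 64-bit product.
print(product.low == UInt64(bitPattern: a &* b))
```

Only the high half needs the widening multiply; the low half is an ordinary wrapping multiplication.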