Fix #1359 - Optimize splatting make using linear_ramp #1556

Merged: 1 commit, Feb 10, 2023

jfalcou (Owner) commented Feb 10, 2023

No description provided.

DenisYaroshevskiy (Collaborator) left a comment

Nice!

jfalcou (Owner, Author) commented Feb 10, 2023

BEFORE:

wide(int n): 
        stp     x29, x30, [sp, -64]!
        adrp    x2, :got:__stack_chk_guard
        adrp    x1, :got:__stack_chk_guard
        mov     z0.s, w0
        mov     x29, sp
        ldr     x2, [x2, #:got_lo12:__stack_chk_guard]
        ptrue   p0.b, vl32
        sub     sp, sp, #576
        str     p4, [sp]
        ldr     x1, [x1, #:got_lo12:__stack_chk_guard]
        mov     z1.b, #0
        ldr     x0, [x2]
        str     x0, [sp, 632]
        mov     x0, 0
        add     x0, sp, 592
        st1w    z0.d, p0, [x0]
        add     x0, sp, 608
        st1w    z1.d, p0, [x0]
        add     x0, sp, 592
        ld1w    z0.s, p0/z, [x0]
        ldr     x0, [sp, 632]
        ldr     x2, [x1]
        subs    x0, x0, x2
        mov     x2, 0
        bne     .L5
        ldr     p4, [sp]
        add     sp, sp, 576
        ldp     x29, x30, [sp], 64
        ret

AFTER:

wide(int n):
        index   z1.s, #0, #1
        mov     z0.s, w0
        ptrue   p0.b, vl32
        cmplt   p0.s, p0/z, z1.s, #4
        mov     z1.s, p0/z, #-1
        and     z0.d, z0.d, z1.d
        ret
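
For readers less familiar with SVE, here is a lane-by-lane scalar sketch of what the AFTER listing appears to compute: a ramp 0, 1, 2, ... is compared against 4, the comparison becomes an all-ones or all-zeros mask, and the mask is ANDed with the splatted `n`. This is only a reading of the assembly, not the library's implementation. The names `splat_low_lanes`, `lanes`, and `active` are hypothetical, and the lane count of 8 is inferred from `ptrue p0.b, vl32` (32 bytes of 32-bit lanes).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical illustration only: emulates the AFTER sequence one lane at a
// time, assuming a 256-bit register viewed as eight 32-bit lanes.
constexpr std::size_t lanes = 8;

std::array<std::int32_t, lanes> splat_low_lanes(std::int32_t n,
                                                std::int32_t active = 4)
{
    std::array<std::int32_t, lanes> out{};
    for (std::size_t i = 0; i < lanes; ++i)
    {
        // index z1.s, #0, #1 : lane i holds the value i
        std::int32_t ramp = static_cast<std::int32_t>(i);
        // cmplt ..., #4 then mov z1.s, p0/z, #-1 : all ones where ramp < active
        std::int32_t mask = (ramp < active) ? -1 : 0;
        // mov z0.s, w0 then and z0.d, z0.d, z1.d : keep n only in masked lanes
        out[i] = n & mask;
    }
    return out;
}
```

Under that reading, `splat_low_lanes(42)` holds 42 in lanes 0 to 3 and 0 in lanes 4 to 7, with no trip through the stack as in the BEFORE listing.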

jfalcou (Owner, Author) commented Feb 10, 2023

Gonna push the one for make logical too, I forgot it.

jfalcou merged commit fe62c96 into main on Feb 10, 2023
jfalcou deleted the fix-1359/iota-make branch on February 10, 2023 16:01