Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

performance gain when kernel operation is wrapped with a function #37558

Closed
johnnychen94 opened this issue Sep 13, 2020 · 5 comments
Closed

performance gain when kernel operation is wrapped with a function #37558

johnnychen94 opened this issue Sep 13, 2020 · 5 comments
Assignees
Labels
performance Must go faster regression Regression in behavior compared to a previous version

Comments

@johnnychen94
Copy link
Member

This originates from JuliaArrays/OffsetArrays.jl#136 (comment)

function overflow_check(r, offset::T) where T
    if offset > 0 && last(r) > typemax(T) - offset
        throw(ArgumentError("Boundary overflow detected: offset $offset should be equal or less than $(typemax(T) - last(r))"))
    end
end

function overflow_check_with_func(r, offset::T) where T
    throw_overflow_error() = throw(ArgumentError("Boundary overflow detected: offset $offset should be equal or less than $(typemax(T) - last(r))"))
    if offset > 0 && last(r) > typemax(T) - offset
        throw_overflow_error()
    end
end

These two works equivalently, but their performance is slightly different:

julia> r, offset = -3:3, -4;

julia> @btime overflow_check($r, $offset)
  4.801 ns (0 allocations: 0 bytes) # 1.6.0-DEV.883
  4.078 ns (0 allocations: 0 bytes) # 1.4.0, 1.5.1

julia> @btime overflow_check_with_func($r, $offset)
  3.765 ns (0 allocations: 0 bytes) # 1.6.0-DEV.883
  1.696 ns (0 allocations: 0 bytes) # 1.5.1
  1.919 ns (0 allocations: 0 bytes) # 1.4.0

I'm really unsure what an appropriate title should be, please rename it if that's unclear.

cc: @jishnub @stevengj

@timholy timholy added the regression Regression in behavior compared to a previous version label Sep 13, 2020
@timholy
Copy link
Member

timholy commented Sep 13, 2020

Related: #33273, perhaps #22423. 1.6 allocates a gcframe even if you put a @noinline in front of throw_overflow_error.

@JeffBezanson
Copy link
Member

Probably due to #35982.

@JeffBezanson
Copy link
Member

I think we might be able to fix this by removing just the part that treats unreachable nodes the same as throw calls, so that we can still optimize the call to throw_overflow_error.

@JeffBezanson JeffBezanson self-assigned this Sep 14, 2020
@JeffBezanson JeffBezanson added the performance Must go faster label Sep 14, 2020
@JeffBezanson
Copy link
Member

To be clear, that would fix the regression since 1.5, but it's not easy to fix the need to manually out-line the error function.

@yuyichao
Copy link
Contributor

To be clear, that would fix the regression since 1.5, but it's not easy to fix the need to manually out-line the error function.

If it's only about the gc frame that's in the pipeline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression Regression in behavior compared to a previous version
Projects
None yet
Development

No branches or pull requests

4 participants