Large performance problem due to @fastmath on many operations #22275

Closed
ChrisRackauckas opened this issue Jun 7, 2017 · 6 comments
Labels
domain:maths (Mathematical functions), performance (Must go faster)

Comments

@ChrisRackauckas
Member

function f1(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  aidx = eachindex(a)
  for i in aidx
    @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i]+o*p[i])
  end
end
function f2(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  aidx = eachindex(a)
  @fastmath for i in aidx
    @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i]+o*p[i])
  end
end
a = rand(10)
b = rand(10)
c = 0.1
d = 0.1
e = rand(10)
f = 0.1
g = rand(10)
h = 0.1
j = rand(10)
k = 0.1
l = rand(10)
m = 0.1
n = rand(10)
o = 0.1
p = rand(10)

using BenchmarkTools
@benchmark f1($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
@benchmark f2($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
@benchmark f1($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
BenchmarkTools.Trial: 
  memory estimate:  112 bytes
  allocs estimate:  7
  --------------
  minimum time:     83.086 ns (0.00% GC)
  median time:      164.388 ns (0.00% GC)
  mean time:        152.032 ns (5.06% GC)
  maximum time:     2.518 μs (92.10% GC)
  --------------
  samples:          10000
  evals/sample:     983

@benchmark f2($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
BenchmarkTools.Trial: 
  memory estimate:  5.27 KiB
  allocs estimate:  297
  --------------
  minimum time:     11.124 μs (0.00% GC)
  median time:      22.248 μs (0.00% GC)
  mean time:        20.982 μs (2.20% GC)
  maximum time:     2.470 ms (96.68% GC)
  --------------
  samples:          10000
  evals/sample:     1

Found in the same code as #22255
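
For context, `@fastmath` works by rewriting arithmetic calls into their `Base.FastMath` counterparts (for example, `+` becomes `add_fast`), so a long sum turns into a many-argument `add_fast` call. The following is a minimal sketch for inspecting that rewrite. The variable names `x`, `y`, `z`, `w` are placeholders, and this only shows what the macro produces; it does not establish the cause of the slowdown reported above.

```julia
# Expand the macro without evaluating anything; x, y, z, w need not be defined.
ex = @macroexpand @fastmath x + y * z + w

# Print the rewritten expression to see which Base.FastMath functions are called.
println(ex)
```

Running the same expansion on the loop body of `f2` should show how many arguments end up in a single call.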

@yuyichao
Contributor

yuyichao commented Jun 7, 2017

Is this actually a regression?

@ChrisRackauckas ChrisRackauckas changed the title from "Large performance regression due to @fastmath on many operations" to "Large performance problem due to @fastmath on many operations" Jun 7, 2017
@ChrisRackauckas
Member Author

ChrisRackauckas commented Jun 7, 2017

Oops, my bad. I noticed this before on v0.5 but never reported it. Changed the title. These tests were run on v0.6-rc2.

@ChrisRackauckas
Member Author

Just tested the latest nightly binaries and it looks like it's still the case:

julia> @time f1(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  0.014693 seconds (4.06 k allocations: 225.578 KiB)

julia> @time f2(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  0.024622 seconds (10.12 k allocations: 568.695 KiB)

julia> @time f1(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  0.000004 seconds (4 allocations: 160 bytes)

julia> @time f2(a,b,c,d,e,f,g,h,j,k,l,m,n,o,p)
  0.000043 seconds (294 allocations: 5.313 KiB)
julia> versioninfo()
Julia Version 0.7.0-DEV.491
Commit 45c15682ff* (2017-06-07 05:19 UTC)
Platform Info:
  OS: Windows (x86_64-w64-mingw32)
  CPU: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, haswell)
Environment:

@ararslan ararslan added the domain:maths (Mathematical functions) and performance (Must go faster) labels Jun 7, 2017
@ChrisRackauckas
Member Author

This depends on the number of operations. With one fewer term in the sum, the two versions perform the same:

function f1(a,b,c,d,e,f,g,h,j,k,l,m,n)
  aidx = eachindex(a)
  for i in aidx
    @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i])
  end
end
function f2(a,b,c,d,e,f,g,h,j,k,l,m,n)
  aidx = eachindex(a)
  @fastmath for i in aidx
    @inbounds a[i] = b[i]+c*(d*e[i]+f*g[i]+h*j[i]+k*l[i]+m*n[i])
  end
end
a = rand(10)
b = rand(10)
c = 0.1
d = 0.1
e = rand(10)
f = 0.1
g = rand(10)
h = 0.1
j = rand(10)
k = 0.1
l = rand(10)
m = 0.1
n = rand(10)
julia> @benchmark f1($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     35.421 ns (0.00% GC)
  median time:      36.593 ns (0.00% GC)
  mean time:        39.635 ns (0.00% GC)
  maximum time:     299.767 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

julia> @benchmark f2($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n)
BenchmarkTools.Trial:
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     34.836 ns (0.00% GC)
  median time:      36.300 ns (0.00% GC)
  mean time:        39.278 ns (0.00% GC)
  maximum time:     129.684 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     1000

But the issue in the original example happens even though the parentheses enclose fewer than 16 values?
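
One way to probe whether the number of terms matters on a given Julia version is to compare allocations for the same scalar expression with and without `@fastmath`. This is a minimal sketch, assuming `Float64` inputs. `plain_sum`, `fast_sum`, and `probe` are hypothetical names that do not appear in the thread, and the seven-term length is an arbitrary choice meant to be varied.

```julia
# Hypothetical helpers (not from the thread): the same seven-term sum,
# once with ordinary arithmetic and once under @fastmath.
plain_sum(a, b, c, d, e, f, g) = a + b + c + d + e + f + g
fast_sum(a, b, c, d, e, f, g) = @fastmath a + b + c + d + e + f + g

function probe()
    x = (0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7)
    # Call each function once first so compilation is not measured.
    plain_sum(x...)
    fast_sum(x...)
    # Bytes allocated by one call of each variant.
    return (@allocated(plain_sum(x...)), @allocated(fast_sum(x...)))
end

plain_bytes, fast_bytes = probe()
println("plain: ", plain_bytes, " bytes, fastmath: ", fast_bytes, " bytes")
```

If the `@fastmath` variant starts allocating once terms are added while the plain variant does not, that points at the macro's handling of long argument lists rather than at the array indexing in the loop.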

@fredrikekre
Member

The examples in this issue seem to perform much better on master. For the example in the OP:

julia> @benchmark f1($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
BenchmarkTools.Trial: 
  memory estimate:  112 bytes
  allocs estimate:  7
  --------------
  minimum time:     69.316 ns (0.00% GC)
  median time:      72.273 ns (0.00% GC)
  mean time:        83.480 ns (7.75% GC)
  maximum time:     39.859 μs (99.71% GC)
  --------------
  samples:          10000
  evals/sample:     964

julia> @benchmark f2($a,$b,$c,$d,$e,$f,$g,$h,$j,$k,$l,$m,$n,$o,$p)
BenchmarkTools.Trial: 
  memory estimate:  112 bytes
  allocs estimate:  7
  --------------
  minimum time:     62.282 ns (0.00% GC)
  median time:      68.275 ns (0.00% GC)
  mean time:        79.657 ns (8.59% GC)
  maximum time:     41.918 μs (99.73% GC)
  --------------
  samples:          10000
  evals/sample:     968

@KristofferC KristofferC added the kind:potential benchmark (Could make a good benchmark in BaseBenchmarks) label Aug 15, 2018
@KristofferC
Sponsor Member

Looks fixed indeed.
