50% performance regression in map! #35914
Hi, I did some benchmarks and the cause is this line. If you remove it and rebuild Julia, the test runs in 1.490 ms as compared to 2.927 ms with it. Adding the I'm currently investigating why |
Hah, wow, that's quite the literal speed bump. There are three issues here:
|
A fourth issue is that it assumes fast linear indexing, i.e. it is assuming |
I think a generic and composable approach to implement IndexStyle-generic |
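To make the linear-indexing point concrete, here is a minimal sketch of an `IndexStyle`-aware loop. `naive_map!` is a hypothetical helper written for illustration only; it is not Base's `map!` and not necessarily the approach proposed in the comment above. It relies on `eachindex` with several array arguments, which yields linear indices when every argument supports fast linear indexing and Cartesian indices otherwise.

```julia
# Hypothetical helper for illustration; this is NOT how Base implements map!.
function naive_map!(f, dest::AbstractArray, As::AbstractArray...)
    # eachindex over all arrays picks an iterator that is fast for every
    # argument (linear indices if all are IndexLinear, CartesianIndices
    # otherwise) and throws a DimensionMismatch if the indices differ.
    for I in eachindex(dest, As...)
        @inbounds dest[I] = f(map(A -> A[I], As)...)
    end
    return dest
end

A = rand(4, 4); B = rand(4, 4); D = similar(A)

IndexStyle(A)                    # plain Array: IndexLinear()
IndexStyle(view(A, 1:3, 1:3))    # non-contiguous view: IndexCartesian()

naive_map!(+, D, A, B)                                   # linear path
naive_map!(+, view(D, 1:3, 1:3), view(A, 1:3, 1:3),
           view(B, 1:3, 1:3))                            # Cartesian path
```

Unlike Base's `map!`, this sketch requires all arrays to have matching indices, which is what makes the `@inbounds` annotation safe here.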
From Matt's comment, we see that these problems and the slowdown were introduced by f9645ff, in the belief that the changes would make this function faster. |
I repeated the benchmark on Julia 1.6.0-rc3:
using BenchmarkTools
A = rand(1000, 1000); B = rand(1000, 1000); C = rand(1000, 1000); D = rand(1000, 1000);
test(A, B, C) = A + B + C
@btime test($A, $B, $C);
3.575 ms (2 allocations: 7.63 MiB)
test8!(D, A, B, C) = map!((a, b, c) -> a + b + c, D, A, B, C)
@btime test8!($D, $A, $B, $C)
6.457 ms (0 allocations: 0 bytes) |
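To help separate `map!` overhead from the cost of the arithmetic itself, one option is to time two allocation-free baselines alongside it: a hand-written loop and an in-place broadcast. The sketch below only defines the baselines and checks that they agree with `map!`; the names `loop3!` and `bcast3!` are made up for this example, and no timings are claimed.

```julia
# Hypothetical baselines; loop3! and bcast3! are names invented for this sketch.
function loop3!(D, A, B, C)
    # eachindex checks that all four arrays share the same indices,
    # which makes the @inbounds below safe.
    for i in eachindex(D, A, B, C)
        @inbounds D[i] = A[i] + B[i] + C[i]
    end
    return D
end

# Fused in-place broadcast: writes into D without allocating a temporary.
bcast3!(D, A, B, C) = (D .= A .+ B .+ C)

A = rand(100, 100); B = rand(100, 100); C = rand(100, 100)
D1 = similar(A); D2 = similar(A); D3 = similar(A)

map!((a, b, c) -> a + b + c, D1, A, B, C)
loop3!(D2, A, B, C)
bcast3!(D3, A, B, C)
```

Each function can then be timed with `@btime` using `$`-interpolated arguments, in the same way as `test8!` above.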
Still slower on Julia 1.8 compared to 1.0 as originally reported. |
The problem resolved itself in v1.10 |
julia> @btime test($A, $B, $C);
1.346 ms (2 allocations: 7.63 MiB)
julia> @btime test8!($D, $A, $B, $C);
2.354 ms (0 allocations: 0 bytes)
julia> versioninfo()
Julia Version 1.10.0
Commit 3120989f39 (2023-12-25 18:01 UTC)
Build Info:
Note: This is an unofficial build, please report bugs to the project
responsible for this build and not to the Julia project unless you can
reproduce the issue using official builds available at https://julialang.org/downloads
Platform Info:
OS: Linux (x86_64-redhat-linux)
CPU: 36 × Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-15.0.7 (ORCJIT, skylake-avx512)
Threads: 1 on 36 virtual cores
The assembly for |
From Discourse, I see a big performance regression compared to Julia 1.0 in a simple map! call starting in Julia 1.2: Julia 1.0.4 gives 1.817 ms and Julia 1.1.0 gives 1.961 ms, but Julia 1.2.0 gives 3.004 ms, Julia 1.3.0 gives 3.091 ms, and Julia 1.4.0 gives 3.006 ms.