World age error from LoopVectorization v0.10.0 with Tullio and MPI on Julia v1.5.3 #192
Running the same steps with Julia v1.6.0-beta1 does not throw any error, so this will be fine for us in the future. Since this is a special use case and we will switch to Julia v1.6 once it's officially released, we can postpone upgrading LoopVectorization a bit. Feel free to close this issue.
I get a similar error on 1.7.
Do you need
julia> using MPI: mpiexec
julia> mpiexec() do cmd
run(`$cmd -n 2 $(Base.julia_cmd()) --threads=1 /home/chriselrod/Documents/progwork/julia/loopvectests/mpi.jl`)
end
all_successful = true
Process(`/home/chriselrod/.julia/artifacts/3acc381f6eb6cae155dc415de8036910624a278c/bin/mpiexec -n 2 /home/chriselrod/Documents/languages/julia-polly/usr/bin/julia -Cnative -J/home/chriselrod/Documents/languages/julia-polly/usr/lib/julia/sys.so -O3 -g1 --threads=1 /home/chriselrod/Documents/progwork/julia/loopvectests/mpi.jl`, ProcessExited(0))
julia> versioninfo()
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
OS: Linux (x86_64-generic-linux)
CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
Environment:
JULIA_NUM_THREADS = auto
While with it, a simpler reproducible example is just
using VectorizationBase
include(joinpath(pkgdir(VectorizationBase), "test", "runtests.jl"))
Full error:
# > julia -O3 -q --compiled-modules=no
julia> using VectorizationBase
julia> include(joinpath(pkgdir(VectorizationBase), "test", "runtests.jl"))
Julia Version 1.5.3
Commit 788b2c77c1 (2020-11-09 13:37 UTC)
Platform Info:
OS: Linux (x86_64-generic-linux)
uname: Linux 5.10.9-1016.native #1 SMP Tue Jan 19 15:04:46 PST 2021 x86_64 unknown
CPU: Intel(R) Core(TM) i9-7900X CPU @ 3.30GHz:
speed user nice sys idle irq
#1-20 3999 MHz 2152869 s 2420 s 452123 s 321536482 s 75879 s
Memory: 31.043872833251953 GB (16618.921875 MB free)
Uptime: 162194.0 sec
Load Avg: 1.185546875 1.19091796875 1.26513671875
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-9.0.1 (ORCJIT, skylake-avx512)
Environment:
JULIA_NUM_THREADS = auto
CFLAGS = -O3 -march=native -mprefer-vector-width=512 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags
CLASSPATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/mpi.jar:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/lib/daal.jar
CPATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/ipp/include:/opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/include:/opt/intel/compilers_and_libraries_2019.4.243/linux/pstl/include:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/include:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/include:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/include
CXXFLAGS = -O3 -march=native -mprefer-vector-width=512 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags -fvisibility-inlines-hidden -Wl,--enable-new-dtags
FCFLAGS = -Ofast -march=native -mprefer-vector-width=512 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags
FFLAGS = -Ofast -march=native -mprefer-vector-width=512 -feliminate-unused-debug-types -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=32 -Wformat -Wformat-security -m64 -fasynchronous-unwind-tables -Wp,-D_REENTRANT -ftree-loop-distribute-patterns -Wl,-z -Wl,now -Wl,-z -Wl,relro -fno-semantic-interposition -ffat-lto-objects -fno-signed-zeros -fno-trapping-math -fassociative-math -Wl,-sort-common -Wl,--enable-new-dtags
FI_PROVIDER_PATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib/prov
HOME = /home/chriselrod
LA_PATH = /usr/lib64/
LD_LIBRARY_PATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib/release:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/lib:/opt/intel/compilers_and_libraries_2019.4.243/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/lib/intel64/gcc4.1:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/lib/intel64/gcc4.1:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/../tbb/lib/intel64_lin/gcc4.4
LIBRARY_PATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/lib:/opt/intel/compilers_and_libraries_2019.4.243/linux/ipp/lib/intel64:/opt/intel/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/lib/intel64/gcc4.1:/opt/intel/compilers_and_libraries_2019.4.243/linux/tbb/lib/intel64/gcc4.1:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/lib/intel64_lin:/opt/intel/compilers_and_libraries_2019.4.243/linux/daal/../tbb/lib/intel64_lin/gcc4.4
MANPATH = /opt/intel/man/common:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/man:/usr/local/share/man:/usr/share/man:/usr/man
MPI_PATH = /usr/lib64/
NLSPATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/compiler/lib/intel64/locale/%l_%t/%N:/opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/lib/intel64_lin/locale/%l_%t/%N
PATH = /home/chriselrod/miniconda3/bin:/home/chriselrod/miniconda3/condabin:/opt/intel/compilers_and_libraries_2019.4.243/linux/bin/intel64:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/libfabric/bin:/opt/intel/compilers_and_libraries_2019.4.243/linux/mpi/intel64/bin:/usr/bin/haswell/avx512_1:/usr/bin/haswell:/usr/local/bin:/usr/local/sbin:/usr/bin:/opt/3rd-party/bin
PKG_CONFIG_PATH = /opt/intel/compilers_and_libraries_2019.4.243/linux/mkl/bin/pkgconfig
TERM = screen
THEANO_FLAGS = floatX=float32,openmp=true,gcc.cxxflags="-ftree-vectorize -mavx"
WINDOWPATH = 2
FONTCONFIG_PATH = /usr/share/defaults/fonts
CMDSTAN_HOME = /home/chriselrod/Documents/languages/cmdstan
R_HOME = /usr/lib64/R
ERROR: LoadError: LoadError: MethodError: no method matching register_size()
The applicable method may be too new: running in world age 31504, while current world is 34093.
Closest candidates are:
register_size() at /home/chriselrod/.julia/dev/VectorizationBase/src/cpu_info.jl:68 (method too new to be called from this world context.)
register_size(::Type{T}) where T<:Union{Signed, Unsigned} at /home/chriselrod/.julia/dev/VectorizationBase/src/vector_width.jl:3 (method too new to be called from this world context.)
register_size(::Type{T}) where T at /home/chriselrod/.julia/dev/VectorizationBase/src/vector_width.jl:2 (method too new to be called from this world context.)
Stacktrace:
[1] dynamic_integer_register_size() at /home/chriselrod/.julia/dev/VectorizationBase/src/cpu_info.jl:38
[2] #s1160#30 at /home/chriselrod/.julia/dev/VectorizationBase/src/cpu_info.jl:65 [inlined]
[3] #s1160#30(::Any) at ./none:0
[4] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any,N} where N) at ./boot.jl:527
[5] simd_integer_register_size() at /home/chriselrod/.julia/dev/VectorizationBase/src/cpu_info.jl:70
[6] __pick_vector_width(::Int64, ::Int64, ::Any) at /home/chriselrod/.julia/dev/VectorizationBase/src/vector_width.jl:39
[7] _pick_vector_width(::Type{T} where T) at /home/chriselrod/.julia/dev/VectorizationBase/src/vector_width.jl:53
[8] #s1160#33 at /home/chriselrod/.julia/dev/VectorizationBase/src/vector_width.jl:73 [inlined]
[9] #s1160#33(::Any, ::Any) at ./none:0
[10] (::Core.GeneratedFunctionStub)(::Any, ::Vararg{Any,N} where N) at ./boot.jl:527
[11] top-level scope at /home/chriselrod/.julia/dev/VectorizationBase/test/testsetup.jl:6
[12] include(::String) at ./client.jl:457
[13] top-level scope at /home/chriselrod/.julia/dev/VectorizationBase/test/runtests.jl:4
[14] include(::String) at ./client.jl:457
[15] top-level scope at REPL[2]:1
in expression starting at /home/chriselrod/.julia/dev/VectorizationBase/test/testsetup.jl:6
in expression starting at /home/chriselrod/.julia/dev/VectorizationBase/test/runtests.jl:4
Normally, the world age gets frozen when a module is defined, so that everything inside is at the same world age, and you don't get world age issues as a result. Seems that is not the case with
As an aside, the reason for this change (
As in, earlier, if you compiled a
Now -- and this admittedly needs more testing -- you should be able to produce the
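To make the world-age rule above concrete, here is a minimal sketch that does not depend on VectorizationBase. All function names are hypothetical and chosen only for illustration, and this is a simpler trigger than the generated-function path in the trace. A method added with `@eval` while a function is running is newer than the world that function executes in, so a direct call fails with the same kind of `MethodError`, whereas `Base.invokelatest` dispatches in the latest world.

```julia
# Hypothetical names throughout; this only illustrates the world-age mechanism.
# Declare the generic functions up front so that only their methods are "too new".
function newly_defined_a end
function newly_defined_b end

function define_and_call_directly()
    @eval newly_defined_a() = 42   # method is added in a newer world
    return newly_defined_a()       # still runs in the caller's older world
end

function define_and_call_latest()
    @eval newly_defined_b() = 42
    # invokelatest looks the method up in the latest world instead
    return Base.invokelatest(newly_defined_b)
end

# The direct call is expected to throw a MethodError ("method too new").
direct_failed = try
    define_and_call_directly()
    false
catch err
    err isa MethodError
end

latest_result = define_and_call_latest()

println("direct call failed with MethodError: ", direct_failed)
println("invokelatest result: ", latest_result)
```

Once control returns to the top level, the world age advances, so calling `newly_defined_a()` from the REPL afterwards works without `invokelatest`.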
That's our workaround since Julia does not support parallel precompilation in v1.5, which is necessary for our MPI runs. We want to disable the flag in Julia v1.6, where it's not needed anymore. Thanks for the detailed explanation and your great work, @chriselrod!
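A rough sketch of how such a version-dependent workaround could look when building the worker command. The helper name `worker_flags` is hypothetical, and the flag is assumed to be `--compiled-modules=no`, the one that appears in the commands in this thread; the actual Trixi.jl setup may differ.

```julia
# Hypothetical helper: extra Julia flags for the MPI worker processes.
# Assumption: precompilation only needs to be bypassed before Julia v1.6.
# `v"1.6.0-"` is used so that v1.6 pre-releases count as v1.6.
worker_flags(version::VersionNumber) =
    version < v"1.6.0-" ? ["--compiled-modules=no"] : String[]

# Interpolating a vector into a command splices each element as one argument;
# an empty vector adds nothing.
flags = worker_flags(VERSION)
cmd = `julia $flags --threads=1 example.jl`
println(cmd)
```

The resulting `cmd` could then be passed on to `mpiexec` in the same way as the commands shown in this thread.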
Can you confirm this has been fixed for you?
julia> using MPI: mpiexec
julia> mpiexec() do cmd
run(`$cmd -n 2 $(Base.julia_cmd()) --compiled-modules=no --threads=1 /home/chriselrod/Documents/progwork/julia/loopvectests/mpi.jl`)
end
all_successful = true
Process(`/home/chriselrod/.julia/artifacts/3acc381f6eb6cae155dc415de8036910624a278c/bin/mpiexec -n 2 /home/chriselrod/Documents/languages/julia/usr/bin/julia -Cnative,-prefer-256-bit -J/home/chriselrod/Documents/languages/julia/usr/lib/julia/sys.so -O3 -g1 --compiled-modules=no --threads=1 /home/chriselrod/Documents/progwork/julia/loopvectests/mpi.jl`, ProcessExited(0))
(@v1.7) pkg> st VectorizationBase LoopVectorization
Status `~/.julia/environments/v1.7/Project.toml`
[bdcacae8] LoopVectorization v0.11.2 `~/.julia/dev/LoopVectorization`
[3d5dd08c] VectorizationBase v0.18.1 `~/.julia/dev/VectorizationBase`
shell> cat /home/chriselrod/Documents/progwork/julia/loopvectests/mpi.jl
using Test
using MPI
using LoopVectorization, Tullio
@test !MPI.Initialized()
MPI.Init()
@test MPI.Initialized()
function foo!(C, A, B)
@tullio C[i,j] = A[i,k] * B[k,j]
return nothing
end
A = rand(10^2, 10^2);
B = rand(10^2, 10^2);
C = similar(A);
foo!(C, A, B)
successful = C ≈ A * B
all_successful = MPI.Allreduce(Int(successful), +, MPI.COMM_WORLD) == MPI.Comm_size(MPI.COMM_WORLD)
if MPI.Comm_rank(MPI.COMM_WORLD) == 0
@show all_successful
end
@test !MPI.Finalized()
MPI.Finalize()
@test MPI.Finalized()
I'm AFK right now but will test it tomorrow 👍
The error with
using LoopVectorization v0.11.2 and VectorizationBase v0.18.1, see https://github.com/trixi-framework/Trixi.jl/pull/428/checks?check_run_id=1809136036#step:6:3923
Trixi's tests passed for me locally after the above ArrayInterface PR, but mind upgrading to ArrayInterface 3.0.1 and confirming?
Test Summary: | Test Summary: | Pass Pass TotalTotal
Test Summary: | Pass Total
Parallel 2D Parallel 2D | | 15 15 15 15
Parallel 2D | 134 134
254.036318 seconds (1.39 M allocations: 69.111 MiB, 0.00% gc time)
0.000005 seconds (4 allocations: 160 bytes)
0.000002 seconds (4 allocations: 160 bytes)
0.000002 seconds (4 allocations: 160 bytes)
Test Summary: | Pass Total
Trixi.jl tests | 1 1
254.551402 seconds (2.94 M allocations: 148.049 MiB, 0.01% gc time)
Testing Trixi tests passed
Great, it works for me, too! Thanks a lot for your great work and support, @chriselrod!
We get a world age error from LoopVectorization v0.10.0 in combination with Tullio and MPI in Trixi.jl on Ubuntu and Windows, see trixi-framework/Trixi.jl#423. The relevant error message is https://github.com/trixi-framework/Trixi.jl/pull/423/checks?check_run_id=1765905078#step:6:333
A minimal working example can be created as follows. Save
as example.jl. Set up a Julia project with
and run