

LoopVectorization fails on some operations? #121

Open
alanderos91 opened this issue Oct 28, 2022 · 3 comments

@alanderos91
Contributor

alanderos91 commented Oct 28, 2022

Reproducing the issue:

using SnpArrays
mouse = SnpArray(SnpArrays.datadir("mouse.bed"))
A = SnpLinAlg{Float64}(mouse)
A*A'

This results in the following message:

┌ Warning: #= /home/alanderos/.julia/packages/SnpArrays/WBzpL/src/linalg_direct.jl:760 =#:
│ `LoopVectorization.check_args` on your inputs failed; running fallback `@inbounds @fastmath` loop instead.
│ Use `warn_check_args=false`, e.g. `@turbo warn_check_args=false ...`, to disable this warning.
└ @ SnpArrays ~/.julia/packages/LoopVectorization/FMfT8/src/condense_loopset.jl:1049

This happens when I start Julia with 10 threads on a 10 core machine. Here's some additional information:

Julia Version 1.7.3
Commit 742b9abb4d (2022-05-06 12:58 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i9-10900KF CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, skylake)

Note: The issue came up while trying to compute A*B, where A is a SnpLinAlg and B is a SparseMatrixCSC, so other operations seem to run into it as well.

@biona001
Member

Sorry, probably my fault. At least the answer seems correct.

For a matrix X, I only tested A*X where A is a SnpLinAlg and X is a dense matrix. I'm not sure why it fails when X is another SnpLinAlg or a SparseMatrixCSC; I would need some time to figure this out.

@kose-y
Member

kose-y commented Oct 28, 2022

I don't think it's anybody's fault, Ben. It's just that we have not designed a kernel for this specific operation. LoopVectorization is not really a good solution for SparseMatrixCSC. For SnpLinAlg x SnpLinAlg, the quick and dirty workaround is indeed transforming one of them into a numeric matrix. If it is large enough, I think it might still be faster than numeric matrix x numeric matrix due to having slightly better cache efficiency.
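For reference, the densify-one-operand workaround might look like the following. This is only a sketch: it assumes SnpArrays' documented convert to a numeric Matrix{Float64}, and the memory cost is a full Float64 copy of the genotype matrix.

using SnpArrays
mouse = SnpArray(SnpArrays.datadir("mouse.bed"))
A = SnpLinAlg{Float64}(mouse)
# Densify one operand so the existing SnpLinAlg-times-dense kernel applies.
Adense = convert(Matrix{Float64}, mouse)
# permutedims materializes a plain dense Matrix, standing in for A * A'.
A * permutedims(Adense)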

@alanderos91
Contributor Author

Thank you both for the quick reply. I only used SnpLinAlg x SnpLinAlg as an example since it throws the same warning and is incredibly slow, for me at least.

In the short term, I think it would be reasonable to restrict this and this to DenseMatrix{T} instead of AbstractMatrix{T}. Multiplication with a generic AbstractMatrix{T} should either throw an error, since it is technically not supported, or emit a warning, since the current kernel is not suitable for that case.
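Concretely, the tightening might look like this. The signatures are hypothetical; the actual method names and kernel bodies live in linalg_direct.jl.

# Restrict the fast path to dense right-hand sides:
function LinearAlgebra.mul!(out::Matrix{T}, A::SnpLinAlg{T}, B::DenseMatrix{T}) where T
    # ... existing @turbo kernel ...
end

# Catch-all for generic AbstractMatrix arguments, which the kernel does not support:
function LinearAlgebra.mul!(out::AbstractMatrix{T}, A::SnpLinAlg{T}, B::AbstractMatrix{T}) where T
    throw(ArgumentError("SnpLinAlg * AbstractMatrix is not supported; convert B to a dense Matrix first"))
end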

In the long term it would be nice to round out support for matrix-matrix multiplication. I can probably implement something for SnpLinAlg times SparseMatrixCSC for my problem and contribute it as a starting point. I can't promise it will be the best, though.
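One possible shape for such a kernel iterates over the stored entries of the sparse operand, column by column. This is a sketch under the assumption that column views of the left operand are acceptable; a real SnpLinAlg version would likely need a specialized inner loop. rowvals, nonzeros, and nzrange are the standard SparseArrays accessors.

using SparseArrays

function sparse_rhs_mul(A::AbstractMatrix{T}, B::SparseMatrixCSC{T}) where T
    out = zeros(T, size(A, 1), size(B, 2))
    rows, vals = rowvals(B), nonzeros(B)
    for j in 1:size(B, 2)          # each output column touches only B's stored entries
        for k in nzrange(B, j)
            i, v = rows[k], vals[k]
            @views out[:, j] .+= v .* A[:, i]
        end
    end
    return out
end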
