Use CUSPARSE_VERSION rather than CUDA_VERSION to choose how to use cuSPARSE TPL #2009

Closed
cwpearson opened this issue Oct 23, 2023 · 5 comments

@cwpearson
Contributor

I was looking at something else in Trilinos and saw @etphipp referencing a problem with logic around cuSPARSE algorithm selection, CUDA version, and cuSPARSE version.

trilinos/Trilinos#12238 (comment)

I don't understand from that PR alone what the specific problem in Stokhos is (or what the referenced problem in Kokkos Kernels is). This may be our offending code, which the Stokhos PR copied with some modifications; similar snippets may appear in a couple of other places in Kokkos Kernels:

#if CUSPARSE_VERSION >= 11301
  cusparseSpMVAlg_t alg = CUSPARSE_SPMV_ALG_DEFAULT;
#else
  cusparseSpMVAlg_t alg = CUSPARSE_MV_ALG_DEFAULT;
#endif
  if (controls.isParameter("algorithm")) {
    const std::string algName = controls.getParameter("algorithm");
    if (algName == "default")
#if CUSPARSE_VERSION >= 11301
      alg = CUSPARSE_SPMV_ALG_DEFAULT;
#else
      alg = CUSPARSE_MV_ALG_DEFAULT;
#endif
    else if (algName == "merge")
#if CUSPARSE_VERSION >= 11301
      alg = CUSPARSE_SPMV_CSR_ALG2;
#else
      alg = CUSPARSE_CSRMV_ALG2;
#endif
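
One way to make the snippet above easier to audit is to test the version once and route every later use through named constants. The following is only a sketch of that idea, not the Kokkos Kernels implementation: `kSpmvAlgDefault`, `kSpmvAlgMerge`, and `spmv_alg_from_name` are hypothetical names, the 11301 threshold is simply carried over from the snippet, and the fallback to the default algorithm for unrecognized names is an assumption (the quoted snippet is cut off before that case). A stand-in enum is declared when `cusparse.h` is not available so the sketch compiles without a CUDA toolkit.

```cpp
#include <string>

#if defined(__has_include)
#  if __has_include(<cusparse.h>)
#    include <cusparse.h>
#    define SKETCH_HAVE_CUSPARSE 1
#  endif
#endif

#ifndef SKETCH_HAVE_CUSPARSE
// Stand-in declarations so the sketch compiles without a CUDA toolkit.
// They only mimic the names used in the snippet; this is NOT the real header.
#define CUSPARSE_VERSION 11301
enum cusparseSpMVAlg_t { CUSPARSE_SPMV_ALG_DEFAULT, CUSPARSE_SPMV_CSR_ALG2 };
#endif

// The only version gate: everything below uses these two constants.
#if CUSPARSE_VERSION >= 11301
static const cusparseSpMVAlg_t kSpmvAlgDefault = CUSPARSE_SPMV_ALG_DEFAULT;
static const cusparseSpMVAlg_t kSpmvAlgMerge   = CUSPARSE_SPMV_CSR_ALG2;
#else
static const cusparseSpMVAlg_t kSpmvAlgDefault = CUSPARSE_MV_ALG_DEFAULT;
static const cusparseSpMVAlg_t kSpmvAlgMerge   = CUSPARSE_CSRMV_ALG2;
#endif

// Hypothetical helper: map the "algorithm" control string to an enum value.
// Assumption: anything other than "merge" falls back to the default algorithm.
inline cusparseSpMVAlg_t spmv_alg_from_name(const std::string& name) {
  if (name == "merge") return kSpmvAlgMerge;
  return kSpmvAlgDefault;
}
```

With this layout, changing the threshold (or the macro it tests) touches one `#if` instead of one per branch.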

The cuSPARSE docs on versioning [0] seem contradictory to me: they say "Using different versions of cuSPARSE and the CUDA runtime is not supported", yet they don't promise that any of the cuSPARSE version fields will match the CUDA runtime version (or anything else, for that matter). And is the CUDA runtime version even the same as the CUDA toolkit version?
In any case, we probably want to use CUSPARSE_VERSION everywhere to decide how to use cuSPARSE, rather than CUDA_VERSION.

[0] https://docs.nvidia.com/cuda/cusparse/index.html#compatibility-and-versioning
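
A related detail when switching a check from one macro to the other: as far as I can tell from the headers, the two macros do not share an encoding. `cusparse.h` defines `CUSPARSE_VERSION` as `MAJOR * 1000 + MINOR * 100 + PATCH`, while `cuda.h` encodes `CUDA_VERSION` as `MAJOR * 1000 + MINOR * 10`. The sketch below decodes both; `LibVersion` and the two `decode_*` functions are hypothetical helpers written for illustration and are not part of either header.

```cpp
// Hypothetical helper type: the decoded fields of a version number.
struct LibVersion {
  int major_part;
  int minor_part;
  int patch_part;
};

// Encoding taken from cusparse.h: MAJOR * 1000 + MINOR * 100 + PATCH.
constexpr LibVersion decode_cusparse_version(int v) {
  return LibVersion{v / 1000, (v % 1000) / 100, v % 100};
}

// Encoding taken from cuda.h: MAJOR * 1000 + MINOR * 10 (no patch field).
constexpr LibVersion decode_cuda_version(int v) {
  return LibVersion{v / 1000, (v % 1000) / 10, 0};
}

// The threshold from the snippet, 11301, reads as cuSPARSE 11.3.1.
static_assert(decode_cusparse_version(11301).minor_part == 3, "cuSPARSE minor");
static_assert(decode_cusparse_version(11301).patch_part == 1, "cuSPARSE patch");

// CUDA 11.3 is written 11030, so the same digits mean different things.
static_assert(decode_cuda_version(11030).minor_part == 3, "CUDA minor");
```

So a threshold cannot be moved between `CUDA_VERSION` and `CUSPARSE_VERSION` checks by renaming the macro alone; the number has to be re-derived for the macro being tested.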

@cwpearson
Contributor Author

cwpearson commented Oct 23, 2023

Here's a fuller description: #1967

@lucbv
Contributor

lucbv commented Oct 23, 2023

In general, we should first try to key off the version of the library we are actually using rather than the version of the associated runtime. That is the more robust approach, and it would prevent issues if NVIDIA one day releases cuBLAS/cuSPARSE at a different cadence from CUDA.
On the other hand, not every vendor sets its library versions correctly, so it is hard to implement a consistent strategy.
In the case of cuBLAS/cuSPARSE, we should definitely prioritize CUBLAS_VERSION / CUSPARSE_VERSION.

@etphipp
Contributor

etphipp commented Oct 23, 2023

I don't understand from that PR alone what the specific problem in Stokhos is (or what the referenced problem in Kokkos Kernels is).

The issue is that the versioning logic is not correct: the cuSPARSE version doesn't necessarily match the CUDA version, and this showed up with the CUDA versions used on some SNL machines. And yes, the logic in Stokhos was copied from Kokkos Kernels. I don't know how that logic was originally chosen, but there seems to have been a transition period around CUDA 11 where it isn't right (it may also be that the cuSPARSE documentation was not correct on this point).

@cwpearson
Contributor Author

I'm going to close this in favor of #1967, which I should have found before opening this. I'll try to gather up which cuSPARSE versions actually came with which CUDA releases and perhaps we can take it from there.

@etphipp
Contributor

etphipp commented Oct 23, 2023

That was done at least partially in this comment: trilinos/Trilinos#12238 (comment)
