Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fall back to ILU/IC sparselib algorithm with CPU executors #1783

Merged
merged 2 commits into from
Feb 19, 2025

Conversation

upsj
Copy link
Member

@upsj upsj commented Feb 16, 2025

There is no reason to make the distinction between algorithms when we only have a single one available. This issue was found when running the custom multigrid example with OpenMP

@upsj upsj added the 1:ST:ready-for-review This PR is ready for review label Feb 16, 2025
@upsj upsj requested a review from a team February 16, 2025 20:16
@upsj upsj self-assigned this Feb 16, 2025
@ginkgo-bot ginkgo-bot added mod:core This is related to the core module. type:factorization This is related to the Factorizations labels Feb 16, 2025
Copy link
Member

@pratikvn pratikvn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OMP tests need to be updated. Otherwise, LGTM

@yhmtsai
Copy link
Member

yhmtsai commented Feb 17, 2025

I think we have decided to stay the behavior which throw the error when OMP does not have syncfree algorithm. It is also based on we keep sparselib algorithm although it is not from sparse library in fact.
Refer to the discussion #1684 (comment)

@upsj
Copy link
Member Author

upsj commented Feb 17, 2025

The current default does not support OpenMP, so either we change the default, or we fix the breakage. With the CSR strategies, we fall back to a different algorithm if it's unsupported - why the difference here? Both compute the same thing, and have only runtime performance implications.

@yhmtsai
Copy link
Member

yhmtsai commented Feb 17, 2025

I had the same comments on that but it was still the decision we made.

@nbeams
Copy link
Collaborator

nbeams commented Feb 17, 2025

I actually think it would be better if syncfree were the default, because it supports OpenMP + CUDA/HIP, vs. sparselib which only supports CUDA/HIP. But that changes the default behavior for anyone who was already using CUDA/HIP, perhaps in a "silent" way that could be easy to miss, which I think was the main reason not to do it. The current setup has the same default behavior as pre-#1684 for all three affected backends (use cuSPARSE/hipSPARSE or throw a not implemented error for OpenMP), except with the new error message for OpenMP, letting the user know that another option is available now.

Copy link
Member

@MarcelKoch MarcelKoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the approach is reasonable. Another alternative which doesn't change the current default, would be to introduce a new enum value, probably default, which defers to sparselib on gpu and syncfree on CPU. But maybe that is a bit overkill.

@upsj upsj requested a review from pratikvn February 18, 2025 13:03
@upsj upsj force-pushed the ilu_ic_omp_algorithm branch from 7d536b1 to 2fb775a Compare February 18, 2025 13:04
@upsj upsj added 1:ST:ready-to-merge This PR is ready to merge. and removed 1:ST:ready-for-review This PR is ready for review labels Feb 19, 2025
upsj added 2 commits February 19, 2025 22:54
This issue was found when running the custom multigrid example with OpenMP
@upsj upsj force-pushed the ilu_ic_omp_algorithm branch from 2fb775a to 84fdb1c Compare February 19, 2025 21:54
@upsj upsj merged commit 9761a22 into develop Feb 19, 2025
10 of 11 checks passed
@upsj upsj deleted the ilu_ic_omp_algorithm branch February 19, 2025 23:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1:ST:ready-to-merge This PR is ready to merge. mod:core This is related to the core module. type:factorization This is related to the Factorizations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants