Release 4.0.01 #1808

Merged
lucbv merged 26 commits into kokkos:master
May 1, 2023

ndellingwood
Contributor

Patch release

brian-kelley and others added 22 commits March 6, 2023 14:35
* Use the options ENABLE_PERFTESTS, ENABLE_EXAMPLES

The cmake options KokkosKernels_ENABLE_PERFTESTS and
KokkosKernels_ENABLE_EXAMPLES were not actually used,
both perf_test/ and example/ were always built as long
as KokkosKernels_ENABLE_ALL_COMPONENTS=ON.

This makes these options have an effect again. If perftests or examples
are enabled but ENABLE_ALL_COMPONENTS=OFF, print a message about why
they can't actually be enabled.

* From e10harvey: fix typo in perf_test cmake

* Add feedback about cmake

- Turn ENABLE_PERFTESTS off by default
- since both examples and perf tests are off by default,
  warn if those are ON but can't be enabled
  because ENABLE_ALL_COMPONENTS=OFF
- use ELSE to simplify logic where ENABLE_ALL_COMPONENTS=OFF

(cherry picked from commit 834a85e)
Introduce KOKKOSKERNELS_ALL_COMPONENTS_ENABLED variable

(cherry picked from commit 76968d3)
Kokkos Kernels version: need to use upper case variables

(cherry picked from commit d63de38)
CUSPARSE_MM_ALG_DEFAULT deprecated by cuSparse 11.1

(cherry picked from commit 4f39a18)
blas/blas1: Fix a couple documentation typos.

(cherry picked from commit 3a20643)
CUDA 11.4: fixing some failing builds while trying to reproduce issue kokkos#1725

(cherry picked from commit 26332ed)
Reduce BatchedGemm test coverage

(cherry picked from commit aec946c)
* Fix kk_generate_diagonally_dominant_sparse_matrix hang

Use bandwidth to cap the max entries per row, so that the row-filling
loop doesn't run forever looking for a column that isn't already
present.

* Diag-dominant matrix generator: error if bandwidth too small

If bandwidth is too small for the requested nnz and row_size_variance,
error out with a detailed message.

(cherry picked from commit 664bfc4)
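
The hang described above can be sketched in a few lines. This is a minimal illustration, not the Kokkos Kernels generator: `fill_row` and its parameters are hypothetical names, and the sketch assumes `row < ncols`. The point is that a loop searching for unused columns inside a band can only finish if the requested count does not exceed the number of columns the band actually contains.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <set>

// Hypothetical helper (not the Kokkos Kernels API): choose distinct
// off-diagonal columns for `row`, all within bandwidth / 2 of the diagonal.
// Assumes row < ncols.
std::set<std::size_t> fill_row(std::size_t row, std::size_t ncols,
                               std::size_t requested, std::size_t bandwidth,
                               std::mt19937& rng) {
  const std::size_t half = bandwidth / 2;
  const std::size_t lo = row > half ? row - half : 0;
  const std::size_t hi = std::min(ncols - 1, row + half);
  // Columns in [lo, hi], not counting the diagonal entry itself.
  const std::size_t available = hi - lo;
  // The cap: without it, requested > available would make the loop below
  // spin forever looking for a column that is not already present.
  const std::size_t target = std::min(requested, available);

  std::uniform_int_distribution<std::size_t> pick(lo, hi);
  std::set<std::size_t> cols;
  while (cols.size() < target) {
    const std::size_t c = pick(rng);
    if (c != row) cols.insert(c);  // duplicates are ignored by the set
  }
  return cols;
}
```

For row 0 of a 100-column matrix with bandwidth 6 and 50 requested entries, the band holds only columns 1, 2 and 3, so the capped loop returns exactly those three. The second commit above takes the stricter route of reporting an error when the bandwidth is too small, rather than silently capping.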
This was intended to be a temporary patch, but it will need to stay
until 4.1. This means it has to be included in 4.0.1.
MDF: Minor changes to interface for ifpack2 impl
(cherry picked from commit 30bd681)
Roc tpls upgrade

(cherry picked from commit e35ed21)
For BLAS routines producing a complex scalar result (like zdotc),
prefer to get the result via a pointer argument, rather than as a
direct return value. Directly returning a std::complex from an "extern
C" function is technically not allowed and Clang warns about it.

(cherry picked from commit 53599f4)
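
A minimal sketch of the pointer-argument pattern described above. `demo_zdotc_sub` and `demo_dotc` are hypothetical names written for this illustration; they are not BLAS or Kokkos Kernels symbols, and the loop merely stands in for the real routine.

```cpp
#include <complex>

// Hypothetical stand-in for a C-linkage BLAS-style routine. The complex
// result is written through `result`, so the function itself returns void
// and never returns a std::complex by value across the extern "C" boundary.
extern "C" void demo_zdotc_sub(std::complex<double>* result, int n,
                               const std::complex<double>* x,
                               const std::complex<double>* y) {
  std::complex<double> acc(0.0, 0.0);
  for (int i = 0; i < n; ++i) {
    acc += std::conj(x[i]) * y[i];  // conjugated dot product, as in zdotc
  }
  *result = acc;
}

// C++-side wrapper that gives callers ordinary value semantics again.
inline std::complex<double> demo_dotc(int n, const std::complex<double>* x,
                                      const std::complex<double>* y) {
  std::complex<double> r;
  demo_zdotc_sub(&r, n, x, y);
  return r;
}
```

Callers keep a natural interface through the wrapper, while only the `void` function crosses the C linkage boundary.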
Adds a better parilut test with gmres

(cherry picked from commit 747bb93)
Be careful about instantiating a View or other object with only an
execution space, as it might generate a memory type mismatch down the
road.

(cherry picked from commit 1ae5b7d)

 Conflicts:
	sparse/src/KokkosSparse_MatrixPrec.hpp

        Resolved conflict with variable naming "A" vs "_A" in spmv call
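
The execution-space pitfall mentioned in this commit can be modelled without Kokkos. Everything below is a toy stand-in written for illustration (`DeviceExec`, `ManagedMem`, `ToyView` and the rest are hypothetical types, not Kokkos ones); it assumes only that an execution space carries a default memory space, which need not be the one the data actually lives in.

```cpp
#include <type_traits>

// Toy stand-ins, NOT the Kokkos types.
struct DeviceMem {};
struct ManagedMem {};
struct DeviceExec {
  using memory_space = DeviceMem;  // the execution space's default
};

// Pairs an execution space with an explicitly chosen memory space.
template <class Exec, class Mem>
struct ToyDevice {
  using execution_space = Exec;
  using memory_space = Mem;
};

// A view-like type that takes its memory space from whatever `Space` reports.
template <class T, class Space>
struct ToyView {
  using memory_space = typename Space::memory_space;
};

// The matrix data lives in ManagedMem, not in the default DeviceMem.
using MatrixDevice = ToyDevice<DeviceExec, ManagedMem>;

// Built from the execution space only: silently falls back to DeviceMem.
using WorkFromExec = ToyView<double, DeviceExec>;
// Built from the full device type: agrees with the matrix.
using WorkFromDevice = ToyView<double, MatrixDevice>;

static_assert(!std::is_same<WorkFromExec::memory_space,
                            MatrixDevice::memory_space>::value,
              "execution space alone picks a different memory space");
static_assert(std::is_same<WorkFromDevice::memory_space,
                           MatrixDevice::memory_space>::value,
              "full device type keeps memory spaces consistent");
```

In this toy model the mismatch is visible at compile time; passing the full device type (or the memory space) through keeps the work arrays in the same space as the data they are combined with.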
* ParIlut: create and destroy spgemm handle for each usage

This fixes memory errors on Cuda

* Formatting
* Remove deterministic from par_ilut precond test

Now that spgemm memory errors have been fixed, it appears to work

* Add verbose mode to par_ilut

* Fixes for GPU

* Fix end_rel_res type to work when scalar is complex

* Turn off async fixed point on GPU

* Reorganize par_ilut handle, group conceptually similar members

* Refactor par_ilut deterministic setting

Change to async_update and move it to the handle. compute_l_u_factors
does not need to run in a serial execution space; that is overkill. Simply
turning async_updates off should allow for deterministic results. Now that
we are iterating more than once, it looks like even the hardcoded fixture
test works fine with async_updates on, since the multiple iterations correct
any "bad" results.

* Add comment for par_ilut_precond test settings

* Fix ordering warning

* par_ilut: add test for nrows=0

(cherry picked from commit aa96a83)

* par_ilut: make Ut_values view atomic in compute_l_u_factors

... to fix the race issues when async updates are on.

* With Ut atomic, no need to avoid async updates on GPU

* Remove unnecessary header

* Update comments

* Fixes for complex scalars

* Adjust async update views; default it to off

* Fix UtValuesSafeType

* Update sparse/impl/KokkosSparse_par_ilut_numeric_impl.hpp

* Remove UtViewType in favor of std::conditional

(cherry picked from commit 507c29f)
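
The idea of choosing an atomic or plain value type with `std::conditional` can be sketched as follows. This is an assumption-laden illustration, not the par_ilut code: `PlainCell`, `AtomicCell`, `CellType` and the boolean `AsyncUpdate` are hypothetical, and the condition used in the actual implementation is not stated in these commit messages.

```cpp
#include <atomic>
#include <type_traits>

// Hypothetical plain accumulator: fine when updates cannot race.
struct PlainCell {
  double v{0.0};
  void add(double x) { v += x; }
  double get() const { return v; }
};

// Hypothetical atomic accumulator: safe when several threads update it.
struct AtomicCell {
  std::atomic<double> v{0.0};
  void add(double x) {
    double old = v.load();
    // Retry until no other thread has changed the value in between;
    // on failure `old` is refreshed with the current value.
    while (!v.compare_exchange_weak(old, old + x)) {
    }
  }
  double get() const { return v.load(); }
};

// Select the value type at compile time instead of keeping a separate
// hand-written alias for each case.
template <bool AsyncUpdate>
using CellType =
    typename std::conditional<AsyncUpdate, AtomicCell, PlainCell>::type;
```

With this shape, code that performs the update calls `add` the same way in both configurations, and only the type selection decides whether the atomic cost is paid.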
Add and reorder parilut entries
Fix broken 4.0.0 changelog url
@ndellingwood
Contributor Author

Please hold merge until successful testing of trilinos/Trilinos#11817

lucbv and others added 2 commits April 25, 2023 20:29
GMRES: fixing some type issues related to memory space instantiation
(cherry picked from commit f41ff47)
Part of Kokkos C++ Performance Portability Programming EcoSystem 4.0
@ndellingwood
Contributor Author

Something got messed up in my initial cherry-pick of #1719, which was caught by the CI. Fixed with 85d97ef, and the snapshot to Trilinos has been updated.

Contributor

@lucbv lucbv left a comment

Thanks Nathan, this looks good to me

Contributor

@e10harvey e10harvey left a comment

Thanks, Nathan!

@ndellingwood
Contributor Author

One final update pushed with c208dac to modify the KokkosKernels_VERSION_PATCH numbering convention and keep it consistent with Kokkos_VERSION_PATCH (here, prefer to use 1 rather than 01).
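
As a side note on the `1` versus `01` convention: the comment above gives consistency with Kokkos_VERSION_PATCH as the reason, and nothing more. The sketch below only illustrates how a patch number typically ends up in a combined version number on the C++ side. The `DEMO_` macro names are hypothetical, and the encoding (major * 10000 + minor * 100 + patch) is an assumption of this sketch.

```cpp
// Hypothetical macros for illustration only.
#define DEMO_VERSION_MAJOR 4
#define DEMO_VERSION_MINOR 0
#define DEMO_VERSION_PATCH 1  // written as 1, not 01

#define DEMO_VERSION                                      \
  (DEMO_VERSION_MAJOR * 10000 + DEMO_VERSION_MINOR * 100 + \
   DEMO_VERSION_PATCH)

static_assert(DEMO_VERSION == 40001, "4.0.1 encodes as 40001");

// In C and C++ an integer literal with a leading zero is octal.
// 01 still equals 1, but the same spelling would turn 010 into 8.
static_assert(01 == 1, "octal 01 is decimal 1");
static_assert(010 == 8, "octal 010 is decimal 8");
```

Writing patch numbers without a leading zero sidesteps the octal reading if the value is ever used as a literal in a header.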

@ndellingwood
Contributor Author

I should have mentioned the snapshot to Trilinos is also updated with the corrected patch version convention

@ndellingwood
Contributor Author

trilinos/Trilinos#11817 has merged, no longer a blocker

@lucbv lucbv merged commit 1331baf into kokkos:master May 1, 2023
@lucbv
Contributor

lucbv commented May 1, 2023

Thanks Nathan, I am merging this now and creating the tag and release artifact as well.
Will do the Spack things a bit later

5 participants