Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use CRS matrix sort, instead of Kokkos::sort on each row #1553

Merged
merged 1 commit into from
Sep 28, 2022

Conversation

brian-kelley
Copy link
Contributor

@brian-kelley brian-kelley commented Sep 28, 2022

spiluk_symbolic needs to sort the rows of the L and U graphs. Instead of using Kokkos::sort on each row (since this is meant for device-wide sorts of large arrays), use the KokkosSparse::sort_crs_graph function executing on host.

This fixes trilinos/Trilinos#11041. The issue appeared to come from Kokkos because the default behavior of Kokkos::sort changed recently. In 3.6.1, given a UVM space view it dispatched to std::sort because it's host-accessible. In 3.7, In was dispatching to Kokkos bin sort on the GPU. This is definitely the way it should be in general given how Kokkos::sort is intended to be used, but it made this case of sorting many small arrays a lot slower. Especially when those small arrays live in UVM that's resident in host memory.

For biggest local matrix on abnormal energy surrogate problem (the one from Trilinos issue), here are the timings just for sorting L and U a few different ways:

Kokkos::sort each row, with force kokkos=true (current behavior): 169.452
My radix sort_crs_matrix, all on Serial/host (this PR): 3.42121
My bitonic sort_crs_matrix on V100, but not including the transfer back to host: 3.3849
std::sort each row in a normal for loop: 1.92318

This data suggests a follow-up change to sort_crs_graph, that uses std::sort instead of my radix sort where possible. That can come later though - this PR fixes the big slowdown.

Copy link
Contributor

@lucbv lucbv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me

@kokkos-devops-admin
Copy link

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

Copy link
Contributor

@vqd8a vqd8a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brian-kelley Looks good to me. Thanks for fixing this.

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930

  • Build Num: 69
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight

  • Build Num: 78
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC1020

  • Build Num: 31
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight

  • Build Num: 30
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720

  • Build Num: 129
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_INTEL19

  • Build Num: 80
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CLANG1001

  • Build Num: 129
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CLANG13CUDA10

  • Build Num: 21
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Using Repos:

Repo: KOKKOSKERNELS (brian-kelley/kokkos-kernels)
  • Branch: FixTrilinos11041
  • SHA: 79c95ee
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: KokkosKernels_PullRequest_GCC930_Light_Tpls_GCC930

  • Build Num: 69
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CUDA11_CUDA11_LayoutRight

  • Build Num: 78
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC1020

  • Build Num: 31
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC1020_Light_LayoutRight

  • Build Num: 30
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720

  • Build Num: 129
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_INTEL19

  • Build Num: 80
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CLANG1001

  • Build Num: 129
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_CLANG13CUDA10

  • Build Num: 21
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixTrilinos11041
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA 79c95ee
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 70b4605
PR_LABELS performance
PULLREQUESTNUM 1553
TEST_REPO_ALIAS KOKKOSKERNELS

@kokkos-devops-admin
Copy link

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ vqd8a lucbv ]!

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge

@brian-kelley brian-kelley merged commit 191dfa3 into kokkos:develop Sep 28, 2022
@brian-kelley brian-kelley deleted the FixTrilinos11041 branch September 28, 2022 21:17
jgfouca added a commit to jgfouca/kokkos-kernels that referenced this pull request Oct 18, 2022
* origin/develop: (177 commits)
  cm_test_all_sandia: fix base module list
  cm_test_all_sandia: updates and cleanup
  Remove src subdir
  TeamGemvInternal: Update headers and namespace in KokkosBatched_Gemv_Decl.hpp
  apply clang-format
  change to device_type
  scripts/cm_test_all_sandia: Update openblas tpl
  sparse/unit_test: Disable spmv_mv_heavy for all A64FX builds
  Use CRS matrix sort, instead of Kokkos::sort on each row (kokkos#1553)
  Avoid the SIMD code branch if the batched size is not a multiple of the vector length
  SYCL: Fix linking with ze_loader in Trilinos
  fix the failing tests
  cm_test_all_sandia: testing updates
  Fix warnings
  Use static_cast<double> instead of 1. *
  Update the batched sparse kernels to add SIMD data types and options for the temporary variable allocations.
  Update default argument ROCm version 4.5 -> 5.2 in Docker image
  Fix HIP nightly build to satisfy new Kokkos minimum requirements
  blas:
  apply clang-format-8
  ...
ndellingwood pushed a commit to ndellingwood/kokkos-kernels that referenced this pull request Jan 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants