Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Work around team size AUTO issue on kepler #1020

Merged
merged 1 commit into from
Jun 15, 2021

Conversation

brian-kelley
Copy link
Contributor

in Gauss-Seidel long rows. This bug originally appeared in Trilinos nightly testing here: trilinos/Trilinos#9272
This PR just copies over changes from the Trilinos fix, which has more details about the issue: trilinos/Trilinos#9278

@brian-kelley brian-kelley requested a review from srajama1 June 11, 2021 19:39
@brian-kelley brian-kelley self-assigned this Jun 11, 2021
@kokkos-devops-admin
Copy link

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: KokkosKernels_PullRequest_GCC720

  • Build Num: 304
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720

  • Build Num: 296
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_INTEL18

  • Build Num: 282
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC720_Light

  • Build Num: 338
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight

  • Build Num: 115
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA10

  • Build Num: 303
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA10_LayoutRight

  • Build Num: 114
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA9

  • Build Num: 298
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720_GCC740

  • Build Num: 296
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Using Repos:

Repo: KOKKOSKERNELS (brian-kelley/kokkos-kernels)
  • Branch: FixLongRowTeamSize
  • SHA: e792384
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: KokkosKernels_PullRequest_GCC720

  • Build Num: 304
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720

  • Build Num: 296
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_INTEL18

  • Build Num: 282
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC720_Light

  • Build Num: 338
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_GCC720_Light_LayoutRight

  • Build Num: 115
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA10

  • Build Num: 303
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA10_LayoutRight

  • Build Num: 114
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_CUDA9

  • Build Num: 298
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

Build Information

Test Name: KokkosKernels_PullRequest_Tpls_GCC720_GCC740

  • Build Num: 296
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
KOKKOSKERNELS_SOURCE_BRANCH FixLongRowTeamSize
KOKKOSKERNELS_SOURCE_REPO https://github.com/brian-kelley/kokkos-kernels
KOKKOSKERNELS_SOURCE_SHA e792384
KOKKOSKERNELS_TARGET_BRANCH develop
KOKKOSKERNELS_TARGET_REPO https://github.com/kokkos/kokkos-kernels
KOKKOSKERNELS_TARGET_SHA 0ed5962
PR_LABELS
PULLREQUESTNUM 1020
TEST_REPO_ALIAS KOKKOSKERNELS

@kokkos-devops-admin
Copy link

Status Flag 'Pre-Merge Inspection' - - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
NO REVIEWS HAVE BEEN PERFORMED ON THIS PULL REQUEST!

@kokkos-devops-admin
Copy link

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

1 similar comment
@kokkos-devops-admin
Copy link

All Jobs Finished; status = PASSED, However Inspection must be performed before merge can occur...

Copy link
Contributor

@lucbv lucbv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it makes sense to by-pass Kokkos::AUTO in this case based on failures in the nightly tests. It's probably always a slightly safer option in general?

@kokkos-devops-admin
Copy link

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ lucbv ]!

@kokkos-devops-admin
Copy link

Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge

@brian-kelley brian-kelley merged commit 116df37 into kokkos:develop Jun 15, 2021
@brian-kelley
Copy link
Contributor Author

@lucbv I don't know that team_size_recommended is safe in general, this was the first time I've seen any issues with AUTO and it's for a very specific combination of kernel (SortIntoLongRowsFunctor), configuration (debug with bounds checking) and compute capability (Kepler37).

Still, I'm surprised that the two functions don't always produce the same team size.

@brian-kelley brian-kelley deleted the FixLongRowTeamSize branch June 15, 2021 17:06
@lucbv
Copy link
Contributor

lucbv commented Jun 15, 2021

I guess having the actual functor is what makes the different, I am guessing that they are trying to query it to estimate how it will consume resources?
We could ask the Kokkos team to give us some explanations?

@brian-kelley
Copy link
Contributor Author

brian-kelley commented Jun 15, 2021

@lucbv The functor instance is available to both ways (for AUTO, it's passed into parallel_reduce, and then it's also passed into team_size_recommended). I was thinking that maybe this big functor uses a specific number of registers that hits a corner case in AUTO (I just suggested this on slack). But this is just a guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants