example/gmres: Switch scalar to bhalf_t #1300
Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request. |
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information, Test Names:
- KokkosKernels_PullRequest_Tpls_CUDA9_Tpls_CUDA10_Tpls_CUDA10_LayoutRight_GCC720_Light_GCC720_GCC740
- KokkosKernels_PullRequest_GCC720
- KokkosKernels_PullRequest_GCC720_Light_LayoutRight
- KokkosKernels_PullRequest_Tpls_GCC720
- KokkosKernels_PullRequest_Tpls_INTEL18
- KokkosKernels_PullRequest_CLANG1001
Pull Request Author: e10harvey |
Status Flag 'Pull Request AutoTester' - Jenkins Testing: 1 or more Jobs FAILED
Note: Testing will normally be attempted again in approx. 2 Hrs 30 Mins. If a change to the PR source branch occurs, the testing will be attempted again on next available autotester run.
Pull Request Auto Testing has FAILED
Build Information, Test Names:
- KokkosKernels_PullRequest_Tpls_CUDA9_Tpls_CUDA10_Tpls_CUDA10_LayoutRight_GCC720_Light_GCC720_GCC740 # 36
- KokkosKernels_PullRequest_GCC720 # 807
- KokkosKernels_PullRequest_GCC720_Light_LayoutRight # 454
- KokkosKernels_PullRequest_Tpls_GCC720 # 798
- KokkosKernels_PullRequest_Tpls_INTEL18 # 786
- KokkosKernels_PullRequest_CLANG1001 # 191
|
This is interesting! I am surprised at such good initial results from bfloat16. I will look this over in more detail. |
- Triple-checking: bhalf = bfloat16, yes?
- What system did you run this on? I checked the double precision numbers on Vortex with both CUDA 11.2.0 and CUDA 10.2.89. In both cases, the test_complex with ST=double converges in 652 iterations. I am wondering why your run converges in 506? That seems like too much of a change for simple rounding errors.
- Can we add some sort of option to those example files so that the user can pick a scalar type? I don't really want the double versions to be overwritten with bhalf.
Beyond that, everything looks great! :)
- Yes.
- Vortex. I used the .mtx from https://math.nist.gov/MatrixMarket/data/Harwell-Boeing/acoust/young1c.html. Could that be why?
- Shall we leave them as double, like we did when testing half? If so, I can update the PR. |
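For reference, here is a minimal sketch of what a user-selectable scalar type could look like, as opposed to editing the `ST` typedef by hand. This is not the code in this PR: `run_gmres_example` and `dispatch_scalar` are hypothetical names, and the body only reports `sizeof(ST)` where a real example would build the matrix and call the solver. The default stays `double`, which matches the outcome agreed on in this thread.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the body of a GMRES example, templated on the
// scalar type ST instead of relying on a hard-coded typedef.
template <typename ST>
std::size_t run_gmres_example() {
  // A real example would load the .mtx file and run the solver here.
  // This sketch only returns the storage size of the chosen scalar type.
  return sizeof(ST);
}

// Maps a user-supplied name (for example from a command-line argument) to a
// template instantiation. The default is double, so existing behavior is
// preserved when no option is given.
inline std::size_t dispatch_scalar(const std::string& name = "double") {
  if (name == "double") return run_gmres_example<double>();
  if (name == "float") return run_gmres_example<float>();
  // A reduced-precision type such as bhalf would be one more branch here.
  throw std::invalid_argument("unsupported scalar type: " + name);
}
```

A caller would pass the option string through, e.g. `dispatch_scalar("float")`, and leave the argument off to get the double version.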
About young1c.mtx: Weird, you are right! I downloaded the file from NIST and now mine is also using 506 iterations. I don't understand why this is different from the file I pulled via ssget (a Ginkgo utility that connects to SuiteSparse). Yes, if you change the default back to double, I will approve the PR. |
As a side note: I've figured out the young1c issue. Apparently there was confusion in the original Harwell-Boeing matrix collection about this matrix's data. The NIST website interprets it as symmetric and ignores some of the data, while SuiteSparse uses all of the data and interprets the matrix as nonsymmetric. [This is all documented in the note at the top here: nist-young1c and the note at the bottom here: suitesparse-young1c ] So that seems to be why the matrices are different. Very unusual! |
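To illustrate how the same stored entries can yield two different matrices, here is a small sketch. It is only an assumption about the general mechanism, not a description of how either collection actually parses the file: `Entry`, `read_general`, and `read_symmetric` are hypothetical names, and the symmetric reading shown here trusts the lower triangle and drops any upper-triangle entries.

```cpp
#include <vector>

// One coordinate-format entry: (row, col, value).
struct Entry {
  int row;
  int col;
  double val;
};

// General (nonsymmetric) reading: every stored entry is used as-is.
inline std::vector<Entry> read_general(const std::vector<Entry>& stored) {
  return stored;
}

// Symmetric reading (assumed behavior for illustration): keep only entries
// on or below the diagonal and mirror the off-diagonal ones. Stored
// upper-triangle entries are ignored.
inline std::vector<Entry> read_symmetric(const std::vector<Entry>& stored) {
  std::vector<Entry> out;
  for (const Entry& e : stored) {
    if (e.row < e.col) continue;  // upper-triangle data is dropped
    out.push_back(e);
    if (e.row != e.col) out.push_back({e.col, e.row, e.val});
  }
  return out;
}
```

With stored entries (0,0)=1, (1,0)=2, (0,1)=5, the general reading keeps (0,1)=5 while the symmetric reading replaces it with the mirrored value 2, so a solver sees a different operator and can take a different number of iterations.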
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED
Pull Request Auto Testing has PASSED
Build Information, Test Names:
- KokkosKernels_PullRequest_Tpls_CUDA9_Tpls_CUDA10_Tpls_CUDA10_LayoutRight_GCC720_Light_GCC720_GCC740
- KokkosKernels_PullRequest_GCC720
- KokkosKernels_PullRequest_GCC720_Light_LayoutRight
- KokkosKernels_PullRequest_Tpls_GCC720
- KokkosKernels_PullRequest_Tpls_INTEL18
- KokkosKernels_PullRequest_CLANG1001
|
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jennloe ]! |
Status Flag 'Pull Request AutoTester' - Pull Request MUST BE MERGED MANUALLY BY Project Team - This Repo does not support Automerge |
Most of the diff in this PR is due to running clang-format. I only changed the `ST` typedefs from `double` to `bhalf` and added the `fabs` wrappers for `bhalf`.
I used CUDA 11.2.0 for these tests with the .mtx files that are hard coded in these gmres example files.
Note that bhalf falls back to float prior to CUDA 11, since bhalf support was first added in CUDA 11.
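As a rough illustration of the two points above (a `fabs` wrapper for a reduced-precision type, and a fallback to `float` when the type is unavailable), here is a self-contained sketch. It does not use the Kokkos types from this PR: `soft_bhalf` is a hypothetical software bfloat16 that keeps the top 16 bits of a float by truncation, `EXAMPLE_NO_BHALF_SUPPORT` is a made-up macro, and the wrapper simply forwards to `std::fabs` on `float`.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical software bfloat16: stores the upper 16 bits of an IEEE-754
// float (1 sign bit, 8 exponent bits, 7 mantissa bits). Truncation is used
// for brevity; real implementations typically round.
struct soft_bhalf {
  std::uint16_t bits;

  soft_bhalf() : bits(0) {}

  explicit soft_bhalf(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    bits = static_cast<std::uint16_t>(u >> 16);
  }

  explicit operator float() const {
    std::uint32_t u = static_cast<std::uint32_t>(bits) << 16;
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
  }
};

// fabs wrapper: widen to float, take the absolute value, narrow back.
inline soft_bhalf fabs(soft_bhalf x) {
  return soft_bhalf(std::fabs(static_cast<float>(x)));
}

// Fallback pattern: when the reduced-precision type is not available
// (signalled here by a hypothetical macro), use float instead.
#if defined(EXAMPLE_NO_BHALF_SUPPORT)
using example_scalar = float;
#else
using example_scalar = soft_bhalf;
#endif
```

With only 7 mantissa bits, a value such as 1 + 2^-8 is stored as exactly 1 in this sketch, which gives a sense of how coarse the format is compared with double.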
Real numbers - Convergence with double
Real numbers - Convergence with bhalf
Complex numbers - Convergence with double
Complex numbers - Convergence with bhalf
Fixes #1250.