Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ifpack2: fix timers #8478

Merged
merged 1 commit into from
Dec 15, 2020

Conversation

brian-kelley
Copy link
Contributor

  • Don't do e.g. "ApplyTime_ += timer->totalElapsedTime()" since this
    will re-count time from all previous applies.
  • Replace totalElapsedTime() with a wallTime() interval for this
    instead, since if two preconditioner objects A and B are used in
    the same StackedTimer scope, B's times will include A's
    times.

Here's an example output from before these changes, showing both issues. I looped over a setup+solve 3 times in the Ifpack2+Belos driver:

Every proc reading parameters from xml_file: test_MTGS_calore1_mm.xml
Matrix Market file for sparse matrix A: /ascldap/users/bmkelle/MatrixCollection/bodyy5.mtx
Configuring, initializing, and computing Ifpack2 preconditioner
 Time Init: 0.0451295
 Finished computing Ifpack2 preconditioner
  Time (s): 0.00254004
Converged in 196 iterations.
Preconditioner attributes:
 "Ifpack2::Relaxation":
  MatrixType: "Tpetra::RowMatrix<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >"
  Label: Ifpack2::Relaxation
  Initialized: true
  Computed: true
  Parameters: 
...
  Call counts and total times (in seconds): 
   initialize: 
    Call count: 1
    Total time: 0.0445694
   compute: 
    Call count: 1
    Total time: 0.00245176
   apply: 
    Call count: 197
    Total time: 45.5491             <---------- this time is obviously too high, the whole solve < 2 seconds
2-Norm of 0th residual vec: 0
Achieved tolerance: 9.90648e-09
Matrix Market file for sparse matrix A: /ascldap/users/bmkelle/MatrixCollection/bodyy5.mtx
Configuring, initializing, and computing Ifpack2 preconditioner
 Time Init: 0.0410482
 Finished computing Ifpack2 preconditioner
  Time (s): 0.00175627
Converged in 197 iterations.
Preconditioner attributes:
 "Ifpack2::Relaxation":
  MatrixType: "Tpetra::RowMatrix<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >"
  Label: Ifpack2::Relaxation
  Initialized: true
  Computed: true
  Parameters: 
...
  Call counts and total times (in seconds): 
   initialize: 
    Call count: 1
    Total time: 0.0856112    <----- this is twice the initialize time from the 1st run, because it's including that timer
   compute: 
    Call count: 1
    Total time: 0.00420082
   apply: 
    Call count: 198
    Total time: 137.697
2-Norm of 0th residual vec: 0
Achieved tolerance: 9.36909e-09
Matrix Market file for sparse matrix A: /ascldap/users/bmkelle/MatrixCollection/bodyy5.mtx
Configuring, initializing, and computing Ifpack2 preconditioner
 Time Init: 0.0410016
 Finished computing Ifpack2 preconditioner
  Time (s): 0.00189405
Converged in 198 iterations.
Preconditioner attributes:
 "Ifpack2::Relaxation":
  MatrixType: "Tpetra::RowMatrix<double, int, long long, Kokkos::Compat::KokkosDeviceWrapperNode<Kokkos::Serial, Kokkos::HostSpace> >"
  Label: Ifpack2::Relaxation
  Initialized: true
  Computed: true
  Parameters: 
...
  Call counts and total times (in seconds): 
   initialize: 
    Call count: 1
    Total time: 0.126603 <------ and this is 3x the first (correct) init time
   compute: 
    Call count: 1
    Total time: 0.00608721
   apply: 
    Call count: 199
    Total time: 231.351
2-Norm of 0th residual vec: 0
Achieved tolerance: 9.94677e-09
proc 0 total program time: 7.25437
End Result: TEST PASSED

Here is what StackedTimer shows (this always had the correct call counts and total times). With this PR, the times reported by the Preconditioner above are consistent with the StackedTimer.

Ifpack2+Belos: total: 7.30449 [1]
|   init: 0.126711 [3]
|   |   Ifpack2::Relaxation::initialize: 0.126314 [3]
|   |   Remainder: 0.000397178
|   precond: 0.0060841 [3]
|   |   Ifpack2::Relaxation::compute: 0.00603574 [3]
|   |   Remainder: 4.836e-05
|   Belos: Operation Op*x: 0.0138657 [6]
|   Belos: Operation Prec*x: 0.00765458 [3]
|   |   Ifpack2::Relaxation::apply: 0.00762528 [3]
|   |   Remainder: 2.93e-05
|   Belos: BlockGmresSolMgr total solve time: 6.6073 [3]
|   |   Belos: ICGS[2]: Orthogonalization: 3.72666 [594]
|   |   |   Belos: ICGS[2]: Ortho (Norm): 0.0673645 [594]
|   |   |   Belos: ICGS[2]: Ortho (Inner Product): 2.20892 [1182]
|   |   |   Belos: ICGS[2]: Ortho (Update): 1.38941 [1182]
|   |   |   Remainder: 0.0609764
|   |   Belos: Operation Op*x: 1.38112 [610]
|   |   Belos: Operation Prec*x: 1.39858 [591]
|   |   |   Ifpack2::Relaxation::apply: 1.395 [591]
|   |   |   Remainder: 0.00357807
|   |   Remainder: 0.100944
|   Remainder: 0.542874

@trilinos/ifpack2

Motivation

Makes all Ifpack2::Preconditioner compute InitTime_, ComputeTime_, ApplyTime_ correctly.

Stakeholder Feedback

Testing

Did the setup+solve 3 times to demonstrate the incorrect reported times, and that this fixed it.

- Don't do e.g. "ApplyTime_ += timer->totalElapsedTime()" since this
  will re-count time from all previous applies.
- Replace totalElapsedTime() with a wallTime() interval for this
  instead, since if two preconditioner objects A and B are used in
  the same StackedTimer scope, B's times will include A's
  times.
@brian-kelley brian-kelley added the type: bug The primary issue is a bug in Trilinos code or tests label Dec 15, 2020
@brian-kelley brian-kelley self-assigned this Dec 15, 2020
@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Test Inspection' - Auto Inspected - Inspection Is Not Necessary for this Pull Request.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects:

Pull Request Auto Testing STARTING (click to expand)

Build Information

Test Name: Trilinos_pullrequest_gcc_8.3.0

  • Build Num: 3027
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0_serial

  • Build Num: 660
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0_debug

  • Build Num: 1153
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 8512
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 6228
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_clang_10.0.0

  • Build Num: 1362
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_python_3

  • Build Num: 4117
  • Status: STARTED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Using Repos:

Repo: TRILINOS (brian-kelley/Trilinos)
  • Branch: FixIfpack2Timers
  • SHA: f343ea5
  • Mode: TEST_REPO

Pull Request Author: brian-kelley

Copy link
Member

@jhux2 jhux2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brian-kelley Looks good to me. Your change is also makes it easier (for me) to understand what's being timed. Nice catch.

@brian-kelley brian-kelley added the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Dec 15, 2020
@brian-kelley
Copy link
Contributor Author

@jhux2 Thanks for reviewing.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED

Pull Request Auto Testing has PASSED (click to expand)

Build Information

Test Name: Trilinos_pullrequest_gcc_8.3.0

  • Build Num: 3027
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0_serial

  • Build Num: 660
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_gcc_7.2.0_debug

  • Build Num: 1153
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_intel_17.0.1

  • Build Num: 8512
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_cuda_9.2

  • Build Num: 6228
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_clang_10.0.0

  • Build Num: 1362
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed

Build Information

Test Name: Trilinos_pullrequest_python_3

  • Build Num: 4117
  • Status: PASSED

Jenkins Parameters

Parameter Name Value
PR_LABELS type: bug
PULLREQUESTNUM 8478
TEST_REPO_ALIAS TRILINOS
TRILINOS_SOURCE_BRANCH FixIfpack2Timers
TRILINOS_SOURCE_REPO https://github.com/brian-kelley/Trilinos
TRILINOS_SOURCE_SHA f343ea5
TRILINOS_TARGET_BRANCH develop
TRILINOS_TARGET_REPO https://github.com/trilinos/Trilinos
TRILINOS_TARGET_SHA a5b71ed


CDash Test Results for PR# 8478.

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ jhux2 ]!

@trilinos-autotester
Copy link
Contributor

Status Flag 'Pull Request AutoTester' - Pull Request will be Automerged

@trilinos-autotester trilinos-autotester merged commit 8785e8c into trilinos:develop Dec 15, 2020
@trilinos-autotester
Copy link
Contributor

Merge on Pull Request# 8478: IS A SUCCESS - Pull Request successfully merged

@trilinos-autotester trilinos-autotester removed the AT: AUTOMERGE Causes the PR autotester to automatically merge the PR branch once approvals are completed label Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug The primary issue is a bug in Trilinos code or tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants