Panzer: Fix PanzerDofMgr_tFilteredUGI_MPI_2 test for CUDA_LAUNCH_BLOCKING=0 #7864
Status Flag 'Pre-Test Inspection' - This Pull Request Requires Inspection... The code must be inspected by a member of the Team before Testing/Merging
Status Flag 'Pre-Test Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ rppawlo ]!
Status Flag 'Pull Request AutoTester' - Testing Jenkins Projects: Pull Request Auto Testing STARTING
Build Information:
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0_debug
Test Name: Trilinos_pullrequest_gcc_8.3.0
Test Name: Trilinos_pullrequest_cuda_9.2
Test Name: Trilinos_pullrequest_clang_10.0.0
Test Name: Trilinos_pullrequest_python_2
Test Name: Trilinos_pullrequest_python_3
Pull Request Author: MicheldeMessieres
Status Flag 'Pull Request AutoTester' - Jenkins Testing: all Jobs PASSED. Pull Request Auto Testing has PASSED
Build Information:
Test Name: Trilinos_pullrequest_gcc_4.8.4
Test Name: Trilinos_pullrequest_intel_17.0.1
Test Name: Trilinos_pullrequest_gcc_4.9.3_SERIAL
Test Name: Trilinos_pullrequest_gcc_7.2.0_debug
Test Name: Trilinos_pullrequest_gcc_8.3.0
Test Name: Trilinos_pullrequest_cuda_9.2
Test Name: Trilinos_pullrequest_clang_10.0.0
Test Name: Trilinos_pullrequest_python_2
Test Name: Trilinos_pullrequest_python_3
Status Flag 'Pre-Merge Inspection' - SUCCESS: The last commit to this Pull Request has been INSPECTED AND APPROVED by [ rppawlo ]!
Status Flag 'Pull Request AutoTester' - AutoMerge IS ENABLED, but the Label AT: AUTOMERGE is not set. Either set Label AT: AUTOMERGE or manually merge the PR...
By default, putScalar writes to device, while replaceGlobalValue always
writes to host. Calling sync_host after putScalar makes sure that, with UVM,
the device write is synced to host before replaceGlobalValue runs.
Because replaceGlobalValue does not call modify_host, I think we need to
call it afterwards. That is not necessary with UVM, but if the device memory
space later becomes CudaSpace, skipping the call means the export will be
confused about where the data is and will try to pull it from device.
An alternative is to call modify_host before the putScalar calls.
putScalar would then write to host, and we could skip the modify_host
calls after replaceGlobalValue. That might be confusing, as I'm not
sure calling modify_host to force putScalar onto host is a normal pattern.
I also think it would not be good for putScalar performance.
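
To make the ordering argument concrete, here is a minimal stand-alone sketch. It is not Tpetra or Kokkos code: `ToyDualVector`, `Newest`, `put_scalar`, `replace_value`, and `export_source` are hypothetical names invented for this illustration, and the model assumes separate host and device copies (the CudaSpace case mentioned above), not UVM. It does not model asynchronous kernel launches, so it only illustrates the bookkeeping side of the fix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Which copy holds the most recent data in this toy model.
enum class Newest { InSync, Host, Device };

// Hypothetical stand-in for a vector with separate host and device copies.
// This is an illustration only, not the Tpetra::MultiVector interface.
struct ToyDualVector {
  std::vector<double> host, device;
  Newest newest = Newest::InSync;

  explicit ToyDualVector(std::size_t n) : host(n, 0.0), device(n, 0.0) {}

  // Declare that the host copy is (or is about to become) the newest.
  void modify_host() { newest = Newest::Host; }

  // Copy device data to host if the device copy is newer.
  void sync_host() {
    if (newest == Newest::Device) {
      host = device;
      newest = Newest::InSync;
    }
  }

  // Fill: writes to device unless the host copy is flagged as newest.
  void put_scalar(double v) {
    if (newest == Newest::Host) {
      std::fill(host.begin(), host.end(), v);
    } else {
      std::fill(device.begin(), device.end(), v);
      newest = Newest::Device;
    }
  }

  // Single-entry write: always touches host and never updates the flag.
  void replace_value(std::size_t i, double v) { host[i] = v; }

  // What an export would read: host only if host is flagged as newest.
  std::vector<double> export_source() const {
    return newest == Newest::Host ? host : device;
  }
};

// Ordering described in this PR: sync after the fill, flag after the write.
std::vector<double> pr_sequence() {
  ToyDualVector v(3);
  v.put_scalar(-1.0);
  v.sync_host();
  v.replace_value(1, 5.0);
  v.modify_host();
  return v.export_source();
}

// Same as above but without modify_host: the export reads the device copy.
std::vector<double> missing_modify_host() {
  ToyDualVector v(3);
  v.put_scalar(-1.0);
  v.sync_host();
  v.replace_value(1, 5.0);
  return v.export_source();
}

// Alternative: flag host first so the fill itself runs on host.
std::vector<double> alternative_sequence() {
  ToyDualVector v(3);
  v.modify_host();
  v.put_scalar(-1.0);
  v.replace_value(1, 5.0);
  return v.export_source();
}
```

In this model, `pr_sequence()` and `alternative_sequence()` both hand the export the host data containing the replaced entry, while `missing_modify_host()` hands it the device copy, which never saw the host-side write.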
@trilinos/panzer
Motivation
Fix Panzer tests to run with CUDA_LAUNCH_BLOCKING=0
Testing
Ran PanzerDofMgr_tFilteredUGI_MPI_2 with a CUDA build on white