Update ECAL and HCAL reconstruction to run on multiple GPUs #502
… is relinquishable
RecoLocalCalo/EcalRecProducers/plugins/AmplitudeComputationCommonKernels.cu
// input cpu data
ecal::raw::InputDataCPU inputCPU = {
cms::cuda::make_host_unique<unsigned char[]>(
Comment for the future (after CMSSW 11.1.0 / CUDA 11.0 / C++17): would it make sense to use std::byte instead of unsigned char?
I would say we should be consistent with what sits inside of the FEDRawData... although we do make a copy before sending to the device...
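To make the trade-off in this exchange concrete, here is a minimal C++17 sketch of what a std::byte buffer would look like next to the existing unsigned char payload. The helper names copyToByteBuffer and firstByteAsInt are hypothetical and are not part of CMSSW or of this PR; plain std::make_unique stands in for cms::cuda::make_host_unique, whose behaviour with std::byte is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical helper (not a CMSSW function): copy `size` bytes of a raw
// unsigned char payload into a freshly allocated std::byte buffer.
std::unique_ptr<std::byte[]> copyToByteBuffer(const unsigned char* src, std::size_t size) {
  auto dst = std::make_unique<std::byte[]>(size);
  if (size > 0) {
    std::memcpy(dst.get(), src, size);
  }
  return dst;
}

// std::byte has no arithmetic operators, so reading a value back
// requires an explicit conversion with std::to_integer.
unsigned int firstByteAsInt(const std::byte* data) {
  return std::to_integer<unsigned int>(data[0]);
}
```

The sketch shows the cost mentioned in the reply: if the source data stays unsigned char, every boundary needs a copy or an explicit conversion, so std::byte mainly pays off when the whole chain uses it.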
Sorry for taking long to go through this, my plan is to
I am also using this PR as an excuse to add the HCAL-only workflows to the validation, but if that takes too long I'll just go ahead and merge it.
@fwyzard np. I tested this on cmg-gpu1080...
I haven't gotten to the HCAL workflow yet - but this PR already breaks the ECAL-only workflow, and the HLT customisation. Looks like the reasons are:
I can try to add them ...
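What is the configuration for EcalRecHitParametersGPUESProducer supposed to look like?
The autogenerated cfi is
import FWCore.ParameterSet.Config as cms
ecalRecHitParametersGPUESProducer = cms.ESSource('EcalRecHitParametersGPUESProducer',
    ChannelStatusToBeExcluded = cms.VPSet(
        cms.PSet()
    ),
    appendToDataLabel = cms.string('')
)
but using it fails with
----- Begin Fatal Exception 08-Jul-2020 19:06:29 CEST-----------------------
An exception of category 'Configuration' occurred while
[0] Constructing the EventProcessor
[1] Validating configuration of ESProducer or ESSource of type EcalRecHitParametersGPUESProducer with label: 'ecalRecHitParametersGPUESProducer'
Exception Message:
Missing required parameter. It should have label "kDAC" and have type "tracked string".
The description has no default. The parameter must be defined in the configuration
----- End Fatal Exception -------------------------------------------------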
The autogenerated cfi is not self-sufficient... I indicated this in the PR description. Andrea should provide input here.
…On Wed, 8 Jul 2020 at 19:17, Andrea Bocci ***@***.***> wrote:
What is the configuration for EcalRecHitParametersGPUESProducer supposed to look like?
The autogenerated cfi is
import FWCore.ParameterSet.Config as cms
ecalRecHitParametersGPUESProducer = cms.ESSource('EcalRecHitParametersGPUESProducer',
    ChannelStatusToBeExcluded = cms.VPSet(
        cms.PSet()
    ),
    appendToDataLabel = cms.string('')
)
but using it fails with
----- Begin Fatal Exception 08-Jul-2020 19:06:29 CEST-----------------------
An exception of category 'Configuration' occurred while
[0] Constructing the EventProcessor
[1] Validating configuration of ESProducer or ESSource of type EcalRecHitParametersGPUESProducer with label: 'ecalRecHitParametersGPUESProducer'
Exception Message:
Missing required parameter. It should have label "kDAC" and have type "tracked string".
The description has no default. The parameter must be defined in the configuration
----- End Fatal Exception -------------------------------------------------
I see - I hadn't understood that the comment was about the EDProducer/ESProducer split, thanks for the clarification.
ok, I think that you need to use this one: https://github.com/cms-patatrack/cmssw/blob/CMSSW_11_1_X_Patatrack/RecoLocalCalo/EcalRecProducers/python/ecalRecHitGPU_cfi.py but again the EDProducer itself does not really have self-sufficient
Thanks @vkhristenko ! I think that with this I tested with the usual test configuration under RecoLocalCalo/EcalRecProducers/test/testEcalRechitProducer_cfg.py |
Use caching allocators for host and device CUDA memory. Use dedicated ESProducers to make part of the modules' configuration available on all GPUs. Rename hcal and hcal::common namespaces to calo::common.
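The commit message mentions caching allocators without describing them. As a rough illustration of the general idea only, the sketch below keeps released blocks in a free list and hands them back on later requests instead of calling the underlying allocator again. It is not the CMSSW allocator: the class name CachingAllocatorSketch is hypothetical, std::malloc and std::free stand in for the CUDA host and device allocation calls, and per-device or per-stream bookkeeping is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <mutex>

// Hypothetical illustration of a caching allocator (not the CMSSW class).
// std::malloc/std::free are stand-ins for the real CUDA allocation calls.
class CachingAllocatorSketch {
public:
  ~CachingAllocatorSketch() {
    // Only cached (already returned) blocks are released here;
    // callers are expected to deallocate() everything they allocated.
    for (auto& entry : free_) {
      std::free(entry.second);
    }
  }

  void* allocate(std::size_t bytes) {
    std::lock_guard<std::mutex> lock(mutex_);
    // Reuse the smallest cached block that is large enough, if any.
    auto it = free_.lower_bound(bytes);
    if (it != free_.end()) {
      void* ptr = it->second;
      live_[ptr] = it->first;  // remember the true block size
      free_.erase(it);
      ++hits_;
      return ptr;
    }
    // Cache miss: fall back to the underlying (expensive) allocation.
    void* ptr = std::malloc(bytes);
    live_[ptr] = bytes;
    return ptr;
  }

  void deallocate(void* ptr) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = live_.find(ptr);
    if (it == live_.end()) {
      return;  // unknown pointer: ignore in this sketch
    }
    // Keep the block for later reuse instead of freeing it.
    free_.emplace(it->second, ptr);
    live_.erase(it);
  }

  std::size_t cacheHits() const { return hits_; }

private:
  std::mutex mutex_;
  std::multimap<std::size_t, void*> free_;  // block size -> cached block
  std::map<void*, std::size_t> live_;       // block in use -> block size
  std::size_t hits_ = 0;
};
```

The motivation for this pattern on GPUs is that raw device and pinned-host allocations are comparatively slow and can synchronise the device, so reusing blocks across events avoids paying that cost repeatedly.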
PR description:
supersedes #498
HCAL and ECAL are handled together here because of the move
CUDADataFormats/HcalCommon
-> CUDADataFormats/CaloCommon
which both HCAL and ECAL now depend on. This avoids duplication. The goal is to allow HCAL and ECAL to run on a node with multiple GPUs.
All the modules have been updated for that, and now basically no protection for the CUDA service is needed.
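Running on several GPUs means that any configuration or conditions data a kernel reads has to exist on whichever device the event is processed on. One common way to arrange this is a lazily filled, per-device copy. The sketch below illustrates that pattern under stated assumptions: PerDeviceSketch and dataForDevice are hypothetical names, the "transfer" is an ordinary callback rather than a CUDA copy, and nothing here is taken from the actual ESProducers in this PR.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <optional>
#include <vector>

// Hypothetical illustration: one lazily created copy of T per device.
template <typename T>
class PerDeviceSketch {
public:
  explicit PerDeviceSketch(int nDevices) : slots_(nDevices) {}

  // Return the copy for `device`, calling `transfer` only the first time
  // that device asks for it. `transfer` stands in for the host-to-device copy.
  const T& dataForDevice(int device, const std::function<T(int)>& transfer) {
    Slot& slot = slots_.at(device);  // throws std::out_of_range on a bad index
    std::lock_guard<std::mutex> lock(slot.mutex);
    if (!slot.data) {
      slot.data = transfer(device);
    }
    return *slot.data;
  }

private:
  struct Slot {
    std::mutex mutex;       // protects the lazy initialisation of this slot
    std::optional<T> data;  // empty until the first request from this device
  };
  std::vector<Slot> slots_;
};
```

With a structure like this, a module does not need to know in advance which GPU it will run on: the first event on each device triggers the copy, and later events reuse it.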
Note: the ECAL RecHit module was only updated, not validated (its fillDescriptions is not self-sufficient). @amassiro
The only outstanding item in this PR is that the newly added conditions' Records should eventually be moved to CondFormats/DataRecord.
PR validation:
using standalone execs