-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hcal Raw Decoding + Reco (method 0, mahi) on GPU #374
Hcal Raw Decoding + Reco (method 0, mahi) on GPU #374
Conversation
…ucts + cpu producer + validation
general note: MahiGPU.cu should contain only MAHI not the conditions and neither nnls() Can you put in conditions to validate the raw to the digi on CPU part and then only the MAHI on GPU ? |
so,
|
// move index to the right part of the vector | ||
w_max_idx += npassive; | ||
|
||
Eigen::numext::swap(pulseOffsets.coeffRef(npassive), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q: how you avoided to swap all the vectors (and do only the fitted amplitude) ?
https://github.com/cms-sw/cmssw/blob/master/RecoLocalCalo/HcalRecAlgos/src/MahiFit.cc#L476-L484
If I do in the CPU version all the quantities are wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check how pulseOffsets are used. you do not need to swap (on cpu it's prolly better, also you will not be able to use eigen in that case). but for gpu, it's better not to swap especially for small matrices. you just need to keep track of the active/passive set, which is what pulseOffsets does, and load things accordingly.
@vkhristenko could you remove the file |
done |
Hi @vkhristenko, If I checkout out only the minimal set of packages, with cmsrel CMSSW_10_6_3_Patatrack
cd CMSSW_10_6_3_Patatrack/src
cmsenv
git cms-init -x cms-patatrack
git cms-fetch-pr cms-patatrack:374
git merge pull/374
git cms-addpkg CUDADataFormats/HcalCommon CUDADataFormats/HcalDigi CUDADataFormats/HcalRecHitSoA EventFilter/HcalRawToDigi HeterogeneousCore/CUDACore RecoLocalCalo/HcalRecAlgos RecoLocalCalo/HcalRecProducers
# edit HeterogeneousCore/CUDACore/interface/CUDAESProduct.h and add
#include <cassert>
# edit RecoLocalCalo/HcalRecAlgos/src/HcalSiPMCharacteristicsGPU.cc and add
#include "FWCore/Utilities/interface/Exception.h"
scram b -j it works. However, if I add a couple more (unmodified) packages: git cms-addpkg RecoLocalCalo/EcalRecAlgos RecoLocalCalo/EcalRecProducers
scram b -j the build fails with a linker error:
Have you seen this error before ? |
first of, the errors are coming from ecal.... and not hcal sorry... would need to check out myself. i also never did minimalistic check outs, meaning i would always get head of the patatrack branch and do things on top... but here the problem is from what is already part of the release and what has been compiled before... may be try just to add
but next
this guy |
I think you can also reproduce it working directly in your branch. Yes, the errors come from the ECAL packages, but are caused by the changes in the HCAL packages, because of the dependency among them. I think we may have seen similar errors quite some time ago, and IIRC they were caused by multiple shared libraries (or plugins) trying to load the same CUDA kernels (for example, because they were linking a shared library that included those kernels). |
@vkhristenko did you have a look at the link problems ? |
@fwyzard not yet, give me time until monday. i need that branch working on top of 10 6 3 patatrack for opendata in any case before then... |
I"m not able to reproduce it. note that i've added the include of
compiles w/o any linking errors. |
Hi @vkhristenko , # create a working area for CMSSW_10_6_3_Patatrack
scram list
cmsrel CMSSW_10_6_3_Patatrack
cd CMSSW_10_6_3_Patatrack/src
cmsenv
git cms-init -x cms-patatrack
git branch CMSSW_10_6_X_Patatrack --track cms-patatrack/CMSSW_10_6_X_Patatrack
git checkout CMSSW_10_6_X_Patatrack
# merge vkhristenko:hcal_mahi_patatrack and checkout the modified packages
git checkout -b test_merging
git cms-merge-topic vkhristenko:hcal_mahi_patatrack
git diff --name-only $CMSSW_VERSION | cut -d/ -f-2 | uniq | xargs git cms-addpkg
# fix CUDAESProduct.h to add #include <cassert>
git cms-addpkg HeterogeneousCore/CUDACore
vim HeterogeneousCore/CUDACore/interface/CUDAESProduct.h
# checkout the ECAL packages, and all dependencies
git cms-addpkg RecoLocalCalo/EcalRecAlgos RecoLocalCalo/EcalRecProducers
git cms-checkdeps -a
# build
scram b -v -j results in
|
just added dependencies as well... after checking ecal stuff... still no errors |
I am not sure how else to allow you to reproduce it... |
scram arch: slc7_amd64_gcc700 (note you have 820 gcc) |
I've made a copy of your area, and I'll try to rebuild that... |
...I did just scram b clean
scram b -j and I got the same problem
Could you try just that ? |
yep, i did
|
OK, I suspect you didn't get it before because of the order of the compilation and link commands :-( Can you look into what is causing it ? |
yep yep |
Indeed this is a dependency in hcal code on ecal... so, below is the command for final stage device side linking for RecoLocalCaloHcalRecAlgos_cudadlink.o this command includes linking device side code from RecoLocalCalo/EcalRecAlgos by removing this linkage (of @fwyzard , as far as i remember, you mentioned at some point although cms-sw/cmssw-config#65 got merged, still things did not work. I think what i see here is the result of this guy not working 100%... Moreover, in here there is no device side dependency, only host side, but the build rules are still generated... this is actually something we should not add!
|
PR description:
This is the replacemnt of #367 made on top of the #373
PR validation:
an exe is provided to validate against cpu version.
note, that validation branch should be used to match exactly.
method 0 changes:
mahi changes: