-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
R2D: use GPU::SimpleVector for the error unpacking #14
R2D: use GPU::SimpleVector for the error unpacking #14
Conversation
1440 1856
this is the problem: the calibration are the worng one.... (fro phase0)
please add
# load HLT payload
process.GlobalTag = GlobalTag(process.GlobalTag, '100X_dataRun2_asv2plusPixelGainfromHLT_v1', '')
to your config...
v.
|
temp_err.word = ww; | ||
temp_err.fedId = fedId; | ||
temp_err.rawId = rawId; | ||
err->push_back(temp_err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you use emplace
instead of creating a temporary object ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually I have already tried this:
err->emplace_back(rawId, ww, error, fedId);
and it does not compile:
Compiling /afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/HeterogeneousCore/CUDAUtilities/interface/GPUSimpleVector.h(68): error: this pack expansion produced more than one expression, and a single expression is needed here
detected during instantiation of "int GPU::SimpleVector::emplace_back(Ts &&...) [with T=error_obj, Ts=<uint32_t &, uint32_t &, unsigned char, unsigned char>]"
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu(491): here
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/HeterogeneousCore/CUDAUtilities/interface/GPUSimpleVector.h(68): error: no suitable constructor exists to convert from "uint32_t" to "error"
detected during instantiation of "int GPU::SimpleVector::emplace_back(Ts &&...) [with T=error_obj, Ts=<uint32_t &, uint32_t &, unsigned char, unsigned char>]"
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu(491): here
2 errors detected in the compilation of "/tmp/calabria/tmpxft_00009396_00000000-8_RawToDigiGPU.compute_61.cpp1.ii".
config/SCRAM/GMake/Makefile.rules:2079: recipe for target 'tmp/slc7_amd64_gcc630/src/EventFilter/SiPixelRawToDigi/plugins/EventFilterSiPixelRawToDigiGPUPlugins/RawToDigiGPU.o' failed
gmake: *** [tmp/slc7_amd64_gcc630/src/EventFilter/SiPixelRawToDigi/plugins/EventFilterSiPixelRawToDigiGPUPlugins/RawToDigiGPU.o] Error 1
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
emplace_back()
requires a constructor (I have stumbled on this as well).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I googled this, and there is a defect report to the C++ standard since 2011... with discussions ongoing more or less continuously ever since: a proposal to fix it since 2015, and further comments just few weeks ago...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...bottom line: add a constructor and use emplace_back
temp_err.word = ww; | ||
temp_err.fedId = fedId; | ||
temp_err.rawId = rawId; | ||
err->push_back(temp_err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... emplace
...
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << (bool) badRocs[i] << std::setw(20) << (bool) modToUnp[i] << std::endl; | ||
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << fedMap[i] << std::setw(20) << linkMap[i] << std::setw(20) << rocMap[i] << std::endl; | ||
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << RawId[i] << std::setw(20) << rocInDet[i] << std::setw(20) << moduleId[i] << std::endl; | ||
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << (int)badRocs[i] << std::setw(20) << (int)modToUnp[i] << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would leave them as bool
, since that' what they are used as
cudaMallocHost(&errRawID_h, sizeof(uint32_t)*WSIZE); | ||
cudaMallocHost(&errWord_h, sizeof(uint32_t)*WSIZE); | ||
cudaMallocHost(&errFedID_h, sizeof(uint32_t)*WSIZE); | ||
uint32_t VSIZE = sizeof(GPU::SimpleVector<error_obj>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use uppercase only for #define
cudaMallocHost(&errFedID_h, sizeof(uint32_t)*WSIZE); | ||
uint32_t VSIZE = sizeof(GPU::SimpleVector<error_obj>); | ||
uint32_t ESIZE = sizeof(error_obj); | ||
bool success = cudaMallocHost(&error_h, VSIZE) == cudaSuccess && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add calls to cudaCheck()
around all cudaMallocHost()
, and remove the assert
?
@@ -83,6 +83,11 @@ template <class T> struct SimpleVector { | |||
__inline__ __host__ __device__ int capacity() const { return m_capacity; } | |||
|
|||
__inline__ __host__ __device__ T *data() const { return m_data; } | |||
|
|||
__inline__ __host__ __device__ void set_size(int size) { m_size = size; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would call this resize()
, unless people find it confusing ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on resize()
, the effect would anyway be similar to std::vector
. (or remove as it is not currently used).
|
||
__inline__ __host__ __device__ void set_size(int size) { m_size = size; } | ||
|
||
__inline__ __host__ __device__ void set_data(T * data) { m_data = data; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I like or not a set_data()
method... other comments ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly like. If kept it should be accompanied with capacity
parameter.
In principle I'd expect the same be achieved with move/copy assignment. On the other hand, the pointed memory areas are copied CPU<->GPU outside of the SimpleVector
so there could be cases where it is handier (with the current design of SimpleVector
to use set_data()
.
(I must say it took me a while to figure out what exactly happens in the code that uses the 'set_data(). I'm afraid that having the memory owned outside of the
SimpleVectorand transferring the
SimpleVector` members and the data memory separately between CPU and GPU may lead to difficult-to-understand code. I'm fine with that now, but eventually we should aim to simplify.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree it should set also capacity
(and reduce size
if it exceeds the new capacity
).
And I agree long term we should come up with a better approach (or get unified memory to work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further thought, if one resets the data
pointer, the object doesn't anymore know what would be the correct size
, so it would have to be given from outside as well (e.g. if the data
has valid elements already). To me this sounds like the job of move assignment (I came to the conclusion that copying should be forbidden as two copies would just have the same data
pointers). But I can imagine that with the outsourced memory management also the results move operations may become difficult to follow.
@makortel @felicepantaleo can you comment on the |
uint32_t word; | ||
unsigned char errorType; | ||
unsigned char fedId; | ||
} error_obj; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason to not be fully C++ here (i.e. drop typedef and the trailing name)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
cudaCheck(cudaMemcpy(error_h, c.error_d, VSIZE, cudaMemcpyDeviceToHost)); | ||
error_h->set_data(data_h); | ||
int size = error_h->size(); | ||
cudaCheck(cudaMemcpy(data_h, c.data_d, size*ESIZE, cudaMemcpyDeviceToHost)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change to synchronous copy?
(or does cudaGetLastError()
create a synchronization point?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, it doesn't - though it may fail to report errors from asynchronous calls
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, can you please summarize the discussion? What should I do with this? Thanks.
cudaMemcpy should stay cudaMemcpyAsync
.A
|
cudaCheck(cudaMemcpyAsync(error_h, c.error_d, vsize, cudaMemcpyDeviceToHost, c.stream)); | ||
error_h->set_data(data_h); | ||
int size = error_h->size(); | ||
cudaCheck(cudaMemcpyAsync(data_h, c.data_d, size*esize, cudaMemcpyDeviceToHost, c.stream)); | ||
} | ||
cudaStreamSynchronize(c.stream); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this synchronization should go right after the cudaCheck(cudaMemcpyAsync(error_h, c.error_d, vsize, cudaMemcpyDeviceToHost, c.stream));
Otherwise you have a race condition as the size could be transferred from the GPU after you've used it in
int size = error_h->size();
if (errType_h[i] != 0) { | ||
SiPixelRawDataError error(errWord_h[i], errType_h[i], errFedID_h[i]+1200); | ||
errors[errRawID_h[i]].push_back(error); | ||
uint32_t size = error_h->size(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you replace uint32_t
with auto
, for readability, same in the loop...
A new Pull Request was created by @calabria (Cesare Calabria) for CMSSW_10_1_X_Patatrack. It involves the following packages: EventFilter/SiPixelRawToDigi The following packages do not have a category, yet: HeterogeneousCore/CUDAUtilities @cmsbot, @fwyzard can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
…ms (L1Trigger/TrackFindingTMTT) (cms-sw#29381) * create separate PRs for the two L1TK packages * Improved KF efficiency at high eta * Moved MC data files to cms-data * Removed old file * Removed KF HLS to put instead in external library * Ran scram b code-format * Delete KF4ParamsComb.h.bak * Delete KF4ParamsCombIV.bak * Delete KF4ParamsCombV2.bak * Delete KF5ParamsComb.h.bak * Delete KF4ParamsComb.cc.bak * Delete KF4ParamsCombIV.bak * Delete KF4ParamsCombV2.bak * Delete KF5ParamsComb.cc.bak * L1 tk integration tmtt pre5 (#7) * Added CMS code style fixes * Removed old file * Reapplied stub b code-format * All code review changes (#13) * Fix clang errors (#14) * fixed clang error * directory for MC txt files * Fixed clang warnings + minor simplifications (#15) * tweak * tweak * Fixed clang warnings and small simplifications * Fixed clang warnings and small simplifications * All remaining review comments addressed (#16) * Replaced vector size with empty function * Simplified DegradeBend and StubWindowSuggest * Fixed more review comments * More review comments * code reformat * Ran exhaustive clang tidy * Added library to BuildFile.xml (#17) * Deleted TrackFindingTMT/data/README (#18) * Added library to BuildFile.xml (This was already done yesterday. Not sure why it appears again) * README file in data directory deleted * Fix review comments (#20) Co-authored-by: Louise Skinnari <louise.skinnari@cern.ch>
Hgcal eol pulse update 112 x bis
@fwyzard @felicepantaleo @VinInn
My changes seem to work fine (always same number and type of errors as in the serial code), but there is a crash due to the calibration part. I don't know if this was expected because still under development.
-bash-4.2$ cmsRun tkreco.py
%MSG-i ThreadStreamSetup: (NoModuleName) 16-Feb-2018 15:42:01 CET pre-events
setting # threads 8
setting # streams 8
%MSG
16-Feb-2018 15:42:12 CET Initiating request to open file file:/data/patatrack/innocent/run2017/JetHT_raw304797HL.root
16-Feb-2018 15:42:13 CET Successfully opened file file:/data/patatrack/innocent/run2017/JetHT_raw304797HL.root
Begin processing the 1st record. Run 304797, Event 105123496, LumiSection 70 on stream 2 at 16-Feb-2018 15:42:32.267 CET
%MSG-w HcalSeverityLevelComputer: HBHEPhase1Reconstructor:hbheprereco 16-Feb-2018 15:42:34 CET Run: 304797 Event: 105123496
HcalSeverityLevelComputer: Error: RecHitFlag >>HFDigiTime<< unknown. Ignoring.
%MSG
caching calibs for 1856 pixel detectors of size 1647360
sizes 1 1 2
precisions g 1.23308 0.0761627
1440 1856
cmsRun: /afs/cern.ch/work/c/calabria/private/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/SiPixelFedCablingMapGPU.cc:164: void processGainCalibration(const SiPixelGainCalibrationForHLT&, const TrackerGeometry&, SiPixelGainForHLTonGPU*&, SiPixelGainForHLTonGPU::DecodingStructure*&): Assertion `p!=ind.end() && p->detid==dus[i]->geographicalId()' failed.
A fatal system signal has occurred: abort signal
The following is the call stack containing the origin of the signal.