Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

R2D: use GPU::SimpleVector for the error unpacking #14

Merged

Conversation

calabria
Copy link

@fwyzard @felicepantaleo @VinInn

My changes seem to work fine (always same number and type of errors as in the serial code), but there is a crash due to the calibration part. I don't know if this was expected because still under development.

-bash-4.2$ cmsRun tkreco.py
%MSG-i ThreadStreamSetup: (NoModuleName) 16-Feb-2018 15:42:01 CET pre-events
setting # threads 8
setting # streams 8
%MSG
16-Feb-2018 15:42:12 CET Initiating request to open file file:/data/patatrack/innocent/run2017/JetHT_raw304797HL.root
16-Feb-2018 15:42:13 CET Successfully opened file file:/data/patatrack/innocent/run2017/JetHT_raw304797HL.root
Begin processing the 1st record. Run 304797, Event 105123496, LumiSection 70 on stream 2 at 16-Feb-2018 15:42:32.267 CET
%MSG-w HcalSeverityLevelComputer: HBHEPhase1Reconstructor:hbheprereco 16-Feb-2018 15:42:34 CET Run: 304797 Event: 105123496
HcalSeverityLevelComputer: Error: RecHitFlag >>HFDigiTime<< unknown. Ignoring.
%MSG
caching calibs for 1856 pixel detectors of size 1647360
sizes 1 1 2
precisions g 1.23308 0.0761627
1440 1856
cmsRun: /afs/cern.ch/work/c/calabria/private/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/SiPixelFedCablingMapGPU.cc:164: void processGainCalibration(const SiPixelGainCalibrationForHLT&, const TrackerGeometry&, SiPixelGainForHLTonGPU*&, SiPixelGainForHLTonGPU::DecodingStructure*&): Assertion `p!=ind.end() && p->detid==dus[i]->geographicalId()' failed.

A fatal system signal has occurred: abort signal
The following is the call stack containing the origin of the signal.

@VinInn
Copy link

VinInn commented Feb 16, 2018 via email

temp_err.word = ww;
temp_err.fedId = fedId;
temp_err.rawId = rawId;
err->push_back(temp_err);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use emplace instead of creating a temporary object ?

Copy link
Author

@calabria calabria Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I have already tried this:
err->emplace_back(rawId, ww, error, fedId);
and it does not compile:

Compiling /afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/HeterogeneousCore/CUDAUtilities/interface/GPUSimpleVector.h(68): error: this pack expansion produced more than one expression, and a single expression is needed here
detected during instantiation of "int GPU::SimpleVector::emplace_back(Ts &&...) [with T=error_obj, Ts=<uint32_t &, uint32_t &, unsigned char, unsigned char>]"
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu(491): here

/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/HeterogeneousCore/CUDAUtilities/interface/GPUSimpleVector.h(68): error: no suitable constructor exists to convert from "uint32_t" to "error"
detected during instantiation of "int GPU::SimpleVector::emplace_back(Ts &&...) [with T=error_obj, Ts=<uint32_t &, uint32_t &, unsigned char, unsigned char>]"
/afs/cern.ch/work/c/calabria/private/DUMMY_GPU/forse/CMSSW_10_1_0_pre1/src/EventFilter/SiPixelRawToDigi/plugins/RawToDigiGPU.cu(491): here

2 errors detected in the compilation of "/tmp/calabria/tmpxft_00009396_00000000-8_RawToDigiGPU.compute_61.cpp1.ii".
config/SCRAM/GMake/Makefile.rules:2079: recipe for target 'tmp/slc7_amd64_gcc630/src/EventFilter/SiPixelRawToDigi/plugins/EventFilterSiPixelRawToDigiGPUPlugins/RawToDigiGPU.o' failed
gmake: *** [tmp/slc7_amd64_gcc630/src/EventFilter/SiPixelRawToDigi/plugins/EventFilterSiPixelRawToDigiGPUPlugins/RawToDigiGPU.o] Error 1
gmake: *** [There are compilation/build errors. Please see the detail log above.] Error 2

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emplace_back() requires a constructor (I have stumbled on this as well).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I googled this, and there is a defect report to the C++ standard since 2011... with discussions ongoing more or less continuously ever since: a proposal to fix it since 2015, and further comments just few weeks ago...

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...bottom line: add a constructor and use emplace_back

temp_err.word = ww;
temp_err.fedId = fedId;
temp_err.rawId = rawId;
err->push_back(temp_err);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... emplace ...

LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << (bool) badRocs[i] << std::setw(20) << (bool) modToUnp[i] << std::endl;
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << fedMap[i] << std::setw(20) << linkMap[i] << std::setw(20) << rocMap[i] << std::endl;
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << RawId[i] << std::setw(20) << rocInDet[i] << std::setw(20) << moduleId[i] << std::endl;
LogDebug("SiPixelFedCablingMapGPU") << i << std::setw(20) << (int)badRocs[i] << std::setw(20) << (int)modToUnp[i] << std::endl;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would leave them as bool, since that' what they are used as

cudaMallocHost(&errRawID_h, sizeof(uint32_t)*WSIZE);
cudaMallocHost(&errWord_h, sizeof(uint32_t)*WSIZE);
cudaMallocHost(&errFedID_h, sizeof(uint32_t)*WSIZE);
uint32_t VSIZE = sizeof(GPU::SimpleVector<error_obj>);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use uppercase only for #define

cudaMallocHost(&errFedID_h, sizeof(uint32_t)*WSIZE);
uint32_t VSIZE = sizeof(GPU::SimpleVector<error_obj>);
uint32_t ESIZE = sizeof(error_obj);
bool success = cudaMallocHost(&error_h, VSIZE) == cudaSuccess &&
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add calls to cudaCheck() around all cudaMallocHost(), and remove the assert ?

@@ -83,6 +83,11 @@ template <class T> struct SimpleVector {
__inline__ __host__ __device__ int capacity() const { return m_capacity; }

__inline__ __host__ __device__ T *data() const { return m_data; }

__inline__ __host__ __device__ void set_size(int size) { m_size = size; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would call this resize(), unless people find it confusing ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree on resize(), the effect would anyway be similar to std::vector. (or remove as it is not currently used).


__inline__ __host__ __device__ void set_size(int size) { m_size = size; }

__inline__ __host__ __device__ void set_data(T * data) { m_data = data; }
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I like or not a set_data() method... other comments ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't particularly like. If kept it should be accompanied with capacity parameter.

In principle I'd expect the same be achieved with move/copy assignment. On the other hand, the pointed memory areas are copied CPU<->GPU outside of the SimpleVector so there could be cases where it is handier (with the current design of SimpleVector to use set_data().

(I must say it took me a while to figure out what exactly happens in the code that uses the 'set_data(). I'm afraid that having the memory owned outside of the SimpleVectorand transferring theSimpleVector` members and the data memory separately between CPU and GPU may lead to difficult-to-understand code. I'm fine with that now, but eventually we should aim to simplify.)

Copy link

@fwyzard fwyzard Feb 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree it should set also capacity (and reduce size if it exceeds the new capacity).

And I agree long term we should come up with a better approach (or get unified memory to work).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further thought, if one resets the data pointer, the object doesn't anymore know what would be the correct size, so it would have to be given from outside as well (e.g. if the data has valid elements already). To me this sounds like the job of move assignment (I came to the conclusion that copying should be forbidden as two copies would just have the same data pointers). But I can imagine that with the outsourced memory management also the results move operations may become difficult to follow.

@fwyzard
Copy link

fwyzard commented Feb 16, 2018

@makortel @felicepantaleo can you comment on the SimpleVector interface changes ?

uint32_t word;
unsigned char errorType;
unsigned char fedId;
} error_obj;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason to not be fully C++ here (i.e. drop typedef and the trailing name)?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

cudaCheck(cudaMemcpy(error_h, c.error_d, VSIZE, cudaMemcpyDeviceToHost));
error_h->set_data(data_h);
int size = error_h->size();
cudaCheck(cudaMemcpy(data_h, c.data_d, size*ESIZE, cudaMemcpyDeviceToHost));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change to synchronous copy?

(or does cudaGetLastError() create a synchronization point?)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it doesn't - though it may fail to report errors from asynchronous calls

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, can you please summarize the discussion? What should I do with this? Thanks.

@fwyzard
Copy link

fwyzard commented Feb 17, 2018 via email

cudaCheck(cudaMemcpyAsync(error_h, c.error_d, vsize, cudaMemcpyDeviceToHost, c.stream));
error_h->set_data(data_h);
int size = error_h->size();
cudaCheck(cudaMemcpyAsync(data_h, c.data_d, size*esize, cudaMemcpyDeviceToHost, c.stream));
}
cudaStreamSynchronize(c.stream);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this synchronization should go right after the cudaCheck(cudaMemcpyAsync(error_h, c.error_d, vsize, cudaMemcpyDeviceToHost, c.stream));
Otherwise you have a race condition as the size could be transferred from the GPU after you've used it in
int size = error_h->size();

if (errType_h[i] != 0) {
SiPixelRawDataError error(errWord_h[i], errType_h[i], errFedID_h[i]+1200);
errors[errRawID_h[i]].push_back(error);
uint32_t size = error_h->size();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you replace uint32_t with auto, for readability, same in the loop...

@cmsbot
Copy link

cmsbot commented Feb 19, 2018

A new Pull Request was created by @calabria (Cesare Calabria) for CMSSW_10_1_X_Patatrack.

It involves the following packages:

EventFilter/SiPixelRawToDigi
HeterogeneousCore/CUDAUtilities

The following packages do not have a category, yet:

HeterogeneousCore/CUDAUtilities
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@cmsbot, @fwyzard can you please review it and eventually sign? Thanks.

cms-bot commands are listed here

@fwyzard fwyzard merged commit 2f625fc into cms-patatrack:CMSSW_10_1_X_Patatrack Feb 20, 2018
fwyzard pushed a commit that referenced this pull request Dec 7, 2018
fwyzard pushed a commit that referenced this pull request May 23, 2020
…ms (L1Trigger/TrackFindingTMTT) (cms-sw#29381)

* create separate PRs for the two L1TK packages

* Improved KF efficiency at high eta

* Moved MC data files to cms-data

* Removed old file

* Removed KF HLS to put instead in external library

* Ran scram b code-format

* Delete KF4ParamsComb.h.bak

* Delete KF4ParamsCombIV.bak

* Delete KF4ParamsCombV2.bak

* Delete KF5ParamsComb.h.bak

* Delete KF4ParamsComb.cc.bak

* Delete KF4ParamsCombIV.bak

* Delete KF4ParamsCombV2.bak

* Delete KF5ParamsComb.cc.bak

* L1 tk integration tmtt pre5 (#7)

* Added CMS code style fixes

* Removed old file

* Reapplied stub b code-format

* All code review changes (#13)

* Fix clang errors (#14)

* fixed clang error

* directory for MC txt files

* Fixed clang warnings + minor simplifications (#15)

* tweak

* tweak

* Fixed clang warnings and small simplifications

* Fixed clang warnings and small simplifications

* All remaining review comments addressed (#16)

* Replaced vector size with empty function

* Simplified DegradeBend and StubWindowSuggest

* Fixed more review comments

* More review comments

* code reformat

* Ran exhaustive clang tidy

* Added library to BuildFile.xml (#17)

* Deleted TrackFindingTMT/data/README (#18)

* Added library to BuildFile.xml (This was already done yesterday. Not sure why it appears again)

* README file in data directory deleted

* Fix review comments (#20)

Co-authored-by: Louise Skinnari <louise.skinnari@cern.ch>
fwyzard pushed a commit that referenced this pull request Jul 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants