Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve GPU treatment of conditions #65

Closed
makortel opened this issue May 30, 2018 · 6 comments
Closed

Improve GPU treatment of conditions #65

makortel opened this issue May 30, 2018 · 6 comments
Assignees

Comments

@makortel
Copy link

makortel commented May 30, 2018

Currently the conditions data for GPUs is handled in the following ways

  1. Raw2Cluster EDProducer owns the GPU memory and transfers the data to GPU
    • GPU memory need gets multiplied by the number of edm::Streams
  2. PixelCPEFast ESProduct owns the GPU memory and transfers the data to GPU by itself
    • CPU->GPU transfer is synchronous
    • Does not support multiple GPU devices

To unify the approach and to address the shortcomings above I'm thinking the following

  • It should be the ESProduct's responsibility to own the GPU memory and transfer the data to GPU
  • The ESProduct would have a function like getGPUProductAsync(cuda::stream_t<>&) which would do the following
    • if the data is not yet on current GPU device
      • allocate the GPU memory on current device (should happen only rarely so the cost of cudaMalloc() could be acceptable?)
      • transfer the data to the GPU asynchronously using the CUDA stream argument
    • return the (struct of) pointer(s) to the GPU data

It would follow that

  • Data would be transferred to a GPU device only when necessary
  • There would be only one copy of data per GPU device
  • ESProduct would need a vector/array of the length of the number of edm::Streams of
    • a mutex (or some other synchronization primitive to prevent running multiple CPU->GPU transfers on the same device)
      • the ESProducer::getGPUProductAsync() should enqueue a callback function to the CUDA stream releasing the mutex after the asynchronous memory transfers are finished
    • pointers to GPU memory
  • Everything would get automatically updated for IOV changes
  • Multiple lumis in flight should work

@fwyzard @felicepantaleo @VinInn @rovere @vkhristenko

@cmsbot
Copy link

cmsbot commented May 30, 2018

A new Issue was created by @makortel Matti Kortelainen.

can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@makortel
Copy link
Author

I'll start to look into this. If anyone sees problem(s) in the plan, please scream.

@makortel makortel self-assigned this May 30, 2018
@makortel
Copy link
Author

I think I have a decent solution for PixelCPEFast(I can make a PR after #62 is merged). I proceeded then to the cabling map in Raw2Cluster which ended up being not so simple. The reasons are

  • SiPixelFedCablingMap comes from the DB (so I don't want to touch it), while PixelCPEFast has an ESProducer
  • Our current SiPixelFedCablingMapGPU seems to squash 4 pieces of information together (for convenience, I presume)
    1. ROC index -> fed, link, ROC, module DetId, ROC-in-module index (from SiPixelFedCablingMap)
    2. ROC index -> module index (from TrackerGeometry)
    3. ROC index -> good/bad (SiPixelQuality)
    4. ROC index -> whether to unpack (from PixelDataFormatter, in principle event-level information?)

Then thinking out loud what to do. I believe point iv should in any case be separated as it is event-level quantity (even if in practice in absence of pixel unpacking regions the list of modules is presumably the same for each event). For i-iii I think there are two options

  1. Continue with the "monolithic" approach, i.e. combined i-iii
    • Need to create an ESProduct wrapping them (ok), an ESProducer for that (ok), and a Record where the ESProducer produces the product
      • The Record should depend on SiPixelFedCablingMapRcd, SiPixelQualityRcd, and TrackerDigiGeometryRecord
      • Is there an existing Record to use? If not, we could add a new one, but last time I tried that I didn't succeed. Also where the new Record file should be placed?
  2. Split to three GPU ESProducts
    • Maybe somewhat more work, but more straightforward to do (to me at least, can use existing Records)
    • The loop over fed+link+roc gets repeated

@VinInn
Copy link

VinInn commented May 31, 2018

I favor 1:
this is an approach used in calorimetry that should be more widespread (the problem is that is heavish on the coding side)
for instance pixel (and cabling in particular) would benefit of an intermediate producer to account for transformation between DB and memory layout (and content!)
there are some interesting discussions in some PRs on the subject
Anyhow this is NOT something we should solve here (even if it makes our lives more miserable)

btw we need to address gpuCalib as well (that may be combined in 1)

for Records look under
https://cmssdt.cern.ch/dxr/CMSSW/source/RecoTracker/Record
(sic)

there is the catch all record
https://cmssdt.cern.ch/dxr/CMSSW/source/RecoTracker/Record/interface/CkfComponentsRecord.h

notice the comment
#include "CondFormats/DataRecord/interface/SiPixelQualityRcd.h"
#include "CondFormats/DataRecord/interface/SiPixelFedCablingMapRcd.h" // FIXME should be in the dependencies of the SiPixelQualityRcd

and of course
https://cmssdt.cern.ch/dxr/CMSSW/source/RecoLocalTracker/Records/interface
that contains only CPEs though

here there is a nice collection of records for the strips
https://cmssdt.cern.ch/dxr/CMSSW/source/CondFormats/DataRecord/interface/SiStripCondDataRecords.h

I personally would move raw2cluster in LocalRecoTracker (where it belongs) as I did for the strip
and then use LocalRecoTracker/Record to store new records (and LocalRecoTracker/xyz to locate ESProducers

@makortel
Copy link
Author

@VinInn Thanks, I'll start to move to that direction then.

@makortel
Copy link
Author

makortel commented Jun 5, 2018

Addressed in #77.

fwyzard pushed a commit that referenced this issue Feb 12, 2021
* TrackQuality Code Review

* Fix to FileInPath error

* Revert FileInPath, set quality default to true
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants