Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPU: better hits #81

Merged
merged 13 commits into from
Jun 29, 2018
Merged

Conversation

VinInn
Copy link

@VinInn VinInn commented Jun 12, 2018

Migrated PixelRecHit producer to Heterogeneous (including a cpu product)
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
took the opportunity for some cleanup and bug fixes.

@cmsbot
Copy link

cmsbot commented Jun 12, 2018

A new Pull Request was created by @VinInn (Vincenzo Innocente) for CMSSW_10_2_X_Patatrack.

It involves the following packages:

Geometry/TrackerGeometryBuilder
RecoLocalTracker/Configuration
RecoLocalTracker/SiPixelRecHits

@cmsbot, @fwyzard can you please review it and eventually sign? Thanks.

cms-bot commands are listed here


cudaCheck(cudaMalloc((void**) & gpu_d, sizeof(HitsOnGPU)));
cudaCheck(cudaMemcpy(gpu_d, &gpu_, sizeof(HitsOnGPU), cudaMemcpyDefault));
cudaCheck(cudaDeviceSynchronize());

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor detail, but the cudaMemcpy could be made asynchronous by passing the CUDA stream from here

void SiPixelRecHitHeterogeneous::beginStreamGPUCuda(edm::StreamID streamId, cuda::stream_t<>& cudaStream) {
gpuAlgo_ = std::make_unique<pixelgpudetails::PixelRecHitGPUKernel>();
}

for (int i=0;i<10;++i) std::cout << phase1PixelTopology::layerName[i] << ':' << hitsLayerStart[i] << ' ';
std::cout << "end:" << hitsLayerStart[10] << std::endl;

cudaCheck(cudaMemcpyAsync(gpu_.hitsLayerStart_d, hitsLayerStart, (11) * sizeof(uint32_t), cudaMemcpyDefault, stream.id()));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this is not safe because hitsLayerStart is a local array. It should have a life time longer than the async copy takes (effectively to be a member of PixelRecHitGPUKernel).


output->shrink_to_fit();
output->collection.shrink_to_fit();
iEvent.put(std::move(output));

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be changed to iEvent.put<Output>(std::move(output)) to produce the HeterogeneousProduct for CPU.

siPixelRecHitsPreSplitting = siPixelRecHits.clone(
src = 'siPixelClustersPreSplitting'
)


from RecoLocalTracker.SiPixelRecHits.siPixelRecHitHeterogeneous_cfi import siPixelRecHitHeterogeneous

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #77 when doing the same for raw2cluster I put this import to RecoLocalTracker_cff.py.


iEvent.put(std::move(output));
iEvent.put<Output>(std::move(output), [this, hits, hclusters](const GPUProduct&, CPUProduct& cpu) {
this->convertGPUtoCPU(hclusters, hits, cpu);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could avoid capturing hits by using the HitsOnCPU as the GPUProduct.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the lifetime of the lambda? in principle in this way once the lambda is gone hits are gone as well
while as part of GPUProduct will stay in the event

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda is store as part of the HeterogeneousProduct' (as is the GPUProduct`), so it will stay to the end of the event anyway.

from RecoLocalTracker.SiPixelRecHits.siPixelRecHitHeterogeneous_cfi import siPixelRecHitHeterogeneous

from RecoLocalTracker.SiPixelRecHits.siPixelRecHitHeterogeneousConverter_cfi import siPixelRecHitHeterogeneousConverter as _siPixelRecHitHeterogeneousConverter
gpu.toReplaceWith(siPixelRecHitsPreSplitting, _siPixelRecHitHeterogeneousConverter.clone())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing siPixelRecHits before the clone() to ...PreSplitting would work as well (then in principle the default src should be kept as siPixelClusters).

Combining with my comment above, this file could be left untouched.

constexpr char const * layerName[10] = {"BL1","BL2","BL3","BL4",
"E+1", "E+2", "E+3",
"E-1", "E-2", "E-3"
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note that this introduces yet another convention for naming pixel layers. It seems to be only for debug prints, so I don't object.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@makortel what are the "usual" pixel layer names, and where are they defined ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We (=CMS) have many conventions in different places

I don't think none of these is authoritative enough to suggest a change in here (since they're for printouts only, for configuration input's I'd probably suggest the seeding layers' convention). Anyway these have the nice property (on purpose?) that the length of BPix and FPix strings are the same (that none of the other conventions have).

@fwyzard

This comment has been minimized.

@fwyzard
Copy link

fwyzard commented Jun 14, 2018

Apart from the requested changes and merge conflicts, I have the impression that the GPU workflows (10824.8) do not produce any tracks any more ?

@makortel
Copy link

Apart from the requested changes and merge conflicts, I have the impression that the GPU workflows (10824.8) do not produce any tracks any more ?

From https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/42f6f40e4973cddd77d4138005948039b75ff1ca/RelValTTbar_13-CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/10824.8/plots_summary.html
there are 303 tracks (all fake) with this PR while the "development" gives 42766.

@VinInn
Copy link
Author

VinInn commented Jun 15, 2018

impressive!
I suspect I know why... (local vs global coords)

@VinInn
Copy link
Author

VinInn commented Jun 15, 2018

btw what is the new baseline w/r/t I have to sync?

@fwyzard
Copy link

fwyzard commented Jun 15, 2018

CMSSW_10_2_0_pre5_Patatrack - I think the instructions [here](at https://patatrack.web.cern.ch/patatrack/wiki/patatrackdevelopment.html) are up to date.

@VinInn
Copy link
Author

VinInn commented Jun 15, 2018

I started from CMSSW_10_2_0_pre5_Patatrack ...
will have a look to the conflicts...

@cmsbot
Copy link

cmsbot commented Jun 16, 2018

Pull request #81 was updated. @cmsbot, @fwyzard can you please check and sign again.

@cmsbot
Copy link

cmsbot commented Jun 16, 2018

Pull request #81 was updated. @cmsbot, @fwyzard can you please check and sign again.

@fwyzard

This comment has been minimized.

@fwyzard
Copy link

fwyzard commented Jun 16, 2018

Indeed, now this PR reproduces the results of the development branch.

@fwyzard
Copy link

fwyzard commented Jun 29, 2018

Validation summary

Reference release CMSSW_10_2_0_pre5 at 30c7b03
Development branch CMSSW_10_2_X_Patatrack at 10d59f2
Testing PRs:

makeTrackValidationPlots.py plots

/RelValTTbar_13/CMSSW_10_2_0_pre5-PU25ns_102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre5-102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre5-PU25ns_102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre5-102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

  • reference DQM plots for reference release, workflow 10824.5
  • DQM plots for development release, workflow 10824.5
  • DQM plots for development release, workflow 10824.8 are missing
  • DQM plots for development release, workflow 10824.7
  • DQM plots for development release, workflow 10824.9
  • DQM plots for testing release, workflow 10824.5
  • DQM plots for testing release, workflow 10824.8 are missing
  • DQM plots for testing release, workflow 10824.7
  • DQM plots for testing release, workflow 10824.9
  • DQM comparison for reference workflow 10824.5
  • DQM comparison for workflow 10824.8
  • DQM comparison for workflow 10824.7
  • DQM comparison for workflow 10824.9

logs and nvprof/nvvp profiles

/RelValTTbar_13/CMSSW_10_2_0_pre5-PU25ns_102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_2_0_pre5-102X_upgrade2018_realistic_v1-v1/GEN-SIM-DIGI-RAW

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/d9fd2c79bc3226ea452e21c0d0431bed4b281e91/log .

@fwyzard fwyzard merged commit a9f0289 into cms-patatrack:CMSSW_10_2_X_Patatrack Jun 29, 2018
fwyzard pushed a commit that referenced this pull request Jun 29, 2018
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.

std::cout << "hit layerStart ";
for (int i=0;i<10;++i) std::cout << phase1PixelTopology::layerName[i] << ':' << hitsLayerStart_[i] << ' ';
std::cout << "end:" << hitsLayerStart_[10] << std::endl;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the way, now we have a per-event (non-thread safe) printout from here.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be removed at next iteration when the sorting will be implemented as well

@fwyzard fwyzard added this to the CMSSW_10_2_0_pre6_Patatrack milestone Jul 5, 2018
fwyzard pushed a commit that referenced this pull request Nov 1, 2018
fwyzard pushed a commit that referenced this pull request Oct 8, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Oct 19, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Oct 20, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Oct 23, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Nov 6, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Nov 16, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Dec 25, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Migrate PixelRecHit EDProducer to HeterogeneousEDProducer, including the cpu product.
Data structures on gpu now include everything needed for Doublets, CA and fit.
Layer splitting done: phi sorting (or partial sorting) requires #69.
Includes some cleanup and bug fixes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants