Phase II Patatrack Pixel Local Reco #36235

Merged (16 commits) on Dec 22, 2021


AdrianoDee
Contributor

AdrianoDee commented Nov 24, 2021

PR description:

This PR allows the Patatrack tracker local reco to run on the Phase2 tracker. A summary:

  • raising maxNumModules to 4000 so that the Phase2 tracker geometry can be accommodated;
  • introducing maxNumDigis (set based on ttbar events @ =200);
  • raising the number of layers to 28 for TrackingRecHit2DSOAView::PhiBinner. In principle this could be templated, but it seems to me that it would require too many specialized definitions;
  • adding m_nMaxModules (and its getter nMaxModules()) to TrackingRecHit2DSOAView and TrackingRecHit2DHeterogeneous to determine the sizes of the per-module hit structures; also added to the TrackingRecHit2DHeterogeneous constructor;
  • renaming phase1PixelTopology to pixelTopology and separating the two topologies into two namespaces (namely phase1PixelTopology and phase2PixelTopology);
  • splitting SiPixelRawToClusterCUDA into two branches to account for the digis already being available for Phase2;
  • adding Phase2 digi calibrations in gpuCalibPixel;
  • extending pixelCPEforGPU::CommonParams to include maxModuleStride and numberOfLaddersInBarrel, so that they can be propagated where needed, since they differ between Phase1 and Phase2;
  • extending pixelCPEforGPU::DetParams to include nRowsRoc, nColsRoc, nRows, nCols and numPixsInModule, since these are not constant among modules for Phase2;
  • fixing the tests throughout to take the changes into account;
  • adding an isUpgrade_ flag or a template parameter where needed.
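
To make the per-module parameters above more concrete, here is a minimal sketch of how the new DetParams fields relate to each other and how a module count could be checked against the raised maxNumModules. The field names and the value 4000 come from the summary; the namespace `sketch`, the field types, the helpers `makeDetParams` and `fitsModuleArrays`, and the Phase1-like chip layout (80 x 52 pixels per chip, 2 x 8 chips per module) used as example input are assumptions for illustration, not the code of this PR.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch only: field names follow the PR summary, while the
// types, the helpers and the namespace name are assumptions.
namespace sketch {
  // Capacity quoted in the PR summary.
  constexpr std::uint32_t maxNumModules = 4000;

  struct DetParams {
    std::uint32_t nRowsRoc;         // pixel rows in one readout chip
    std::uint32_t nColsRoc;         // pixel columns in one readout chip
    std::uint32_t nRows;            // pixel rows in the whole module
    std::uint32_t nCols;            // pixel columns in the whole module
    std::uint32_t numPixsInModule;  // nRows * nCols
  };

  // Hypothetical helper: derive the module-level sizes from the chip size
  // and the number of chips along each direction.
  constexpr DetParams makeDetParams(std::uint32_t nRowsRoc,
                                    std::uint32_t nColsRoc,
                                    std::uint32_t rocsAlongRows,
                                    std::uint32_t rocsAlongCols) {
    const std::uint32_t nRows = nRowsRoc * rocsAlongRows;
    const std::uint32_t nCols = nColsRoc * rocsAlongCols;
    return DetParams{nRowsRoc, nColsRoc, nRows, nCols, nRows * nCols};
  }

  // Hypothetical helper: true if per-module arrays sized with maxNumModules
  // can hold a geometry with nModules modules.
  constexpr bool fitsModuleArrays(std::uint32_t nModules) {
    return nModules <= maxNumModules;
  }

  // Example input (assumed Phase1-like layout): 160 x 416 pixels per module.
  constexpr DetParams phase1Like = makeDetParams(80, 52, 2, 8);
  static_assert(phase1Like.numPixsInModule == 160 * 416, "size mismatch");
}  // namespace sketch
```

Storing these values per module, rather than as compile-time constants, is what lets the same kernels handle modules of different sizes, which is the situation described for Phase2.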

PR validation:

A validation run via cms-patatrack/patatrack-validation on a P100 at T2@Bari is attached in the first comment.

Local tracker validation for Run3 samples (comparison 12_2_0_pre2 vs this PR for CPU/GPU):

Local tracker validation for Phase2 samples (comparison legacy reco from 12_2_0_pre2 vs this PR for CPU/GPU):

A simple customization function to run the new local reco on top of a generic workflow is available here. It can be tested, e.g., on step3 of workflow 38634.1.

This will conflict with #36215 and #36176. As soon as those are merged a fix will be committed here. In a following PR the modifications to Pixel Tracks reco will be addressed.

cc: @mtosi @vmariani @mmusich @VinInn
(probably you will be notified anyway)

@AdrianoDee
Contributor Author

Validation plots

/RelValTTbar_14TeV/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v2/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 11634.501
  • tracking validation plots and summary for workflow 11634.505
  • tracking validation plots and summary for workflow 11634.502
  • tracking validation plots and summary for workflow 11634.506

/RelValZMM_14/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 11634.501
  • tracking validation plots and summary for workflow 11634.505
  • tracking validation plots and summary for workflow 11634.502
  • tracking validation plots and summary for workflow 11634.506

Validation plots (CPU vs GPU)

/RelValTTbar_14TeV/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v2/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflows 11634.502 and 11634.501
  • tracking validation plots and summary for workflows 11634.506 and 11634.505

/RelValZMM_14/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflows 11634.502 and 11634.501
  • tracking validation plots and summary for workflows 11634.506 and 11634.505

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

scan-136.885502.png
zoom-136.885502.png
scan-136.885512.png
zoom-136.885512.png
scan-136.885522.png
zoom-136.885522.png

logs and nvprof/nvvp profiles

/RelValTTbar_14TeV/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v2/GEN-SIM-DIGI-RAW

  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.501
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.505
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.511
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.521
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.501
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.505
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.511
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.521
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors

/RelValZMM_14/CMSSW_12_1_0_pre5-PU_121X_mcRun3_2021_realistic_v15-v1/GEN-SIM-DIGI-RAW

  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.501
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.505
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.511
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.521
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.501
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.505
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.502
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.506
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.511
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.512
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • compute-sanitizer --tool initcheck (report, log) found some errors
    • compute-sanitizer --tool memcheck --leak-check full --report-api-errors all (report, log) found some errors
    • compute-sanitizer --tool synccheck (report, log) found some errors
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.521
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 11634.522
  • reference/CMSSW_12_2_X_2021-11-18-1100 release, workflow 136.885522
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 136.885502
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 136.885512
  • testing/CMSSW_12_2_X_2021-11-18-1100 release, workflow 136.885522

Logs

The full log is available at http://adiflori.web.cern.ch/pulls/b167b16f84d9e85571db9f294f87ce32e1532d56/log .

@cmsbuild
Contributor

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36235/26865

Code check has found code style and quality issues which could be resolved by applying the following patch(es)

@mmusich
Contributor

mmusich commented Nov 24, 2021

@emiglior FYI

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-36235/26866

@cmsbuild
Contributor

A new Pull Request was created by @AdrianoDee for master.

It involves the following packages:

  • CUDADataFormats/SiPixelCluster (heterogeneous, reconstruction)
  • CUDADataFormats/TrackingRecHit (heterogeneous, reconstruction)
  • CalibTracker/SiPixelLorentzAngle (alca)
  • Geometry/TrackerGeometryBuilder (geometry)
  • RecoLocalTracker/SiPixelClusterizer (reconstruction)
  • RecoLocalTracker/SiPixelRecHits (reconstruction)
  • RecoPixelVertexing/PixelTriplets (reconstruction)

@malbouis, @civanch, @yuanchao, @makortel, @cvuosalo, @fwyzard, @ianna, @mdhildreth, @cmsbuild, @Dr15Jones, @slava77, @jpata, @tvami, @francescobrivio can you please review it and eventually sign? Thanks.
@tvami, @fabiocos, @felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @bsunanda, @OzAmram, @tocheng, @ferencek, @mtosi, @gpetruc, @mmusich, @dkotlins, @threus, @dgulhan, @venturia this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Contributor

mmusich commented Nov 24, 2021

@srimanob FYI. It would eventually be nice to have a workflow to test Phase-2 GPU reconstruction.

@AdrianoDee
Contributor Author

@cmsbuild please test

@tvami
Contributor

tvami commented Nov 24, 2021

@cmsbuild please test

@AdrianoDee shouldn't "enable gpu" be added?

@AdrianoDee
Contributor Author

@cmsbuild please abort

@AdrianoDee
Contributor Author

enable gpu

@AdrianoDee
Contributor Author

@cmsbuild please test

@srimanob
Contributor

+Upgrade

@cvuosalo
Contributor

+1

@AdrianoDee AdrianoDee deleted the phaseII_patatrack_local_123X branch December 23, 2021 11:24

auto const& input = iEvent.get(pixelDigiToken_);

const TrackerGeometry* geom_ = &iSetup.getData(geomToken_);
Contributor

The esConsumes() is missing for geomToken_ and that is causing workflow 39434.502 step 3 to fail on a machine with a GPU (see #36604).
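
The failure mode described in this comment (a token that is used without having been registered first) can be illustrated with a toy model. Everything below is a stand-in written for this example: `toy::Token`, `toy::Registry`, `toy::Setup` and the two module structs are hypothetical and are not the CMSSW classes; only the member name geomToken_ is taken from the snippet under review. The sketch assumes that the framework rejects a default-constructed token at access time, which is modelled here with an exception.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-ins for illustration only; these are NOT the CMSSW classes.
namespace toy {
  // A token is usable only after a registration call has given it an index.
  class Token {
  public:
    Token() = default;
    explicit Token(int index) : index_(index) {}
    bool isInitialized() const { return index_ >= 0; }

  private:
    int index_ = -1;
  };

  // Plays the role of the registration call made in a module constructor.
  class Registry {
  public:
    Token consumes() { return Token(next_++); }

  private:
    int next_ = 0;
  };

  struct Geometry {
    int nModules;
  };

  // Plays the role of the event setup: it refuses unregistered tokens.
  class Setup {
  public:
    explicit Setup(Geometry geometry) : geometry_(geometry) {}
    const Geometry& getData(const Token& token) const {
      if (!token.isInitialized())
        throw std::logic_error("token used without prior registration");
      return geometry_;
    }

  private:
    Geometry geometry_;
  };

  // Forgets to register: geomToken_ stays default-constructed.
  struct BrokenModule {
    Token geomToken_;
    int produce(const Setup& setup) const { return setup.getData(geomToken_).nModules; }
  };

  // Registers its dependency in the constructor, before any use.
  struct FixedModule {
    explicit FixedModule(Registry& registry) : geomToken_(registry.consumes()) {}
    Token geomToken_;
    int produce(const Setup& setup) const { return setup.getData(geomToken_).nModules; }
  };

  // Helper for the example: true if produce() throws.
  template <typename Module>
  bool produceThrows(const Module& module, const Setup& setup) {
    try {
      module.produce(setup);
    } catch (const std::logic_error&) {
      return true;
    }
    return false;
  }
}  // namespace toy
```

In this model `BrokenModule` fails only when `produce()` is actually executed, which would be consistent with the problem showing up only in the workflow and on the machine where this code path runs.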

@jfernan2
Contributor

+1

@cmsbuild
Contributor

-1

Failed Tests: RelVals
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-227853/34793/summary.html
COMMIT: e63524a
CMSSW: CMSSW_13_3_X_2023-09-17-2300/el8_amd64_gcc11
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/36235/34793/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

ValueError: Undefined workflows: 39434.5, 39434.501, 39434.502

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 11 differences found in the comparisons
  • DQMHistoTests: Total files compared: 3
  • DQMHistoTests: Total histograms compared: 40118
  • DQMHistoTests: Total failures: 457
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 39661
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 2 files compared)
  • Checked 8 log files, 10 edm output root files, 3 DQM output files
  • TriggerResults: no differences found

@fwyzard
Contributor

fwyzard commented Jun 20, 2024

+heterogeneous

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (but tests are reportedly failing).

@cmsbuild
Contributor

cms-bot internal usage
