HBHEGPU: added taggedBadByDb by ChannelQuality in GPU [12_0_X] #35355

Merged: 2 commits into cms-sw:CMSSW_12_0_X on Sep 30, 2021

Contributor

@mariadalfonso mariadalfonso commented Sep 21, 2021

This PR adds functionality to Mahi on GPU that takes HcalChannelQuality into account, to resynchronize with what is done on CPU.

On CPU the rechit is dropped, while on GPU its energy is set to 0.
Tested on Run3 MC, disabling ieta=18, depth 1. (Those channels are present only at DIGI level, but the fiber is not connected to the scintillator, so the energy is not meaningful.)

This helps resolve some of the differences observed with the Run3 HLT menu.
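
To make the CPU/GPU difference concrete, here is a minimal sketch in plain Python (not CMSSW code). The names `RecHit`, `is_tagged_bad`, `reco_cpu` and `reco_gpu`, and the `(ieta, depth)` channel key, are hypothetical and chosen only for illustration. The only behaviour taken from the description is that the CPU drops a rechit whose channel is tagged bad in the DB, while the GPU keeps it with energy 0.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for a rechit: channel coordinates and energy.
@dataclass(frozen=True)
class RecHit:
    ieta: int
    depth: int
    energy: float

def is_tagged_bad(hit, bad_channels):
    """True if the channel is listed as bad (stand-in for the ChannelQuality lookup)."""
    return (hit.ieta, hit.depth) in bad_channels

def reco_cpu(hits, bad_channels):
    """CPU-like behaviour: rechits on bad channels are dropped."""
    return [h for h in hits if not is_tagged_bad(h, bad_channels)]

def reco_gpu(hits, bad_channels):
    """GPU-like behaviour: rechits on bad channels are kept with energy 0."""
    return [replace(h, energy=0.0) if is_tagged_bad(h, bad_channels) else h
            for h in hits]

# Made-up input; the bad channel mirrors the ieta=18, depth 1 case.
hits = [RecHit(17, 1, 2.5), RecHit(18, 1, 7.0), RecHit(18, 2, 1.2)]
bad = {(18, 1)}

cpu_hits = reco_cpu(hits, bad)
gpu_hits = reco_gpu(hits, bad)
```

In this sketch the two collections differ in size (the GPU output keeps one entry per input channel), but the summed energy is the same, which is the sense in which the two paths are brought back in sync.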

backport of #35357

@fwyzard @abdoulline @silviodonato

@cmsbuild
Contributor

cmsbuild commented Sep 21, 2021

A new Pull Request was created by @mariadalfonso for CMSSW_12_0_X.

It involves the following packages:

  • CondFormats/HcalObjects (db, alca)
  • RecoLocalCalo/HcalRecProducers (reconstruction)

@malbouis, @yuanchao, @cmsbuild, @jpata, @slava77, @ggovi, @francescobrivio, @tvami can you please review it and eventually sign? Thanks.
@apsallid, @tocheng, @bsunanda, @mmusich, @abdoulline, @seemasharmafnal this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@tvami
Contributor

tvami commented Sep 21, 2021

enable gpu

@tvami
Contributor

tvami commented Sep 21, 2021

@mariadalfonso is there a specific relval wf that tests this?

@mariadalfonso
Contributor Author

@mariadalfonso is there a specific relval wf that tests this?

11634.522: this is TTbar_14TeV, 2021, HCALOnlyGPU

@tvami
Contributor

tvami commented Sep 21, 2021

test parameters:

  • workflows = 11634.522

@tvami
Contributor

tvami commented Sep 21, 2021

@cmsbuild , please test

@abdoulline

@mariadalfonso Thank you, Maria.
There is a small typo in the introduction:
<...> On CPU the rechit is dropped, while for the CPU energy is set to 0 <...>
the latter should apparently be GPU


cms::cuda::ESProduct<Product> product_;
#endif // __CUDACC__
};
Contributor

Sorry for the (maybe) naive question, but should this object be COND_SERIALIZABLE ?

Contributor

@mariadalfonso @fwyzard please have a look at this comment

Contributor Author

CondFormats/HcalObjects/interface/HcalChannelQuality.h is already COND_SERIALIZABLE.
HcalChannelQualityGPU.h is not, but these GPU headers usually include the main .h.
This is a common pattern for all conditions; if this change is needed, it should be done in sync for all of ECAL/HCAL/TRK ... and outside of this PR.

Contributor

Indeed, I don't think these conditions need to be declared COND_SERIALIZABLE.
They are never persisted or read from the database - they are only transient conditions, derived from the persistent ones and "repackaged" for use on GPUs.
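
The distinction can be sketched as follows, in plain Python rather than CMSSW C++. `PersistentChannelQuality`, `make_gpu_view` and the flat-list layout are hypothetical names and choices made for illustration only. The point is just that the GPU-side object is rebuilt in memory from the persistent one and is never itself written to or read from the database, so only the persistent class needs to be serializable.

```python
from dataclasses import dataclass, field

@dataclass
class PersistentChannelQuality:
    """Stand-in for the persisted (serializable) condition: status word per channel id."""
    status: dict = field(default_factory=dict)

def make_gpu_view(cond, channel_ids):
    """Repackage the persistent condition as a flat list ordered like channel_ids.

    The result is transient: it is derived on the fly and never stored.
    Channels missing from the condition get status 0 (an assumption of this sketch).
    """
    return [cond.status.get(cid, 0) for cid in channel_ids]

# Made-up channel ids and status words.
cond = PersistentChannelQuality(status={101: 0, 102: 1, 103: 0})
flat_status = make_gpu_view(cond, [101, 102, 103, 104])
```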

Contributor

ok thanks for the explanation @mariadalfonso and @fwyzard, makes sense!

@cmsbuild
Contributor

-1

Failed Tests: RelVals
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4950a8/18791/summary.html
COMMIT: 25187e9
CMSSW: CMSSW_12_0_X_2021-09-21-1100/slc7_amd64_gcc900
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35355/18791/install.sh to create a dev area with all the needed externals and cmssw changes.

RelVals

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19735
  • DQMHistoTests: Total failures: 10
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19725
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

@tvami
Contributor

tvami commented Sep 21, 2021

@mariadalfonso it would have been better to start with master, or at least to wait with the second PR until the tests and review finish.
Anyway,

@mariadalfonso
Contributor Author

mariadalfonso commented Sep 21, 2021

-1

Failed Tests: RelVals

Not sure why; I tested 11634.522 (GPU) locally and compared it with 11634.521 (CPU).

Locally, the workflow is discoverable like this: runTheMatrix.py -w gpu -l 11634.522

Traceback (most recent call last):
  File "/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35355/18791/CMSSW_12_0_X_2021-09-21-1100/bin/slc7_amd64_gcc900/runTheMatrix.py", line 545, in <module>
    ret = runSelected(opt)
  File "/cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/35355/18791/CMSSW_12_0_X_2021-09-21-1100/bin/slc7_amd64_gcc900/runTheMatrix.py", line 31, in runSelected
    if len(undefSet)>0: raise ValueError('Undefined workflows: '+', '.join(map(str,list(undefSet))))
ValueError: Undefined workflows: 11634.522
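
The error comes from the check visible in the traceback: any requested workflow that is not in the set of defined workflows is reported as undefined. A simplified sketch of that logic is below. `find_undefined`, `run_selected` and the two example workflow sets are hypothetical and are not the actual runTheMatrix.py code. The sketch assumes, as the local command above suggests, that 11634.522 is only defined when the GPU set of workflows is selected.

```python
def find_undefined(requested, defined):
    """Return the requested workflow ids that are not defined, sorted."""
    return sorted(set(requested) - set(defined))

def run_selected(requested, defined):
    """Raise ValueError for undefined workflows, mirroring the check in the traceback."""
    undef = find_undefined(requested, defined)
    if undef:
        raise ValueError('Undefined workflows: ' + ', '.join(map(str, undef)))
    return list(requested)

# Hypothetical workflow sets: the GPU variant exists only in the GPU matrix.
standard_matrix = {11634.0, 11634.521}
gpu_matrix = {11634.522}

# Requesting the GPU workflow against the standard matrix fails...
try:
    run_selected([11634.522], standard_matrix)
    message = None
except ValueError as err:
    message = str(err)

# ...while the same request against the GPU matrix is accepted.
selected = run_selected([11634.522], gpu_matrix)
```

Under that assumption, a test request that names the workflow without also selecting the GPU matrix would fail in exactly this way.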

@mariadalfonso
Contributor Author

  • is this needed for the test beams in October?

please ask @fwyzard

@mariadalfonso mariadalfonso changed the title from "HBHEGPU: added taggedBadByDb by ChannelQuality in GPU" to "HBHEGPU: added taggedBadByDb by ChannelQuality in GPU [12_0_X]" on Sep 21, 2021
@fwyzard
Contributor

fwyzard commented Sep 21, 2021

is this needed for the test beams in October?

Yes.

@fwyzard
Contributor

fwyzard commented Sep 21, 2021

test parameters:

* workflows = 11634.522

I think you need to enable the gpu tests

@tvami
Contributor

tvami commented Sep 21, 2021

test parameters:

* workflows = 11634.522

I think you need to enable the gpu tests

I thought I did that here: #35355 (comment)

@mmusich
Contributor

mmusich commented Sep 21, 2021

I thought I did that here: #35355 (comment)

I think the matrix should be told to use the gpu part of it. relvals_options = --what gpu or something like that.

@tvami
Contributor

tvami commented Sep 21, 2021

test parameters:

  • workflows_gpu = 11634.522
  • enable_test = gpu
  • relvals_options = --what gpu

@cmsbuild
Contributor

Pull request #35355 was updated. @malbouis, @yuanchao, @missirol, @Martin-Grunewald, @jpata, @cmsbuild, @slava77, @ggovi, @francescobrivio, @tvami can you please check and sign again.

@tvami
Contributor

tvami commented Sep 28, 2021

@cmsbuild , please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-4950a8/19215/summary.html
COMMIT: 82a7d35
CMSSW: CMSSW_12_0_X_2021-09-28-1100/slc7_amd64_gcc900
Additional Tests: GPU
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35355/19215/install.sh to create a dev area with all the needed externals and cmssw changes.

GPU Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 4
  • DQMHistoTests: Total histograms compared: 19735
  • DQMHistoTests: Total failures: 10
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 19725
  • DQMHistoTests: Total skipped: 0
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 3 files compared)
  • Checked 12 log files, 9 edm output root files, 4 DQM output files
  • TriggerResults: no differences found

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 39
  • DQMHistoTests: Total histograms compared: 2998564
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2998536
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 38 files compared)
  • Checked 165 log files, 37 edm output root files, 39 DQM output files
  • TriggerResults: no differences found

@tvami
Contributor

tvami commented Sep 29, 2021

+alca

  • differences seen for HCAL GPU wf are expected

@tvami
Contributor

tvami commented Sep 29, 2021

+db

@missirol
Contributor

+hlt

@jpata
Contributor

jpata commented Sep 29, 2021

+reconstruction

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next CMSSW_12_0_X IBs (tests are also fine) and once validation in the development release cycle CMSSW_12_1_X is complete. This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy
Contributor

qliphy commented Sep 30, 2021

+1

@cmsbuild cmsbuild merged commit 91c9ba0 into cms-sw:CMSSW_12_0_X Sep 30, 2021