HCALDQM - Add occupancy-per-LS based ML monitoring plots #42212

Merged
merged 1 commit into from
Jul 27, 2023

Conversation

lwang046
Contributor

@lwang046 lwang046 commented Jul 7, 2023

PR description:

Add an HcalMLTask that runs ML models at the end of each LS for each cell. The model description can be found here.

The configurable flagDecisionThr is a sensitivity control for the anomaly detection that balances false negatives (failing to flag an anomalous channel) against false positives (mis-flagging a healthy channel). It is the threshold applied when generating channel anomaly status flags from the estimated channel anomaly score. A higher flagDecisionThr reduces the detection sensitivity (it captures extreme channel faults such as dead and hot channels but may miss some anomalies). In contrast, a lower flagDecisionThr increases the sensitivity (it also captures degrading channels but may increase false detections). Generally, the anomaly scores are proportional to channel deterioration (degrading, dead, and hot channel faults, respectively).
The generally recommended values for the latest models are between 10 and 30. If the AD model generates too many false flags, increase flagDecisionThr; if it misses too many actual anomalies, reduce flagDecisionThr. Note that tuning flagDecisionThr is not a repetitive task: it may need to be done only during the initial phase of deployment, not for every experiment.
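
The effect of the threshold can be illustrated with a minimal sketch. This is not the HcalMLTask implementation: the function name `flag_channels`, the strict `>` comparison, and the three example scores are assumptions made only for illustration. It shows how the same anomaly scores yield fewer flags as flagDecisionThr is raised within the recommended 10 to 30 range.

```python
def flag_channels(scores, flag_decision_thr):
    """Return 1 for each channel whose anomaly score exceeds the threshold, else 0.

    Hypothetical helper: the strict '>' comparison is an assumption,
    not taken from the PR code.
    """
    return [1 if score > flag_decision_thr else 0 for score in scores]


# Made-up per-channel anomaly scores: healthy, degrading, dead/hot.
scores = [2.0, 14.0, 55.0]

# Lower threshold: more sensitive, the degrading channel is flagged too.
low_thr_flags = flag_channels(scores, 10.0)

# Higher threshold: less sensitive, only the extreme fault is flagged.
high_thr_flags = flag_channels(scores, 30.0)

print(low_thr_flags, high_thr_flags)
```

With these made-up scores, the degrading channel is flagged only at the lower threshold, which is the trade-off between false negatives and false positives described above.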

PR validation:

Tested with runTheMatrix.py; the desired plots were added.

@cmsbuild
Contributor

cmsbuild commented Jul 7, 2023

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42212/36218

  • This PR adds an extra 26048KB to repository

  • Found files with invalid states:

    • DQM/HcalTasks/data/models/HB_2022/CGAE_MultiDim_SPATIAL_vONNX_RCLv22_PIXEL_BT_BN_RIN_IPHI_MED_7763_v06_02_2023_22h55_stateful.onnx:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Jul 7, 2023

-code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42212/36219

  • This PR adds an extra 26044KB to repository

  • Found files with invalid states:

    • DQM/HcalTasks/data/models/HB_2022/CGAE_MultiDim_SPATIAL_vONNX_RCLv22_PIXEL_BT_BN_RIN_IPHI_MED_7763_v06_02_2023_22h55_stateful.onnx:

Code check has found code style and quality issues which could be resolved by applying following patch(s)

@cmsbuild
Contributor

cmsbuild commented Jul 7, 2023

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42212/36220

  • This PR adds an extra 26052KB to repository

  • Found files with invalid states:

    • DQM/HcalTasks/data/models/HB_2022/CGAE_MultiDim_SPATIAL_vONNX_RCLv22_PIXEL_BT_BN_RIN_IPHI_MED_7763_v06_02_2023_22h55_stateful.onnx:

@cmsbuild
Contributor

cmsbuild commented Jul 7, 2023

A new Pull Request was created by @lwang046 for master.

It involves the following packages:

  • DQM/HcalTasks (dqm)
  • DQM/Integration (dqm)

@nothingface0, @emanueleusai, @cmsbuild, @pmandrik, @syuvivida, @tjavaid, @micsucmed, @rvenditti can you please review it and eventually sign? Thanks.
@DryRun, @threus, @bsunanda, @francescobrivio, @abdoulline, @batinkov this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@emanueleusai
Member

type hcal

@emanueleusai
Member

please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23dc62/33625/summary.html
COMMIT: 3466355
CMSSW: CMSSW_13_2_X_2023-07-10-2300/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/42212/33625/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 7 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3193892
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3193870
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

Member

@emanueleusai emanueleusai left a comment

Can these comments be removed?

DQM/HcalTasks/plugins/HcalMLTask.cc: 5 review threads (outdated, resolved)
@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-42212/36383

  • This PR adds an extra 24KB to repository

@cmsbuild
Contributor

Pull request #42212 was updated. @nothingface0, @emanueleusai, @cmsbuild, @pmandrik, @syuvivida, @tjavaid, @micsucmed, @rvenditti can you please check and sign again.

@srimanob
Contributor

test parameters:

@srimanob
Contributor

@cmsbuild please test

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23dc62/33884/summary.html
COMMIT: f4650a5
CMSSW: CMSSW_13_3_X_2023-07-25-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/42212/33884/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23dc62/33884/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-23dc62/33884/git-merge-result

Comparison Summary

Summary:

  • You potentially removed 13 lines from the logs
  • Reco comparison results: 1 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3150117
  • DQMHistoTests: Total failures: 3
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3150092
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@emanueleusai
Member

+1

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

@muleina

muleina commented Aug 31, 2023

A fix for 42445

9 participants