[14_0_X] DQM: temporary fix of the crash of hlt DQM client using TryToContinue #44652

Closed
wants to merge 3 commits into from


syuvivida
Contributor

PR description:

Starting with run 378981, the first run with 13.6 TeV collisions and stable beams, the client hlt_dqm_sourceclient-live_cfg.py has been crashing. More details are discussed in this GitHub issue. While the CMSSW core team, HLT, and DQM investigate the root cause, this PR adds a temporary fix that uses "TryToContinue" when a product is not found.
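
For readers unfamiliar with the option, here is a minimal sketch of what such a protection could look like in a client configuration. It assumes the standard `process.options.TryToContinue` untracked vstring and the `ProductNotFound` exception category; the helper name `enable_try_to_continue` is hypothetical, `cms` is passed in so that the sketch does not depend on a CMSSW environment at import time, and the actual change in this PR may differ.

```python
def enable_try_to_continue(process, cms, categories=("ProductNotFound",)):
    """Ask the framework to keep processing when an exception of one of
    the given categories is thrown, instead of stopping the job.

    `process` is the cms.Process of the client configuration and `cms` is
    the FWCore.ParameterSet.Config module (both supplied by the caller).
    """
    # Assumption: process.options already exists, as it does by default
    # in recent CMSSW releases.
    process.options.TryToContinue = cms.untracked.vstring(*categories)
    return process


# Hypothetical usage inside a CMSSW configuration file:
#   import FWCore.ParameterSet.Config as cms
#   process = cms.Process("DQM")
#   enable_try_to_continue(process, cms)
```

Because this setting only suppresses the failure, modules downstream of the missing product still need to cope with invalid handles, which is consistent with the messages reported in the validation section.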

PR validation:

This PR was tested on the P5 playback machines using streamer files containing the lumisections (LSs) of run 378981 in which this HLT client crashed. It has also been deployed in online production starting from run 379059, and no further crashes have been observed. However, since the root cause has not yet been resolved, messages such as the following can still appear:

%MSG-e TrackRefitter:  TrackRefitter:hltTrackRefitterForSiStripMonitorTrack  07-Apr-2024 17:04:25 CEST Run: 379059 Event: 86050
could not get the reco::TrackCollection.hltMergedTracks
%MSG
%MSG-e SiStripMonitorTrack:  SiStripMonitorTrack:HLTSiStripMonitorTrack  07-Apr-2024 17:04:25 CEST Run: 379059 Event: 86050
ClusterCollection is not valid!!
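
To track how often such messages appear once the protection is deployed, the message headers can be tallied per module. The following is a minimal sketch, assuming the log keeps the MessageLogger header layout shown in the excerpt above; the names `HEADER` and `count_messages` are hypothetical and not part of CMSSW.

```python
import re
from collections import Counter

# Header line of a message, following the layout of the excerpt above:
# %MSG-<severity> <category>:  <module>  <date> <time> <tz> Run: N Event: M
HEADER = re.compile(
    r"^%MSG-(?P<severity>\w)\s+(?P<category>[^:\s]+):\s+"
    r"(?P<module>\S+)\s+"
    r"(?P<timestamp>\d{2}-\w{3}-\d{4} \d{2}:\d{2}:\d{2}) \w+\s+"
    r"Run: (?P<run>\d+) Event: (?P<event>\d+)"
)


def count_messages(lines):
    """Count message headers per (severity, module, run).

    Lines that are not headers (message bodies, the closing %MSG marker,
    unrelated output) are ignored.
    """
    counts = Counter()
    for line in lines:
        match = HEADER.match(line)
        if match:
            key = (match["severity"], match["module"], int(match["run"]))
            counts[key] += 1
    return counts


SAMPLE = """\
%MSG-e TrackRefitter:  TrackRefitter:hltTrackRefitterForSiStripMonitorTrack  07-Apr-2024 17:04:25 CEST Run: 379059 Event: 86050
could not get the reco::TrackCollection.hltMergedTracks
%MSG
%MSG-e SiStripMonitorTrack:  SiStripMonitorTrack:HLTSiStripMonitorTrack  07-Apr-2024 17:04:25 CEST Run: 379059 Event: 86050
ClusterCollection is not valid!!
"""

if __name__ == "__main__":
    for key, n in count_messages(SAMPLE.splitlines()).items():
        print(key, n)
```

Keying on the module label rather than the message text keeps the tally stable even if the wording of a message changes between releases.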

@cmsbuild
Contributor

cmsbuild commented Apr 8, 2024

A new Pull Request was created by @syuvivida for CMSSW_14_0_X.

It involves the following packages:

  • DQM/Integration (dqm)

@cmsbuild, @rvenditti, @syuvivida, @tjavaid, @nothingface0, @antoniovagnerini can you please review it and eventually sign? Thanks.
@threus, @batinkov, @francescobrivio this is something you requested to watch as well.
@sextonkennedy, @antoniovilela, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@cmsbuild
Contributor

cmsbuild commented Apr 8, 2024

cms-bot internal usage

@tjavaid

tjavaid commented Apr 8, 2024

please test

@cmsbuild
Contributor

cmsbuild commented Apr 8, 2024

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-98fa28/38672/summary.html
COMMIT: 435ed1c
CMSSW: CMSSW_14_0_X_2024-04-07-2300/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/44652/38672/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 95 lines to the logs
  • Reco comparison results: 46 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3346242
  • DQMHistoTests: Total failures: 6
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3346214
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB (48 files compared)
  • Checked 205 log files, 166 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@cmsbuild
Contributor

Pull request #44652 was updated. @cmsbuild, @nothingface0, @antoniovagnerini, @rvenditti, @tjavaid, @syuvivida can you please check and sign again.

@cmsbuild
Contributor

Pull request #44652 was updated. @tjavaid, @antoniovagnerini, @syuvivida, @rvenditti, @nothingface0, @cmsbuild can you please check and sign again.

@syuvivida
Contributor Author

Hi, sorry, I was committing other code unrelated to the HLT clients. I will close this PR for now, since it is not clear whether we need to add this protection.

@syuvivida syuvivida closed this Apr 17, 2024