Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update GlobalAlgBlkUnpacker to throw error in the case of headers received in the wrong order #41401

Merged
merged 1 commit into from
Apr 25, 2023

Conversation

aloeliger
Copy link
Contributor

@aloeliger aloeliger commented Apr 25, 2023

PR description:

This PR has been opened in response to L1 JIRA ticket 411. HLT has noticed problems regarding the L1TRawToDigi process, which has been tracked to an issue in the GlobalAlgBlk unpacker receiving blocks and header IDs out of order. This causes a bad access/->at call,of a following BXVector, which fails somewhat un-informatively. This PR has been opened to produce a more informative failure message in case the initial header ID assumption fails.

@dinyar @kbunkow @missirol @alintulu I figured you would like to be kept in the loop on this.

PR validation:

All code compiles, has had code-formatting applied, and passes code checks. When testing on error events from the open JIRA ticket, the cms::exception reports instead of bad vector access errors.

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

This PR is not a backport, however a backport may be expected for CMSSW_13_0.

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41401/35266

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @aloeliger (Andrew Loeliger) for master.

It involves the following packages:

  • EventFilter/L1TRawToDigi (l1)

@epalencia, @cmsbuild, @cecilecaillol, @aloeliger can you please review it and eventually sign? Thanks.
@dinyar, @missirol, @Martin-Grunewald, @thomreis, @eyigitba this is something you requested to watch as well.
@perrotta, @dpiparo, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@aloeliger
Copy link
Contributor Author

please test

@aloeliger
Copy link
Contributor Author

please abort

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-41401/35267

  • This PR adds an extra 16KB to repository

@cmsbuild
Copy link
Contributor

Pull request #41401 was updated. @epalencia, @cmsbuild, @cecilecaillol, @aloeliger can you please check and sign again.

@aloeliger
Copy link
Contributor Author

please test

@dinyar
Copy link
Contributor

dinyar commented Apr 25, 2023

Hi Andrew, this looks good to me! One thing I noticed during debugging is that https://github.com/cms-sw/cmssw/blob/CMSSW_13_0_3/EventFilter/L1TRawToDigi/plugins/L1TRawToDigi.cc#L269-L276 are LogDebug statements, even though they may indicate quite serious issues. I was wondering if those shouldn't be upgraded to LogWarnings, what do you (or @missirol) think?

Cheers,
Dinyar

@aloeliger
Copy link
Contributor Author

@dinyar In principle I would agree, but that is probably not for this PR. I also noticed that there are some TODO statements there about handling the error. I don't know how one would handle that, but @bundocka is actually listed as the author of that particular block, so perhaps we can ask him to clarify if that should be changed to an exception, or if there was ever planned error handling of that section that simply never got implemented.

@missirol
Copy link
Contributor

I was wondering if those shouldn't be upgraded to LogWarnings, what do you (or @missirol) think?

I know too little about the L1T unpackers to judge this, but I agree those likely need to be changed to at least LogWarning.

but that is probably not for this PR

I agree with this, too. The reason is that it would be useful to have the backport included in the next 13_0_X release, which means getting #41402 integrated asap.

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-c28387/32129/summary.html
COMMIT: f93445c
CMSSW: CMSSW_13_1_X_2023-04-25-1100/el8_amd64_gcc11
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/41401/32129/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • You potentially added 16 lines to the logs
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3460685
  • DQMHistoTests: Total failures: 0
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3460663
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 207 log files, 159 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@aloeliger
Copy link
Contributor Author

aloeliger commented Apr 25, 2023

+l1

  • Backport desired for data-taking
  • Updates error message to be informative

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

+1

@cmsbuild cmsbuild merged commit 0c3a8c6 into cms-sw:master Apr 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants