Add HLT-Scouting collections to `MINIAOD` event content (follow-up of #42863) #43327

missirol · 2023-11-18T18:36:14Z

PR description:

#42863 added the HLT-Scouting collections to the MINIAODSIM event content (i.e. MINIAOD content of standard MC samples). This PR suggests two improvements on top of #42863.

Add the HLT-Scouting collections to the MINIAOD event content (not only to MINIAODSIM), as per request of the Scouting group.
- Pro: for Primary Datasets (PDs, real data) whose RAW event content includes HLT-Scouting objects, those objects will also be included in MINIAOD. Right now, only one such PD exists, named ScoutingPFMonitor. It is used for offline studies related to Scouting, and the Scouting group currently uses a workflow with a "two-file solution" to access offline objects from MINIAOD and HLT-Scouting objects from AOD. Having both in MINIAOD will simplify this workflow significantly. To my knowledge, this change (adding HLT-Scouting to MINIAOD) has no impact on any other PD, as those PDs do not retain HLT-Scouting objects in RAW in the first place.
- Con 1: the size of the MINIAOD samples of the ScoutingPFMonitor PD will increase. This size increase has not been quantified. It is assumed to be at most 10% based on the checks done in #42863. Since this only applies to a single PD with a relatively low rate (below ~40 Hz during normal pp data-taking in 2023), I dare say this cost is rather small. For example, if I check the total size of all the Run2023 MINIAOD samples on DAS, I get 1.65 PB. If I restrict that to the ScoutingPFMonitor PDs, I get 2.8 TB (0.16% of the total).
```
# Total size (in bytes) of all valid Run2023 MINIAOD datasets
rm -f tmp.txt
for ddd in $(dasgoclient -query "dataset dataset=/*/*Run2023*/*MINIAOD* status=VALID"); do
  dasgoclient -query "file dataset=$ddd | sum(file.size)" >> tmp.txt
done
# Sum the second field of each line; printf avoids exponent notation for large totals
awk '{sum += $2} END {printf "%.0f\n", sum}' tmp.txt

# Same, restricted to the Scouting PDs
rm -f tmp.txt
for ddd in $(dasgoclient -query "dataset dataset=/*Scouting*/*Run2023*/*MINIAOD* status=VALID"); do
  dasgoclient -query "file dataset=$ddd | sum(file.size)" >> tmp.txt
done
awk '{sum += $2} END {printf "%.0f\n", sum}' tmp.txt
```
- Con 2: the size of MINIAOD samples derived from data tiers such as FEVTDEBUGHLT will also increase (again by a guesstimated ~10% or less). I do not know this kind of use case in detail. I see this happens, for example, in workflows such as 141.001, where there is a reHLT step on data with --eventcontent FEVTDEBUGHLT (followed by a 2nd step with RECO, MINI, NANO, etc). Here too, I would guess this use case is limited, and the overall cost of this increase could be considered small.
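
The size estimate under Con 1 can be reproduced offline with a few lines of Python. This is only a sketch: it assumes each line written to `tmp.txt` ends with the size in bytes (which is what the `$2` in the awk command suggests), and the sample lines below are hypothetical values chosen to match the rounded 1.65 PB and 2.8 TB figures, not real DAS output. The function name `total_bytes` is made up for this illustration.

```python
# Sketch: sum per-dataset sizes and compare two totals.
# Assumption: every non-empty line ends with a size in bytes,
# e.g. "sum(file.size): 123".

def total_bytes(lines):
    """Sum the last whitespace-separated field of every non-empty line."""
    total = 0
    for line in lines:
        fields = line.split()
        if fields:
            total += int(fields[-1])
    return total

# Hypothetical sample lines, not real DAS output.
all_miniaod = [
    "sum(file.size): 1000000000000000",
    "sum(file.size): 650000000000000",
]
scouting_miniaod = ["sum(file.size): 2800000000000"]

total_all = total_bytes(all_miniaod)            # 1.65e15 bytes
total_scouting = total_bytes(scouting_miniaod)  # 2.8e12 bytes

fraction = total_scouting / total_all
# Upper bound on the added volume, using the assumed at-most-10% increase.
extra_upper_bound = 0.10 * total_scouting

print(f"Scouting share of MINIAOD: {100 * fraction:.2f}%")
print(f"Upper bound on added volume: {extra_upper_bound / 1e12:.2f} TB")
```

With the rounded inputs the share lands close to the 0.16% quoted above; the exact value depends on the unrounded sizes returned by DAS.
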
Integrate this better in the way HLT currently provides collections to the 'central' event contents in CMSSW. This PR defines a PSet HLTriggerMINIAOD in HLTrigger/Configuration (in this PR, HLTriggerMINIAOD includes only the HLT-Scouting event content), similarly to the way HLTriggerAOD and others are defined. This part of the PR is purely technical: it is just meant to homogenise how extra HLT-related collections are inserted in different data tiers.
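
To illustrate the idea of a dedicated HLT block feeding a central event content, here is a plain-Python stand-in. It does not use `FWCore.ParameterSet.Config` and is not the code of this PR: dictionaries play the role of PSets, the helper `extend_output_commands` is hypothetical, and every collection name is a placeholder. Skipping duplicate commands is a choice made for this sketch only.

```python
# Plain-Python stand-in for event-content PSets (no CMSSW import needed).
# All collection names below are illustrative placeholders.

def extend_output_commands(event_content, *extra_blocks):
    """Append the outputCommands of each extra block, skipping duplicates."""
    commands = event_content["outputCommands"]
    for block in extra_blocks:
        for command in block["outputCommands"]:
            if command not in commands:
                commands.append(command)
    return event_content

# Hypothetical HLT-side block, analogous in role to HLTriggerMINIAOD.
hlt_trigger_miniaod = {
    "outputCommands": [
        "keep *_hltScoutingExamplePacker_*_*",  # placeholder name
    ]
}

# Hypothetical central event contents before the HLT block is added.
miniaod = {"outputCommands": ["drop *", "keep *_slimmedMuons_*_*"]}
miniaodsim = {
    "outputCommands": [
        "drop *",
        "keep *_slimmedMuons_*_*",
        "keep *_prunedGenParticles_*_*",
    ]
}

# The same HLT block is appended to both data and MC contents.
for content in (miniaod, miniaodsim):
    extend_output_commands(content, hlt_trigger_miniaod)

print(miniaod["outputCommands"])
print(miniaodsim["outputCommands"])
```

The point of the pattern is that the list of HLT collections lives in one place on the HLT side, and each data tier only references it.
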

HLTrigger_EventContent_cff.py was not modified directly, but recreated by running an updated version of HLTrigger/Configuration/test/getEventContent.py.

If approved, I would suggest backporting this PR to CMSSW_13_3_X to keep HLTrigger/Configuration as similar as possible in 13_3_X (currently used for HLT-menu development) and later cycles (and to cover the unlikely scenario of taking data relevant to Scouting in 2024 with 13_3_X).

(Since changes to HLTrigger/Configuration/test are normally done only by @cms-sw/hlt-l2, I could also close this PR and let this update be done by TSG/STORM in one of the next HLT PRs.)

Attn: @elfontan @kelmorab (TSG/Scouting conveners)

PR validation:

Ran a couple of runTheMatrix.py workflows for Run-3 data and MC, and checked that the HLT-Scouting collections are present in the MINIAOD(SIM) outputs.
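
A presence check of this kind could be scripted roughly as follows. This is a sketch, not the procedure actually used for the validation: it assumes the branch names of the output file have already been dumped to a list of strings, the helper `scouting_branches` is hypothetical, and the branch names shown are made-up examples.

```python
from fnmatch import fnmatchcase

def scouting_branches(branch_names, pattern="*hltScouting*"):
    """Return the branch names matching the (case-sensitive) pattern."""
    return [name for name in branch_names if fnmatchcase(name, pattern)]

# Hypothetical branch names, for illustration only.
branches = [
    "patMuons_slimmedMuons__RECO.",
    "Run3ScoutingMuons_hltScoutingMuonPacker__HLT.",
    "Run3ScoutingParticles_hltScoutingPFPacker__HLT.",
]

found = scouting_branches(branches)
if found:
    print("HLT-Scouting branches present:")
    for name in found:
        print("  " + name)
else:
    print("No HLT-Scouting branches found")
```
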

If this PR is a backport, please specify the original PR and why you need to backport that PR. If this PR will be backported, please specify to which release cycle the backport is meant for:

CMSSW_13_3_X

cmsbuild · 2023-11-18T18:41:34Z

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-43327/37777

This PR adds an extra 24KB to repository
There are other open Pull requests which might conflict with changes you have proposed:
- File Configuration/EventContent/python/EventContent_cff.py modified in PR(s): New json dump #43318
- File HLTrigger/Configuration/python/HLTrigger_EventContent_cff.py modified in PR(s): New json dump #43318

cmsbuild · 2023-11-18T18:41:58Z

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

Configuration/EventContent (operations)
HLTrigger/Configuration (hlt)

@cmsbuild, @rappoccio, @mmusich, @fabiocos, @antoniovilela, @davidlange6, @Martin-Grunewald can you please review it and eventually sign? Thanks.
@fabiocos, @Martin-Grunewald, @silviodonato this is something you requested to watch as well.
@rappoccio, @sextonkennedy, @antoniovilela you are the release manager for this.

cms-bot commands are listed here

mmusich · 2023-11-18T19:03:07Z

@cmsbuild please test

cmsbuild · 2023-11-18T21:50:22Z

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-29b123/35941/summary.html
COMMIT: 86ee6b5
CMSSW: CMSSW_14_0_X_2023-11-18-1100/el8_amd64_gcc12
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/43327/35941/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

There are some workflows for which there are errors in the baseline:
141.001 step 2
141.008505 step 2
141.008521 step 2
141.112 step 2
141.11 step 2
The results for the comparisons for these workflows could be incomplete
This means most likely that the IB is having errors in the relvals. The error does NOT come from this pull request

Summary:

You potentially removed 365 lines from the logs
ROOTFileChecks: Some differences in event products or their sizes found
Reco comparison results: 141 differences found in the comparisons
DQMHistoTests: Total files compared: 50
DQMHistoTests: Total histograms compared: 3363868
DQMHistoTests: Total failures: 2389
DQMHistoTests: Total nulls: 0
DQMHistoTests: Total successes: 3361457
DQMHistoTests: Total skipped: 22
DQMHistoTests: Total Missing objects: 0
DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
Checked 214 log files, 167 edm output root files, 50 DQM output files
TriggerResults: no differences found

Martin-Grunewald · 2023-11-20T07:42:33Z

@missirol
Hmm, on one side I see the solution adopted here given the precedent. On the other side it looks like HLT is (ab)used to solve offline problems. Since it is straightforward and I do not have a better solution, OK, let's go ahead.
Could you please make the 13_3 backport PR?
Thanks!

mmusich · 2023-11-20T09:51:33Z

+hlt

as per #43327 (comment), and M.M. deeming it uncontroversial.

rappoccio · 2023-11-20T14:31:55Z

+1

cmsbuild · 2023-11-20T14:32:16Z

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

add HLT-Scouting collections to MINIAOD event content (follow-up of c…

86ee6b5

…ms-sw#42863)

cmsbuild added this to the CMSSW_14_0_X milestone Nov 18, 2023

cmsbuild added hlt-pending operations-pending pending-signatures tests-pending orp-pending code-checks-pending labels Nov 18, 2023

cmsbuild added code-checks-approved and removed code-checks-pending labels Nov 18, 2023

cmsbuild added tests-started and removed tests-pending labels Nov 18, 2023

cmsbuild added tests-approved and removed tests-started labels Nov 18, 2023

cmsbuild added hlt-approved and removed hlt-pending labels Nov 20, 2023

missirol mentioned this pull request Nov 20, 2023

Add HLT-Scouting collections to MINIAOD event content (follow-up of #42863) [13_3_X] #43331

Merged

cmsbuild added operations-approved fully-signed orp-approved and removed operations-pending pending-signatures orp-pending labels Nov 20, 2023

cmsbuild merged commit 3f852f7 into cms-sw:master Nov 20, 2023
11 checks passed

cmsbuild mentioned this pull request Nov 20, 2023

DisplacedRegionalStep track DNN #43336

Merged

missirol deleted the devel_hltScoutingInMINIAOD branch September 13, 2024 07:32
