improve replacement of OutputModules in HLT ConfDB utilities #39177

Merged
merged 1 commit into from
Aug 26, 2022

Conversation

missirol
Contributor

PR description:

This PR makes a small improvement to how the HLT utilities customise, for offline use, the OutputModules defined in ConfDB configurations. The current implementation assumes specific values for module parameters such as compression_algorithm (and others), which is unnecessary.

Merely technical. Intended to be fully backward compatible. No changes expected in outputs of PR tests.

Tagging @fwyzard and @Sam-Harper to review.

PR validation:

Manual tests on a limited number of HLT menus.

If this PR is a backport, please specify the original PR and why you need to backport that PR. If this PR will be backported, please specify to which release cycle the backport is meant for:

CMSSW_12_4_X

),
\g<8>
""", self.data, 0, re.DOTALL)

Contributor Author

I used regular strings as it seemed more readable. It can be reverted to raw strings if this is preferred.

Contributor

Raw strings tend to be easier to use with regexp because of the escape sequences, but if those are not a problem, regular strings are fine.
Note that IIRC we can also have multi-line raw strings, e.g.

r"""
...
"""

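To make the trade-off concrete, here is a minimal sketch comparing the two spellings. The pattern is a shortened, hypothetical stand-in for the real one in confdb.py. It also shows why the `"""\` trick does not carry over to raw strings: there, the backslash-newline is kept as two literal characters instead of acting as a line continuation.

```python
import re

# Regular triple-quoted string: the trailing backslash on the first line is a
# line continuation, and regex escapes are written with doubled backslashes.
regular = """\
\\bhltOutput(\\w+)"""

# Raw triple-quoted string: backslashes reach the regex engine untouched.
raw = r"""\bhltOutput(\w+)"""

# Both spellings produce the same pattern text.
same_pattern = (regular == raw)

# In a raw string the backslash-newline is NOT a continuation: the string
# starts with a literal backslash followed by a newline.
raw_with_backslash = r"""\
\bhltOutput(\w+)"""
starts_with_backslash_newline = raw_with_backslash.startswith("\\\n")

# Hypothetical input line, for illustration only.
match = re.search(raw, "process.hltOutputDQM = cms.OutputModule(")
```

With a raw string, the pattern therefore has to start right after the opening `r"""`, which is the readability cost mentioned in this thread.
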
Contributor

Just a wish, not even a feature request... could you look into reusing the same compression algorithm and level in the ROOT output as well?

@cmsbuild
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-39177/31770

  • This PR adds an extra 24KB to repository

@cmsbuild
Contributor

A new Pull Request was created by @missirol (Marino Missiroli) for master.

It involves the following packages:

  • HLTrigger/Configuration (hlt)

@cmsbuild, @missirol, @Martin-Grunewald can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @silviodonato this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy, @rappoccio you are the release manager for this.

cms-bot commands are listed here

@missirol
Contributor Author

please test

After more checks, I didn't find issues, so I'll start moving forward.

@cmsbuild
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b9f2e9/27093/summary.html
COMMIT: 1f6ea3c
CMSSW: CMSSW_12_5_X_2022-08-25-1100/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week1/cms-sw/cmssw/39177/27093/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 4 differences found in the comparisons
  • DQMHistoTests: Total files compared: 51
  • DQMHistoTests: Total histograms compared: 3693084
  • DQMHistoTests: Total failures: 8
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3693054
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 50 files compared)
  • Checked 212 log files, 49 edm output root files, 51 DQM output files
  • TriggerResults: no differences found

@missirol
Contributor Author

+hlt

@cmsbuild
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy, @rappoccio (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Contributor

+1

@cmsbuild cmsbuild merged commit d64bcb8 into cms-sw:master Aug 26, 2022
@missirol missirol deleted the devel_hltOutputModuleRepl branch August 26, 2022 08:37
self.data = re.sub("""\
\\b(process\.)?hltOutput(\w+) *= *cms\.OutputModule\( *"(EvFOutputModule|GlobalEvFOutputModule)" *,
use_compression = cms.untracked.bool\( (True|False) \),
compression_algorithm = cms.untracked.string\( "(.+?)" \),
Contributor

I'd change " to ['"] since ' is a valid delimiter for strings
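
A minimal sketch of the effect, using a shortened, hypothetical version of the pattern (only the module-type part) and made-up input lines:

```python
import re

# Shortened, hypothetical pattern: accepts either quote character around the
# module type instead of assuming double quotes.
pattern = re.compile(
    r"""cms\.OutputModule\( *['"](EvFOutputModule|GlobalEvFOutputModule)['"] *,"""
)

# Illustrative inputs, not taken from a real HLT menu.
double_quoted = 'cms.OutputModule( "EvFOutputModule",'
single_quoted = "cms.OutputModule( 'GlobalEvFOutputModule',"

match_double = pattern.search(double_quoted)
match_single = pattern.search(single_quoted)
```

Note that `['"]` does not require the closing quote to match the opening one. Enforcing that would need a backreference to an extra capturing group, which would shift the group numbers used in the replacement string.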

@missirol
Contributor Author

@fwyzard, thanks for having a look.

Trying to address your comments at once, we could do the following:

diff --git a/HLTrigger/Configuration/python/Tools/confdb.py b/HLTrigger/Configuration/python/Tools/confdb.py
index 18c5db0610b..d8ef59d7da5 100644
--- a/HLTrigger/Configuration/python/Tools/confdb.py
+++ b/HLTrigger/Configuration/python/Tools/confdb.py
@@ -479,16 +479,18 @@ from HLTrigger.Configuration.CustomConfigs import L1REPACK
       )
 
       self.data = re.sub("""\
-\\b(process\.)?hltOutput(\w+) *= *cms\.OutputModule\( *"(EvFOutputModule|GlobalEvFOutputModule)" *,
+\\b(process\.)?hltOutput(\w+) *= *cms\.OutputModule\( *['"](EvFOutputModule|GlobalEvFOutputModule)['"] *,
     use_compression = cms.untracked.bool\( (True|False) \),
-    compression_algorithm = cms.untracked.string\( "(.+?)" \),
+    compression_algorithm = cms.untracked.string\( ['"](.+?)['"] \),
     compression_level = cms.untracked.int32\( (-?\d+) \),
     lumiSection_interval = cms.untracked.int32\( (-?\d+) \),
 (.+?),
-    psetMap = cms.untracked.InputTag\( "hltPSetMap" \)
+    psetMap = cms.untracked.InputTag\( ['"]hltPSetMap['"] \)
 ""","""\
-\g<1>hltOutput\g<2> = cms.OutputModule( "PoolOutputModule",
+%(process)s.hltOutput\g<2> = cms.OutputModule( "PoolOutputModule",
     fileName = cms.untracked.string( "output\g<2>.root" ),
+    compressionAlgorithm = cms.untracked.string( "\g<5>" ),
+    compressionLevel = cms.untracked.int32( \g<6> ),
     fastCloning = cms.untracked.bool( False ),
     dataset = cms.untracked.PSet(
         filterName = cms.untracked.string( "" ),

A couple of notes:

  • Indeed, I hadn't thought of multi-line raw strings; I learned something new. What I still prefer about standard strings is that one can use """\ on the very first line, to indent that line in a slightly more readable way (I couldn't find a way to do the same with raw strings).
  • Also, I changed \g<1> to %(process); maybe [*] it doesn't make a difference here, but it looks like the correct thing to do.

[*] IIRC, %(process) and \g<1> both correspond to process except when --fragment/--cff is used, but in that case the output modules are dropped from the configuration.
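
As a rough illustration of how the captured compression settings are carried over into the PoolOutputModule, here is a minimal sketch. It is not the actual confdb.py code: the input is a hypothetical, truncated ConfDB-style dump, the pattern covers only the first few parameters, and the replacement keeps \g<1> for simplicity.

```python
import re

# Hypothetical, truncated ConfDB-style module definition (illustration only).
confdb_dump = '''\
process.hltOutputDQM = cms.OutputModule( "GlobalEvFOutputModule",
    use_compression = cms.untracked.bool( True ),
    compression_algorithm = cms.untracked.string( "ZSTD" ),
    compression_level = cms.untracked.int32( 3 ),
)
'''

# Groups: 1 = optional "process.", 2 = stream name, 3 = module type,
#         4 = use_compression, 5 = algorithm, 6 = level.
pattern = r"""\b(process\.)?hltOutput(\w+) *= *cms\.OutputModule\( *['"](EvFOutputModule|GlobalEvFOutputModule)['"] *,
    use_compression = cms\.untracked\.bool\( (True|False) \),
    compression_algorithm = cms\.untracked\.string\( ['"](.+?)['"] \),
    compression_level = cms\.untracked\.int32\( (-?\d+) \),
"""

# Groups 5 and 6 are reused instead of hard-coding the compression settings.
replacement = r"""\g<1>hltOutput\g<2> = cms.OutputModule( "PoolOutputModule",
    fileName = cms.untracked.string( "output\g<2>.root" ),
    compressionAlgorithm = cms.untracked.string( "\g<5>" ),
    compressionLevel = cms.untracked.int32( \g<6> ),
"""

converted = re.sub(pattern, replacement, confdb_dump, 0, re.DOTALL)
```

Whatever algorithm and level appear in the ConfDB module end up in the converted module, so the pattern no longer needs to assume particular values.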

@missirol
Contributor Author

missirol commented Sep 2, 2022

Unless there are other comments, I will make PRs for 12_{4,5,6}_X with the patch in #39177 (comment) in the next few days.

@Martin-Grunewald
Contributor

Sorry, why would you use process in the case of cffs, where it is fragment (notwithstanding that we currently do not have OMs in cffs)? I.e., keep \g<1>...

@missirol
Contributor Author

missirol commented Sep 2, 2022

I would use %(process), not process.

%(process) translates to what the output configuration needs to have, which is fragment in cffs; \g<1> is what comes from ConfDB, which corresponds to process or nothing.

%(process) is indeed what was used in the replacement of the Run-1 output module:

r'%(process)s.hltOutput\2 = cms.OutputModule( "PoolOutputModule",\n fileName = cms.untracked.string( "output\2.root" ),\n fastCloning = cms.untracked.bool( False ),\n dataset = cms.untracked.PSet(\n filterName = cms.untracked.string( "" ),\n dataTier = cms.untracked.string( "RAW" )\n ),',
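
The difference between the two can be sketched as follows. This is a simplified illustration, not the confdb.py implementation: the input line is hypothetical, and the label dictionaries passed to the % operator are assumptions standing in for whatever mapping the tool applies later.

```python
import re

# Hypothetical ConfDB output line; group 1 is "process." or empty.
source = 'process.hltOutputA = cms.OutputModule( "EvFOutputModule",'
pattern = r'\b(process\.)?hltOutput(\w+) *= *cms\.OutputModule\( *"EvFOutputModule" *,'

# Variant A: echo whatever prefix came from ConfDB.
keep_prefix = re.sub(
    pattern,
    r'\g<1>hltOutput\g<2> = cms.OutputModule( "PoolOutputModule",',
    source,
)

# Variant B: emit a placeholder that is resolved later by %-formatting.
template = re.sub(
    pattern,
    r'%(process)s.hltOutput\g<2> = cms.OutputModule( "PoolOutputModule",',
    source,
)

# Assumed label mappings: "process" for a full configuration,
# "fragment" for a cff fragment.
as_cfg = template % {"process": "process"}
as_cff = template % {"process": "fragment"}
```

For a full configuration the two variants give the same text; only the placeholder variant follows the object name that the output configuration actually uses.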

@Martin-Grunewald
Contributor

Ah ok!
