Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifiers not being set properly in EventConfig for data (GEM RecHits not in latest data) #38622

Merged
merged 1 commit into from
Jul 7, 2022

Conversation

watson-ij
Copy link
Contributor

@watson-ij watson-ij commented Jul 7, 2022

PR description:

We noticed there are no GEM rechits being added to the data output recently (in particular in the latest collision events), and when trying out Configuration/DataProcessing/test/RunPromptReco.py to see what gets added, I traced it back to Configuration/DataProcessing/python/Repack.py loading Configuration.EventContent.EventContent_cff before the era is set (it gets set after Scenario.py is loaded, which is loading Repack.py), so that modifier chains don't seem to be included at all. In particular, the run3_GEM modifier chain in RecoLocalMuon/Configuration/python/RecoLocalMuon_EventContent_cff.py is not set when the EventContent is loaded by the Repack.py file (but would be if the eras are set by the scenario before the EventContent cff is loaded). Could someone clarify if my understanding is correct but this is expected (it appears the change to Repack.py was made 2 months ago?), or if not expected is the fix okay and should I backport this? Otherwise let me know if my understanding is wrong.

@jshlee
@slowmoyang

PR validation:

Checked the output of Configuration/DataProcessing/test/RunPromptReco.py with ppEra_Run3 scenario before and after the change:

before the local muon content looks like:

    'keep *_dt4DSegments_*_*',
    'keep *_dt4DCosmicSegments_*_*',
    'keep *_cscSegments_*_*',
    'keep *_rpcRecHits_*_*'

after:

    'keep *_dt4DSegments_*_*',
    'keep *_dt4DCosmicSegments_*_*',
    'keep *_cscSegments_*_*',
    'keep *_rpcRecHits_*_*',
    'keep *_gemRecHits_*_*',
    'keep *_gemSegments_*_*'

I used the command

python3 ${CMSSW_BASE}/src/Configuration/DataProcessing/test/RunPromptReco.py \
    --scenario=ppEra_Run3 \
    --global-tag=123X_dataRun3_Prompt_v12 \
    --reco --aod \
    --lfn /store/whatever

If this PR is a backport please specify the original PR and why you need to backport that PR. If this PR will be backported please specify to which release cycle the backport is meant for:

Before submitting your pull requests, make sure you followed this checklist:

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 7, 2022

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 7, 2022

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-da417a/26035/summary.html
COMMIT: f2c154c
CMSSW: CMSSW_12_5_X_2022-07-06-2300/el8_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/38622/26035/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-da417a/26035/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-da417a/26035/git-merge-result

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 50
  • DQMHistoTests: Total histograms compared: 3654771
  • DQMHistoTests: Total failures: 14
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3654735
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 49 files compared)
  • Checked 208 log files, 45 edm output root files, 50 DQM output files
  • TriggerResults: no differences found

@seungjin-yang
Copy link
Contributor

@perrotta, @davidlange6, @germanfgv

Configuration/DataProcessing/test/RunPromptReco.py creates an instance of Process class using child classes of Scenario class.

from Configuration.DataProcessing.GetScenario import getScenario
class RunPromptReco:
    def __call__(self):
        ...
        try:
            scenario = getScenario(self.scenario) # NOTE: our self.scenario is "ppEra_Run3"
        ...
        process = scenario.promptReco(self.globalTag, **kwds)
        ...

So, we need to investigate Scenario. In Configuration/DataProcessing/python/Reco.py, which is a child of Scenario, you can see that Scenario module is loaded first, and then Reco class is defined. Then, promptReco method of Reco class creates an instance ofProcess class with ModifierChain with Era. It means when we run RunPromptReco.py, Scenario is loaded and then Process is created.

from Configuration.DataProcessing.Scenario import *
class Reco(Scenario):
    def __init__(self):
        ...
        self.cbSc=self.__class__.__name__
        self.promptModifiers = cms.ModifierChain()
        ...

    def promptReco(self, globalTag, **args):
        # NOTE: args (deliverd by RunPromptReco.py) == {'outputs': [{'dataTier': 'RECO', 'eventContent': 'RECO', 'moduleLabel': 'write_RECO'}]}
        ...
        options = Options()
        options.__dict__.update(defaultOptions.__dict__)
        options.scenario = self.cbSc % NOTE self.cbSc == "pp"
        options.step = 'RAW2DIGI,L1Reco,RECO'+self.recoSeq+eiStep+step+PhysicsSkimStep+miniAODStep+',DQM'+dqmStep+',ENDJOB'
        ... 
        process = cms.Process('RECO', cms.ModifierChain(self.eras, self.promptModifiers) )
        cb = ConfigBuilder(options, process = process, with_output = True)
        cb.prepare()
        ...
        return process

Let's keep track of Scenario module.

Configuration/DataProcessing/python/Scenario.py loaded the Repack module.

from Configuration.DataProcessing.Repack import repackProcess
class Scenario(object):
    ...

Configuration/DataProcessing/python/Repack.py loads EventContent_cff module, which loads a bunch of configuration files containing Modifier.toModify.

...
from Configuration.EventContent.EventContent_cff import RAWEventContent
...
def repackProcess(**args):
    ...

However, Modifier.toModify does nothing when its __chosen attribute is false and __chosen is set to false in __init__ method. Modifier.__chosen is set to true in Process.__init__. Because EventContent_cff module is loaded before Process.__init__ is called, all Modifier.toModify do nothing. (i.e. gemRecHits is not added into EventContent)

class Modifier(object):
    def __init__(self):
        self.__chosen = False
    def toModify(self,obj, func=None,**kw):
        Modifier._toModifyCheck(obj,func,**kw)
        if not self._isChosen():
            return
        Modifier._toModify(obj,func,**kw)
    def _setChosen(self):
        """Should only be called by cms.Process instances"""
        self.__chosen = True
    def _isChosen(self):
        return self.__chosen
class Process(object):
    def __init__(self,name,*Mods):
        ...
        self.__dict__['_Process__modifiers'] = Mods
        ...
        for m in self.__modifiers:
            m._setChosen()

Then, back to Reco.promptReco. promptReco method creates an instance of Process and then assigns a lot of attributes to Process using ConfigBuilder.prepare.

from Configuration.DataProcessing.Scenario import *
class Reco(Scenario):
    def __init__(self):
        ...
        self.cbSc=self.__class__.__name__
        self.promptModifiers = cms.ModifierChain()
        ...

    def promptReco(self, globalTag, **args):
        # NOTE: args (deliverd by RunPromptReco.py) == {'outputs': [{'dataTier': 'RECO', 'eventContent': 'RECO', 'moduleLabel': 'write_RECO'}]}
        ...
        options = Options()
        options.__dict__.update(defaultOptions.__dict__)
        options.scenario = self.cbSc % NOTE self.cbSc == "pp"
        options.step = 'RAW2DIGI,L1Reco,RECO'+self.recoSeq+eiStep+step+PhysicsSkimStep+miniAODStep+',DQM'+dqmStep+',ENDJOB'
        ... 
        process = cms.Process('RECO', cms.ModifierChain(self.eras, self.promptModifiers) )
        cb = ConfigBuilder(options, process = process, with_output = True)
        cb.prepare()
        ...
        return process

In Configuration/Applications/python/ConfigBuilder.py, we meet the event content configuration file. In the pp scenario, the used event content cff file is the same as one loaded in Configuration/DataProcessing/python/Repack.py.
Process loads "Configuration/EventContent/EventContent_cff" with modifiers whose __chosen are true using __import__ BUT python get EventContent_cff module from the global cache without creating new one. So, all Modifer.toModify are not called. So, gemRecHits and gemSegments are still dropped. In other hand, when the scenario is cosmics, new EventContentCosmics_cff and *_Cosmics_cff modules are loaded, Modifer.toModify is called, and gem objects are kept.

Please refer When I edit an imported module and reimport it, the changes don’t show up. Why does this happen? for more details.

class ConfigBuilder(object):
    def __init__(self, options, process = None, with_output = False, with_input = False ):
        ...
        self.EVTCONTDefaultCFF="Configuration/EventContent/EventContent_cff"
        ...
        if self._options.scenario=='cosmics':
            ...
            self.EVTCONTDefaultCFF="Configuration/EventContent/EventContentCosmics_cff"
            ...
        if self._options.scenario=='HeavyIons':
            ...
            self.EVTCONTDefaultCFF="Configuration/EventContent/EventContentHeavyIons_cff"
            ...

    def loadAndRemember(self, includeFile):
        """helper routine to load am memorize imports"""
        # we could make the imports a on-the-fly data method of the process instance itself
        # not sure if the latter is a good idea
        includeFile = includeFile.replace('/','.')
        self.imports.append(includeFile)
        self.process.load(includeFile)
        return sys.modules[includeFile]

    def prepare(self, doChecking = False):
        """ Prepare the configuration string and add missing pieces."""
        self.loadAndRemember(self.EVTCONTDefaultCFF)  #load the event contents regardless
        ...
class Process(object):
    def load(self, moduleName):
        moduleName = moduleName.replace("/",".")
        module = __import__(moduleName)
        self.extend(sys.modules[moduleName])

Let me just recap what's happened.

  1. EventContent_cff is loaded in Repack.py before calling Process.__init__, which set Modifier.__chosen to true. So, all Modifer.toModify do nothing.
  2. Process.__init__ is called.
  3. EventContent_cff is imported again, but python3 get EventContent_cff from the cache without creating a new module. So, Modifier.toModify is not called.

Let me know if anything is unclear.

@slava77
Copy link
Contributor

slava77 commented Jul 7, 2022

Let me just recap what's happened.

1. `EventContent_cff` is loaded in `Repack.py` before calling `Process.__init__`, which set `Modifier.__chosen` to true. So, all `Modifer.toModify` do nothing.

this analysis makes sense; good catch.
Apparently the addition is somewhat recent:
#37791 in 124X and #38017 in 12_3_X ( CMSSW_12_3_3_patch1).
I guess we may be missing more than just GEM rh.

@slava77
Copy link
Contributor

slava77 commented Jul 7, 2022

I guess we may be missing more than just GEM rh.

IIUC, only pps is also going to be lost (more issues would appear in HI processing, but that still has some time to get a fix).

@rappoccio
Copy link
Contributor

urgent

@tvami
Copy link
Contributor

tvami commented Jul 7, 2022

urgent

  • for Sal <3

@cmsbuild cmsbuild added the urgent label Jul 7, 2022
@perrotta
Copy link
Contributor

perrotta commented Jul 7, 2022

+1

  • It fixes Use RAWEventContent parameter set for Repack output module #37791
  • While something more error-prouf should be invented to correctly apply the modifiers in Configuration.EventContent.EventContent_cff (e.g. applying the modifiers in the file itself, and not in the caller configs?), the solution adopted here looks like patching correctly the issue noticed

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 7, 2022

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will be automatically merged.

@seungjin-yang
Copy link
Contributor

seungjin-yang commented Jul 8, 2022

@perrotta, This PR should be backported to 12_3_X and 12_4_x, right? We're ready to backport this PR to 12_3_X and 12_4_X.

seungjin-yang added a commit to slowmoyang/cmssw that referenced this pull request Jul 11, 2022
…ess.__init__`.

Refer to cms-sw#38622 (comment) for more details.
The new line is added to EOF by vim
seungjin-yang added a commit to slowmoyang/cmssw that referenced this pull request Jul 11, 2022
cmsbuild added a commit that referenced this pull request Jul 11, 2022
…rom-CMSSW_12_4_X_2022-07-08-2300

Modifiers not being set properly in EventConfig for data (backport of #38622, 12_4_X)
cmsbuild added a commit that referenced this pull request Jul 12, 2022
…rom-CMSSW_12_3_X_2022-07-10-2300

 Modifiers not being set properly in EventConfig for data (backport of #38622, 12_3_X)
@davidlange6
Copy link
Contributor

davidlange6 commented Oct 11, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants