Possible misconfiguration of TrackingPOGFilters #37738

Open
perrotta opened this issue Apr 29, 2022 · 22 comments
Comments

@perrotta
Contributor

The PR tests of the (now merged) #37134 for wfs 10804.31 and 10805.31 show the following error messages at every event:

%MSG-w NoModule:   ByClusterSummaryMultiplicityPairEventFilter:toomanystripclus53X  28-Apr-2022 20:02:03 CEST Run: 1 Event: 1
No information for requested module 5. Please check in the Provinence Infomation for proper modules.
%MSG
%MSG-w NoModule:  ByClusterSummaryMultiplicityPairEventFilter:manystripclus53X  28-Apr-2022 20:02:03 CEST Run: 1 Event: 1
No information for requested module 5. Please check in the Provinence Infomation for proper modules.
%MSG

I did not verify whether this also happens in other workflows.

As pointed out by @mmusich in #37134 (comment), it looks like there's something wrong with how the "tracking POG" filters in:

manystripclus53X = cms.EDFilter('ByClusterSummaryMultiplicityPairEventFilter',
    multiplicityConfig = cms.PSet(
        firstMultiplicityConfig = cms.PSet(
            clusterSummaryCollection = cms.InputTag("clusterSummaryProducer"),
            subDetEnum = cms.int32(5),
            varEnum = cms.int32(0)
        ),
        secondMultiplicityConfig = cms.PSet(
            clusterSummaryCollection = cms.InputTag("clusterSummaryProducer"),
            subDetEnum = cms.int32(0),
            varEnum = cms.int32(0)
        ),
    ),
    cut = cms.string("( mult2 > 20000+7*mult1)")
)

are configured.
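
For readers unfamiliar with the cut string, here is a minimal Python sketch of the comparison it encodes. It assumes that mult1 is the multiplicity from firstMultiplicityConfig (subDetEnum 5) and mult2 the one from secondMultiplicityConfig (subDetEnum 0); that mapping and the function name cut_is_satisfied are illustrative assumptions, not the CMSSW implementation.

```python
def cut_is_satisfied(mult1: int, mult2: int) -> bool:
    """Evaluate the expression "( mult2 > 20000+7*mult1)" for two multiplicities.

    mult1: assumed to come from firstMultiplicityConfig (subDetEnum 5)
    mult2: assumed to come from secondMultiplicityConfig (subDetEnum 0)
    """
    return mult2 > 20000 + 7 * mult1


# With mult1 == 0 the outcome depends on mult2 alone.
print(cut_is_satisfied(mult1=0, mult2=500))
print(cut_is_satisfied(mult1=0, mult2=25000))
# With mult1 == 1000 the threshold becomes 20000 + 7000 = 27000.
print(cut_is_satisfied(mult1=1000, mult2=25000))
```

Whether a true result keeps or rejects the event depends on how the filter is used in the path, which this sketch does not model.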

@cmsbuild
Contributor

A new Issue was created by @perrotta Andrea Perrotta.

@Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@perrotta
Contributor Author

assign reconstruction

FYI @cms-sw/tracking-pog-l2

@cmsbuild
Contributor

New categories assigned: reconstruction

@jpata,@slava77,@clacaputo you have been requested to review this Pull request/Issue and eventually sign? Thanks

@jpata
Contributor

jpata commented Apr 29, 2022

type trk

@slava77
Contributor

slava77 commented Apr 29, 2022

This is probably a tracker issue (clusterSummaryProducer is under the tracker).

There was a PR that drops empty DetSets in the cluster collections, #37035 (@ferencek). I guess this then leads to the ClusterSummary not seeing BPIX or other dets in the photon gun events, which in turn leads to the warning.

@cms-sw/trk-dpg-l2

@slava77
Contributor

slava77 commented Apr 29, 2022

a fix could be to still fill a summary for empty dets, but perhaps a different functionality will need to be added to check empties

@cmsbuild cmsbuild added trk and removed tracking labels Apr 29, 2022
@ferencek
Contributor

this is probably a tracker issue (clusterSummaryProducer is under the tracker )

there was a PR that drops empty detsets in the cluster collections #37035 @ferencek I guess this then leads to the clustersummary not seeing BPIX or other dets in the photon gun events, which then in turn leads to the warning.

@cms-sw/trk-dpg-l2

Hi @slava77, #37035 drops empty DetSets from the pixel digi collection so the multiplicity of pixel clusters remains unchanged (after all, there is nothing to cluster from empty DetSets, more details in #37035 (comment)). Hence, I don't see how the issue reported here could be related to #37035.

@slava77
Contributor

slava77 commented Apr 29, 2022

Hi @slava77, #37035 drops empty DetSets from the pixel digi collection so the multiplicity of pixel clusters remains unchanged (after all, there is nothing to cluster from empty DetSets, more details in #37035 (comment)). Hence, I don't see how the issue reported here could be related to #37035.

But doesn't the clusterizer preserve the DetSets dimension? If so, for an empty BPIX before #37035 the clusterSummaryProducer would see BPIX DetSets and would collect some (zero) info for them, while after that PR it will not see them.

@ferencek
Contributor

Hi @slava77, #37035 drops empty DetSets from the pixel digi collection so the multiplicity of pixel clusters remains unchanged (after all, there is nothing to cluster from empty DetSets, more details in #37035 (comment)). Hence, I don't see how the issue reported here could be related to #37035.

but doesn't the clusterizer preserve the DetSets dimension? If so, for an empty BPIX before #37035 the clusterSummaryProducer would see BPIX detSets and would collect some (zeroes ) info for them, while after that PR it will not see them.

Based on my observations described in #37035 (comment), the clusterizer does not preserve the DetSetVector size. When it encounters an empty digi DetSet, it does not produce an empty cluster DetSet. This also seems to be confirmed by https://github.com/cms-sw/cmssw/blob/CMSSW_12_2_0/RecoLocalTracker/SiPixelClusterizer/plugins/SiPixelClusterProducer.cc#L191--L197.
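
The argument can be illustrated with a toy Python model. This is only a sketch under stated assumptions: toy_clusterize and drop_empty are hypothetical helpers, the "one cluster per non-empty module" rule stands in for the real clustering, and the only behaviour taken from the discussion is that an empty digi DetSet yields no cluster DetSet.

```python
def drop_empty(digi_detsets):
    """Remove empty DetSets from a {det_id: [digis]} mapping (the effect attributed to #37035)."""
    return {det_id: digis for det_id, digis in digi_detsets.items() if digis}


def toy_clusterize(digi_detsets):
    """Toy clusterizer: emits no output DetSet for an empty input DetSet."""
    clusters = {}
    for det_id, digis in digi_detsets.items():
        if not digis:
            continue  # nothing to cluster, so no (empty) cluster DetSet is produced
        clusters[det_id] = [tuple(digis)]  # toy rule: all digis of a module form one cluster
    return clusters


# Hypothetical digi content: module 102 has no digis.
digis = {101: [(3, 4), (3, 5)], 102: [], 103: [(10, 10)]}

before = toy_clusterize(digis)             # empty DetSets still present in the input
after = toy_clusterize(drop_empty(digis))  # empty DetSets dropped upstream
print(before == after)
```

In this model the cluster collection is the same with or without the empty digi DetSets, which is the reason given above for #37035 not changing the pixel cluster multiplicity.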

@qliphy
Contributor

qliphy commented May 2, 2022

Also to add here: since CMSSW_12_4 2022-04-29-2300, there is an IB issue with 10805.31.
@ssrothman, please have a look.

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc10/CMSSW_12_4_X_2022-05-01-2300/pyRelValMatrixLogs/run/10804.31_SingleGammaPt10+2018_photonDRN+SingleGammaPt10_pythia8_GenSimINPUT+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_SingleGammaPt10+2018_photonDRN+SingleGammaPt10_pythia8_GenSimINPUT+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log#/

Starting python2 /data/cmsbld/jenkins/workspace/ib-run-relvals/cms-bot/monitor_workflow.py timeout --signal SIGTERM 9000 cmsRun -j JobReport3.xml step3_RAW2DIGI_L1Reco_RECO_RECOSIM_PAT_VALIDATION_DQM.py
sh: nvidia-smi: command not found
%MSG-i ThreadStreamSetup: (NoModuleName) 02-May-2022 01:20:31 CEST pre-events
setting # threads 4
setting # streams 4
%MSG
02-May-2022 01:20:48 CEST Initiating request to open file file:step2.root
02-May-2022 01:20:50 CEST Successfully opened file file:step2.root
2022-05-02 01:21:27.078436: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
----- Begin Fatal Exception 02-May-2022 01:25:37 CEST-----------------------
An exception of category 'FallbackFailed' occurred while
[0] Calling beginJob
Exception Message:
Starting the fallback server failed with exit code 256

@makortel
Contributor

makortel commented May 2, 2022

Also to add here: since CMSSW_12_4 2022-04-29-2300 , there is an IB issue with 10805.31
@ssrothman Please have a check.

Let's move this to a separate issue as the Triton issue has nothing to do with TrackingPOGFilters. I opened #37767.

@mmusich
Contributor

mmusich commented May 6, 2022

So, I've started to have a look, and here is some other evidence I gathered:

enum CMSTracker {
  STRIP = 0,
  TIB = 1,
  TOB = 2,
  TID = 3,
  TEC = 4,
  PIXEL = 5,
  FPIX = 6,
  BPIX = 7,
  NVALIDENUMS = 8,
  NTRACKERENUMS = 100
};

  • more specifically it gets triggered here when checking the cluster count:

int getNClus(const CMSTracker mod) const {
  int pos = getModuleLocation(mod);
  return pos < 0 ? 0. : nClus[pos];
}

@mmusich
Contributor

mmusich commented May 10, 2022

On closer inspection, as far as I can tell, the printout works as intended.

  • in this particular workflow (10804.31) for some events (but not for every event) there aren't any pixel clusters generated.
  • ClusterSummaryProducer copies a trimmed version of the modules array (only for the subdetectors with non-zero entries)

for (unsigned int iM = 0; iM < src.nClus.size(); ++iM) {
  if (src.nClus[iM] != 0) {
    modules.push_back(src.modules[iM]);
    nClus.push_back(src.nClus[iM]);
    clusSize.push_back(src.clusSize[iM]);
    clusCharge.push_back(src.clusCharge[iM]);
  }
}

  • the filters manystripclus53X and toomanystripclus53X require explicitly the presence of pixel clusters in order to come to a decision:

hence the warning gets triggered here:

if (warn)
  edm::LogWarning("NoModule") << "No information for requested module " << mod
                              << ". Please check in the Provinence Infomation for proper modules.";
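
The chain described above (trimmed copy, failed lookup, warning, zero returned) can be reproduced with a small self-contained Python model. It is a sketch only: the names mirror the C++ snippets quoted in this thread, but ClusterSummaryModel is hypothetical, the linear search inside get_module_location is an assumption about how the lookup works, and the cluster counts are made-up numbers.

```python
import logging
from enum import IntEnum

log = logging.getLogger("NoModule")


class CMSTracker(IntEnum):
    STRIP = 0
    TIB = 1
    TOB = 2
    TID = 3
    TEC = 4
    PIXEL = 5
    FPIX = 6
    BPIX = 7


class ClusterSummaryModel:
    """Toy stand-in for the summary object: parallel lists of modules and cluster counts."""

    def __init__(self, modules, n_clus):
        self.modules = list(modules)
        self.n_clus = list(n_clus)

    def copy_non_empty(self):
        # Keep only the subdetectors with a non-zero cluster count.
        kept = [(m, n) for m, n in zip(self.modules, self.n_clus) if n != 0]
        return ClusterSummaryModel([m for m, _ in kept], [n for _, n in kept])

    def get_module_location(self, mod, warn=True):
        # Assumed behaviour: search the (possibly trimmed) list, return -1 if absent.
        if mod in self.modules:
            return self.modules.index(mod)
        if warn:
            log.warning("No information for requested module %d", int(mod))
        return -1

    def get_n_clus(self, mod):
        pos = self.get_module_location(mod)
        return 0 if pos < 0 else self.n_clus[pos]


# Hypothetical event with strip clusters but no pixel clusters at all.
full = ClusterSummaryModel(list(CMSTracker), [1200, 300, 400, 200, 300, 0, 0, 0])
trimmed = full.copy_non_empty()

print(trimmed.get_n_clus(CMSTracker.STRIP))  # found in the trimmed list
print(trimmed.get_n_clus(CMSTracker.PIXEL))  # not found: logs a warning, returns 0
```

In this model the returned multiplicity for PIXEL is still 0, so the lookup result is usable; only the warning is emitted because the entry was trimmed away.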

@mmusich
Contributor

mmusich commented May 16, 2022

@cms-sw/reconstruction-l2 given the analysis above in #37738 (comment), do you think this issue is still relevant?

@jpata
Contributor

jpata commented May 16, 2022

+reconstruction

  • it looks like the printout warnings are as intended

@cmsbuild
Contributor

This issue is fully signed and ready to be closed.

@perrotta
Contributor Author

@mmusich @jpata: ok, we could even close it.
But it still looks to me that those messages could be demoted, if they are really to be considered "as intended" for the normal workflows. If it is intended, why issue a warning?
Alternatively, some limit on the number of such messages that can be issued during a run could also be useful, to avoid polluting the outputs and keep them clean for other, possibly more relevant, warnings.

@mmusich
Contributor

mmusich commented May 16, 2022

@perrotta the problem is about the sanity of running the MET filter in a single-photon, no-PU wf. In normal collision workflows it makes sense to check, because there are always pixel clusters.
As for how the wf is set up, I think it's not the trk responsibility (another issue could be opened).

@jpata
Contributor

jpata commented May 16, 2022

I agree with Marco above. Do we have ways to ignore warnings in some wfs (per-wf logging level)? Or maybe extending the warning message "this should not happen in normal collision events" or such?

@makortel
Contributor

It should be straightforward to suppress messages from a particular category (NoModule in this case) already with something along the lines of

cmsDriver.py ... --customise_commands="process.MessageLogger.NoModule = dict(limit=0)"

(see https://github.com/cms-sw/cmssw/blob/master/FWCore/MessageService/Readme.md#suppressing-a-particular-category), but if the demand for such suppression starts to spread, perhaps some more specific structure in runTheMatrix.py infrastructure could be useful (@cms-sw/pdmv-l2).

@tvami
Contributor

tvami commented Dec 20, 2022

@cmsbuild
Contributor

New categories assigned: pdmv

@bbilin,@sunilUIET,@kskovpen you have been requested to review this Pull request/Issue and eventually sign? Thanks
