Updated root to tip of branch v6-22-00-patches #6314
The tests are being triggered in jenkins.
|
A new Pull Request was created by @mrodozov (Mircho Rodozov) for branch IB/CMSSW_11_2_X/rootnext. @cmsbuild, @smuzaffar, @mrodozov can you please review it and eventually sign? Thanks. |
-1 Tested at: 540e8a6 CMSSW: CMSSW_11_2_ROOT622_X_2020-10-13-2300 I found the following errors while testing this PR. Failed tests: UnitTests RelVals AddOn
I found errors in the following unit tests: ---> test testAlignmentOfflineValidation had ERRORS
When I ran the RelVals I found an error in the following workflows:
runTheMatrix-results/136.88811_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL/step2_RunJetHT2018D_reminiaodUL+RunJetHT2018D_reminiaodUL+REMINIAOD_data2018UL+HARVEST2018_REMINIAOD_data2018UL.log
4.22 step2 runTheMatrix-results/4.22_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC/step2_RunCosmics2011A+RunCosmics2011A+RECOCOSD+ALCACOSD+SKIMCOSD+HARVESTDC.log
136.7611 step2 runTheMatrix-results/136.7611_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM/step2_RunJetHT2016E_reminiaod+RunJetHT2016E_reminiaod+REMINIAOD_data2016_HIPM+HARVESTDR2_REMINIAOD_data2016_HIPM.log
136.8311 step2 runTheMatrix-results/136.8311_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017/step2_RunJetHT2017F_reminiaod+RunJetHT2017F_reminiaod+REMINIAOD_data2017+HARVEST2017_REMINIAOD_data2017.log
8.0 step3 runTheMatrix-results/8.0_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS/step3_BeamHalo+BeamHalo+DIGICOS+RECOCOS+ALCABH+HARVESTCOS.log
140.53 step2 runTheMatrix-results/140.53_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI/step2_RunHI2011+RunHI2011+RECOHID11+HARVESTDHI.log
158.01 step2 runTheMatrix-results/158.01_HydjetQ_reminiaodPbPb2018_INPUT+HydjetQ_reminiaodPbPb2018_INPUT+REMINIAODHI2018PPRECO+HARVESTHI2018PPRECOMINIAOD/step2_HydjetQ_reminiaodPbPb2018_INPUT+HydjetQ_reminiaodPbPb2018_INPUT+REMINIAODHI2018PPRECO+HARVESTHI2018PPRECOMINIAOD.log
136.731 step3 runTheMatrix-results/136.731_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2/step3_RunSinglePh2016B+RunSinglePh2016B+HLTDR2_2016+RECODR2_2016reHLT_skimSinglePh_HIPM+HARVESTDR2.log
136.793 step3 runTheMatrix-results/136.793_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017/step3_RunDoubleEG2017C+RunDoubleEG2017C+HLTDR2_2017+RECODR2_2017reHLT_skimDoubleEG_Prompt+HARVEST2017.log
140.56 step2 runTheMatrix-results/140.56_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18/step2_RunHI2018+RunHI2018+RECOHID18+HARVESTDHI18.log
136.874 step3 runTheMatrix-results/136.874_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM/step3_RunEGamma2018C+RunEGamma2018C+HLTDR2_2018+RECODR2_2018reHLT_skimEGamma_Offline_L1TEgDQM+HARVEST2018_L1TEgDQM.log
135.4 step3 runTheMatrix-results/135.4_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS/step3_ZEE_13+ZEEFS_13+HARVESTUP15FS+MINIAODMCUP15FS.log
7.3 step3 runTheMatrix-results/7.3_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18/step3_CosmicsSPLoose_UP18+CosmicsSPLoose_UP18+DIGICOS_UP18+RECOCOS_UP18+ALCACOS_UP18+HARVESTCOS_UP18.log
158.0 step2 runTheMatrix-results/158.0_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO/step2_HydjetQ_B12_5020GeV_2018_ppReco+HydjetQ_B12_5020GeV_2018_ppReco+DIGIHI2018PPRECO+RECOHI2018PPRECO+ALCARECOHI2018PPRECO+HARVESTHI2018PPRECO.log
1001.0 step3 runTheMatrix-results/1001.0_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5/step3_RunMinBias2011A+RunMinBias2011A+TIER0EXP+ALCAEXP+ALCAHARVDSIPIXELCALRUN1+ALCAHARVD1+ALCAHARVD2+ALCAHARVD3+ALCAHARVD4+ALCAHARVD5.log
1000.0 step3 runTheMatrix-results/1000.0_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT/step3_RunMinBias2011A+RunMinBias2011A+TIER0+SKIMD+HARVESTDfst2+ALCASPLIT.log
10042.0 step3 runTheMatrix-results/10042.0_ZMM_13+2017+ZMM_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_ZMM_13+2017+ZMM_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log
25.0 step5 runTheMatrix-results/25.0_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT/step5_TTbar+TTbar+DIGI+RECOAlCaCalo+HARVEST+ALCATT.log
11634.0 step2 runTheMatrix-results/11634.0_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step2_TTbar_14TeV+2021+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log
10024.0 step3 runTheMatrix-results/10024.0_TTbar_13+2017+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_TTbar_13+2017+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log
12434.0 step2 runTheMatrix-results/12434.0_TTbar_14TeV+2023+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA/step2_TTbar_14TeV+2023+TTbar_14TeV_TuneCP5_GenSim+Digi+Reco+HARVEST+ALCA.log
10824.0 step3 runTheMatrix-results/10824.0_TTbar_13+2018+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano/step3_TTbar_13+2018+TTbar_13TeV_TuneCUETP8M1_GenSim+Digi+RecoFakeHLT+HARVESTFakeHLT+ALCA+Nano.log
10224.0 step3 runTheMatrix-results/10224.0_TTbar_13+2017PU+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoFakeHLTPU+HARVESTFakeHLTPU+Nano/step3_TTbar_13+2017PU+TTbar_13TeV_TuneCUETP8M1_GenSim+DigiPU+RecoFakeHLTPU+HARVESTFakeHLTPU+Nano.log
250202.181 step4 runTheMatrix-results/250202.181_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25/step4_TTbar_13UP18+TTbar_13UP18+PREMIXUP18_PU25+DIGIPRMXLOCALUP18_PU25+RECOPRMXUP18_PU25+HARVESTUP18_PU25.log
I found errors in the following addon tests: cmsRun /data/cmsbld/jenkins/workspace/ib-run-pr-tests/CMSSW_11_2_ROOT622_X_2020-10-13-2300/src/PhysicsTools/PatAlgos/test/IntegrationTest_cfg.py : FAILED - time: date Wed Oct 14 15:22:26 2020-date Wed Oct 14 15:22:11 2020 s - exit: 22016 |
Comparison not run due to runTheMatrix errors (RelVals and Igprof tests were also skipped) |
@makortel, we are trying to update ROOT here and noticed that there are many failures like [a]. Any idea what could cause such an issue? [a]
|
I have no idea. @Dr15Jones, @pcanal would you have any suggestions? |
FYI @pcanal |
by the way, we have seen the same errors while testing root master branch. |
the changes we are testing are root-project/root@d6156de...e4cd9d3 |
The code in question is a private base class which has all of its data members set as https://github.com/cms-sw/cmssw/blob/master/DataFormats/Common/interface/DetSetVectorNew.h#L89 |
There was a change in the way we iterate through the data members, so it looks like we overlooked something. @smuzaffar can you tell me how to reproduce this with a debug version of ROOT? Thanks. |
@pcanal I believe all ROOT specific builds are done with debug. Try CMSSW_11_2_ROOT6_X_2020-10-12-2300. |
@Dr15Jones, @pcanal these changes are not in IBs. One needs to build it locally. @mrodozov can you please build (on cmsdevXX) this PR + build ROOT in debug mode and provide instructions to @pcanal so that he can test it? |
yes of course. |
This CMSSW release is set up this way, with ROOT 6.22 built with -DCMAKE_BUILD_TYPE=Debug |
I was able to log in and set up the build area. Could you give an example of a command line leading to the error message? Thanks. |
E.g.
but should not require grid certificate to run. |
I must have missed an important setup step:
What should I do to fix that? |
I found and resolved the problem (mostly). See root-project/root#6728. This patch is sufficient to solve the problem if all enums that are stored as the key of an associative container are using the default size. If some are using the non-default size, we also need the (upcoming) fix for root-project/root#6725. |
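To illustrate the distinction drawn above between enums of default and non-default size, here is a minimal C++ sketch. It is not ROOT or CMSSW code: the enum names and the helper `has_default_enum_size` are hypothetical, and it assumes that "default size" means the size of `int`, which is the default underlying type of a scoped enum. It also shows that the element type of an associative container is a `std::pair` whose first member is the key, which is why enum keys end up inside pairs.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <type_traits>
#include <utility>

// Illustrative enums (hypothetical names, not taken from CMSSW).
enum class DefaultSized { kA, kB };              // underlying type defaults to int
enum class ByteSized : std::uint8_t { kA, kB };  // explicitly non-default size

// Hypothetical helper: true if E has the same size as int.
// Assumption: "default size" in the comment above means sizeof(int).
template <class E>
constexpr bool has_default_enum_size() {
  return sizeof(E) == sizeof(int);
}

// A scoped enum without an explicit underlying type uses int.
static_assert(std::is_same<std::underlying_type<DefaultSized>::type, int>::value,
              "default underlying type of a scoped enum is int");
static_assert(sizeof(ByteSized) == 1, "uint8_t-based enum occupies one byte");

// The elements of a std::map are pairs whose first member is the (const) key.
static_assert(std::is_same<std::map<ByteSized, double>::value_type,
                           std::pair<const ByteSized, double>>::value,
              "map elements are std::pair<const Key, T>");
```

In this sketch, `has_default_enum_size<DefaultSized>()` is true and `has_default_enum_size<ByteSized>()` is false, so only the second kind of key would fall into the non-default case mentioned above.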
This fixes root-project#6726. As reported by CMSSW tests (for example: cms-sw/cmsdist#6314 (comment)), where the data appear odd/corrupted, there is an issue in TStreamerInfo::GenerateInfoForPair (which is almost always used for std::pair in the tip of v6.22 and master). The problem is that, when calculating the offset of the second data member, TStreamerInfo::GenerateInfoForPair uses (unwittingly, of course :) ) the value zero for the size of the enums.
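The effect of a zero size on the offset calculation can be sketched in plain C++. This is not the code of TStreamerInfo::GenerateInfoForPair: the helper `second_member_offset` and the types `Key` and `KeyValue` are hypothetical, and the layout check assumes a common ABI (for example x86-64) where a member is placed at the first suitably aligned offset after the previous one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not ROOT code): offset of the second member of a
// two-member struct, i.e. the size of the first member rounded up to the
// alignment of the second member.
constexpr std::size_t second_member_offset(std::size_t first_size,
                                           std::size_t second_align) {
  return (first_size + second_align - 1) / second_align * second_align;
}

enum class Key { kA, kB };  // illustrative enum, default underlying type int

// Plain stand-in for std::pair<Key, double>, so that offsetof is well defined.
struct KeyValue {
  Key first;
  double second;
};

// With the real enum size, the computed offset matches the compiler's layout
// (assumption: a typical ABI such as x86-64).
static_assert(second_member_offset(sizeof(Key), alignof(double)) ==
                  offsetof(KeyValue, second),
              "computed offset should match the actual layout");

// With a size of zero, the second member is computed to sit at offset 0,
// that is, on top of the first member.
static_assert(second_member_offset(0, alignof(double)) == 0,
              "a zero-sized first member collapses the offset to 0");
```

Reading or writing `second` at offset 0 instead of its real offset would produce exactly the kind of odd-looking data described above, under the stated assumptions.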
Related PR merged into main branch and v6.22 patch branch. |
@pcanal, it looks like the latest ROOT 6.22 updates broke our tests again [a] (e.g. see https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-eac189/10459/runTheMatrix-results/23234.0_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal/step3_TTbar_14TeV+2026D49+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal.log ). Things were in good shape with ROOT v622 commit root-project/root@db0b0f7 + your change for non-zero size enum. But with the latest root v6.22 ( https://github.com/root-project/root/commits/v6-22-00-patches ) we have a few crashes. I have integrated #6358 for our ROOT622 IBs and in a few hours we will have full test results. [a]
|
@smuzaffar let me know when the build is ready to debug. (and please remind me of one of the non-certificate reproducers :) ) |
@pcanal For the workflow @smuzaffar pointed to, |
@pcanal , new IB CMSSW_11_2_ROOT622_X_2020-11-03-1100 is available but only one workflow |
The issue looks like a memory overwrite sort of problem to me. Such problems can 'move' around when using multiple threads since the order of new/delete is not consistent process to process. The problem seems to be a corrupted virtual table called when a data product read from file is being deleted. |
@smuzaffar root-project/root#6768 solves the problems and has been merged into the master and v6.22 branches. |
Thanks @pcanal, I have already tested and integrated it for today's 11h00 ROOT622 IB. [a] workflow 1002.0 step3
|
@mrodozov which version did you import? There was a fatal bug solved today (In the function TStreamerInfo::SetClass:
is the correct code. |
I took your PR https://github.com/cms-sw/root/pull/146/commits on top of root's v6.22 branch. Looks like your PR was updated after I merged. |
I can't verify 1002.0 due to a DAS error. I would need the input file and step3 script on cmsdev20. |
@pcanal, let me re-build the IB using the latest root v6.22 branch and then we will see how it goes. |
@pcanal, wf 1002.0 step2 input and config files are available under /afs/cern.ch/user/m/muzaffar/public/root622/1002.0 As I wrote, this IB includes https://github.com/cms-sw/root/pull/146/commits which might not have all of your changes, so feel free to test the above; otherwise, wait till tomorrow, when we will have a new IB based on the latest root v6.22 |
For the record, it was reported elsewhere that these problems are solved but there are other issues. (cms-sw/cmssw#30359 (comment)) |
please test