2018 replay for Pixel LA PCL test #4635

Closed
wants to merge 11 commits into from


tvami
Contributor

@tvami tvami commented Nov 18, 2021

Replay Request

Requestor

AlCaDB

Describe the configuration

  • Release: CMSSW_12_1_0 then CMSSW_12_1_1 then CMSSW_12_2_0_patch1 then CMSSW_12_2_1 then CMSSW_12_3_0_pre6 then CMSSW_12_2_1_patch2
  • Run: 319077,324841
  • GTs:
    • expressGlobalTag: 121X_dataRun3_Express_TIER0_REPLAY_Run2_v1 then 122X_dataRun3_Express_TIER0_REPLAY_Run2_v1 then 123X_dataRun3_Express_TIER0_REPLAY_Run2_v1
    • promptrecoGlobalTag: 121X_dataRun3_Prompt_TIER0_REPLAY_Run2_v1 then 122X_dataRun3_Prompt_TIER0_REPLAY_Run2_v1 then 123X_dataRun3_Prompt_TIER0_REPLAY_Run2_v2
    • alcap0GlobalTag: 121X_dataRun3_Prompt_TIER0_REPLAY_Run2_v1 then 122X_dataRun3_Prompt_TIER0_REPLAY_Run2_v1 then 123X_dataRun3_Prompt_TIER0_REPLAY_Run2_v2
  • Additional changes: Adding a new SiPixel LA PCL producer
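The releases and Global Tags listed above are paired by release cycle: the GT prefix (121X, 122X, 123X) follows the major and minor numbers of the CMSSW release. A minimal sketch of that naming pattern in plain Python; `gt_prefix` is a hypothetical helper written for illustration and is not part of the Tier0 or CMSSW code:

```python
import re


def gt_prefix(release: str) -> str:
    """Return the Global Tag prefix (e.g. '121X') for a CMSSW release name.

    Hypothetical helper: it only encodes the naming pattern visible
    in this replay request.
    """
    match = re.match(r"CMSSW_(\d+)_(\d+)_", release)
    if match is None:
        raise ValueError(f"unrecognised release name: {release}")
    major, minor = match.groups()
    return f"{major}{minor}X"


# Releases tried in this PR, in the order given in the description.
releases = [
    "CMSSW_12_1_0",
    "CMSSW_12_1_1",
    "CMSSW_12_2_0_patch1",
    "CMSSW_12_2_1",
    "CMSSW_12_3_0_pre6",
    "CMSSW_12_2_1_patch2",
]
prefixes = [gt_prefix(r) for r in releases]
```

With this mapping, each release in the list lines up with one of the three REPLAY GT families named in the request.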

The second replay is an Express-only replay.

Purpose of the test

Checking a newly introduced Pixel PCL to measure the Lorentz Angle
Choice of runs: 319077 is a short run, used to prove that the PCL won't produce a sensible payload, while 324841 is a very long run that we expect to work nicely.

GT changes with respect to the queue:
https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/121X_dataRun3_Express_TIER0_REPLAY_Run2_v1/121X_dataRun3_Express_Queue

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/121X_dataRun3_Prompt_TIER0_REPLAY_Run2_v1/121X_dataRun3_Prompt_Queue

  • Differ in the DropBox Metadata file that now contains the SiPixel LA PCL (and also the PPS, but that's not relevant here)
  • The tags that we split for Run-2 and Run-3

The 12_2_X GTs contain the same DropBox metadata file as the 12_1_X GTs

The 12_3_X GTs
https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/123X_dataRun3_Express_TIER0_REPLAY_Run2_v1/123X_dataRun3_Express_v4

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/123X_dataRun3_Prompt_TIER0_REPLAY_Run2_v2/123X_dataRun3_Prompt_v5

T0 Operations HyperNews thread

https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/2319.html

https://cms-talk.web.cern.ch/t/2018-replay-for-sipixel-la-pcl-test/8065

@tvami tvami changed the title from "Config of a 2018 replay for Pixel LA PCL test" to "2018 replay for Pixel LA PCL test" Nov 18, 2021
@tvami
Contributor Author

tvami commented Nov 18, 2021

cc @OzAmram @mmusich

@tvami tvami force-pushed the TestPCL_forSiPixelLA branch 4 times, most recently from 51c88f0 to 56c88b5 Compare November 18, 2021 21:06
@tvami
Contributor Author

tvami commented Nov 18, 2021

run replay please

@tvami
Contributor Author

tvami commented Nov 18, 2021

Edit: I've removed run 319077 because the streamer file doesn't seem to be on DISK; @jhonatanamado said he'll add it manually later.

@tvami
Contributor Author

tvami commented Nov 18, 2021

run replay please

@cmsdmwmbot

There are 17 repack workflows.
There are 5 express workflows.
There are 314 filesets not closed.
There are 3 paused jobs in the replay.

@cmsdmwmbot

There are 17 repack workflows.
There are 5 express workflows.
There are 833 filesets not closed.
There are 6978 paused jobs in the replay.
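
The bot's status reports share a fixed sentence pattern, so successive reports can be turned into numbers and compared. A minimal sketch, assuming the wording stays exactly as shown in this thread; `parse_status` is a hypothetical helper and not part of the Tier0 monitoring:

```python
import re

# Matches lines such as "There are 833 filesets not closed." and
# "There are 6978 paused jobs in the replay."
STATUS_LINE = re.compile(r"There are (\d+) (.+?)(?: in the replay)?\.$")


def parse_status(report: str) -> dict:
    """Map each counted item in a status report to its integer count."""
    counts = {}
    for line in report.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


report = """There are 17 repack workflows.
There are 5 express workflows.
There are 833 filesets not closed.
There are 6978 paused jobs in the replay.
"""
status = parse_status(report)
```

Comparing the `paused jobs` entry between two reports makes a jump like the one from 3 to 6978 easy to flag automatically.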

@germanfgv
Contributor

run replay please

@cmsdmwmbot

Replay testing PR '2018 replay for Pixel LA PCL test'
An automatic replay has been requested by germanfgv.
Here is a brief description of the replay.
Github PR : #4635
PR author : tvami
Requestor : None
Injected runs : 324841
CMSSW release : CMSSW_12_1_0
Tier0 release : 3.0.1
ppScenario : ppEra_Run2_2018
Tier0 Config : https://cmst0.web.cern.ch/CMST0/tier0/offline_config/ReplayOfflineConfiguration_047.php
Contatiner ID : 1
Jenkins Build : https://cmssdt.cern.ch/dmwm-jenkins/job/DMWM-T0-PR-test-job/362/

Replay Request

Requestor

AlCaDB

Describe the configuration
  • Release: CMSSW_12_1_0
  • Run: 319077,324841
  • GTs:
    • expressGlobalTag: 121X_dataRun3_Express_Candidate_2021_11_18_19_28_59
    • promptrecoGlobalTag: 121X_dataRun3_Prompt_Candidate_2021_11_18_19_44_08
    • alcap0GlobalTag: 121X_dataRun3_Prompt_Candidate_2021_11_18_19_44_08
  • Additional changes: Adding a new SiPixel LA PCL producer

Purpose of the test

Checking a newly introduced Pixel PCL to measure the Lorentz Angle
Choice of runs 319077 --> short run, to prove that the PCL wont produce a sensible payload, while 324841 is a very long run that we expect to work nicely

GT changes wrt to the queue
https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/121X_dataRun3_Express_Candidate_2021_11_18_19_28_59/121X_dataRun3_Express_Queue

https://cms-conddb.cern.ch/cmsDbBrowser/diff/Prod/gts/121X_dataRun3_Prompt_Candidate_2021_11_18_19_44_08/121X_dataRun3_Prompt_Queue

Differ in the DropBox Metadata file that now contains the SiPixel LA PCL (and also the PPS, but that's not relevant here)

T0 Operations HyperNews thread

https://hypernews.cern.ch/HyperNews/CMS/get/tier0-Ops/2319.html

Jira Issue : https://its.cern.ch/jira/browse/CMSTZDEV-700

@germanfgv
Contributor

@gkfthddk There seems to be a problem with the recent changes to the monitoring. Can you please take a look?

@cmsdmwmbot

There are 17 repack workflows.
There are 5 express workflows.
There are 926 filesets not closed.
There are 2 paused jobs in the replay.

@germanfgv
Contributor

We have found an error affecting ZeroBias PromptReco jobs.

2021-11-19 08:14:12,149:CRITICAL:CMSSW:Error message: An exception of category 'NoRecord' occurred while
   [0] Processing  Event run: 324841 lumi: 977 event: 1802500492 stream: 5
   [1] Running path 'dqmoffline_13_step'
   [2] Prefetching for module SMPDQM/'SMPDQM'
   [3] Prefetching for module MuonProducer/'muons'
   [4] Prefetching for module MuonIdProducer/'muons1stStep'
   [5] Prefetching for module EcalRecHitProducer/'ecalRecHit@cpu'
   [6] Calling method for EventSetup module EcalLaserCorrectionService/''
   [7] While getting dependent Record from Record EcalLaserDbRecord
Exception Message:
No "EcalLaserAPDPNRatiosRcd" record found in the EventSetup.
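
The last line of the message names the EventSetup record that could not be found, and line [0] gives the run being processed. A minimal sketch for pulling both out of such a log, assuming the message format shown above; `missing_record` is a hypothetical helper, not a CMSSW or WMCore function:

```python
import re


def missing_record(log_text: str):
    """Return (record_name, run_number) from a 'NoRecord' message.

    Returns None if no missing-record line is present; run_number is
    None if the run cannot be found in the text.
    """
    record = re.search(r'No "(\w+)" record found in the EventSetup', log_text)
    if record is None:
        return None
    run = re.search(r"Processing\s+Event run: (\d+)", log_text)
    run_number = int(run.group(1)) if run else None
    return (record.group(1), run_number)


log_excerpt = """
   [0] Processing  Event run: 324841 lumi: 977 event: 1802500492 stream: 5
   [7] While getting dependent Record from Record EcalLaserDbRecord
Exception Message:
No "EcalLaserAPDPNRatiosRcd" record found in the EventSetup.
"""
result = missing_record(log_excerpt)
```

The record name and run number are the two pieces of information needed to check whether the Global Tag used for PromptReco provides that record for the run.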

@germanfgv
Contributor

We also have issues creating Express jobs with configBuilder:

Failed to load process from Scenario ppEra_Run2_2018 (<Configuration.DataProcessing.Impl.ppEra_Run2_2018.ppEra_Run2_2018 object at 0x2b034bdefa60>).
Traceback (most recent call last):
  File "/cvmfs/cms.cern.ch/share/overrides/bin/cmssw_wm_create_process.py", line 144, in <module>
    main()
  File "/cvmfs/cms.cern.ch/share/overrides/bin/cmssw_wm_create_process.py", line 135, in main
    process=create_process(args, func_args)
  File "/cvmfs/cms.cern.ch/share/overrides/bin/cmssw_wm_create_process.py", line 97, in create_process
    raise ex
  File "/cvmfs/cms.cern.ch/share/overrides/bin/cmssw_wm_create_process.py", line 93, in create_process
    process = my_func(*call_func_args, **func_args)
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_1_0/python/Configuration/DataProcessing/Reco.py", line 248, in alcaSkim
    cb.prepare() 
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_1_0/python/Configuration/Applications/ConfigBuilder.py", line 2170, in prepare
    self.addStandardSequences()
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_1_0/python/Configuration/Applications/ConfigBuilder.py", line 786, in addStandardSequences
    getattr(self,"prepare_"+stepName)(sequence = '+'.join(stepSpec))
  File "/cvmfs/cms.cern.ch/slc7_amd64_gcc900/cms/cmssw/CMSSW_12_1_0/python/Configuration/Applications/ConfigBuilder.py", line 1330, in prepare_ALCA
    raise Exception("The following alcas could not be found "+str(alcaList))
Exception: The following alcas could not be found ['PromptCalibProdSiPixelLA']
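
The traceback ends in `prepare_ALCA`, which raises when a requested AlCa producer name is not known to the release in use, here `PromptCalibProdSiPixelLA` with CMSSW_12_1_0. The kind of check involved can be sketched as below. This is an illustration only and not the ConfigBuilder code; `find_missing_alcas` and the list of available producers are hypothetical:

```python
def find_missing_alcas(requested, available):
    """Return the requested AlCa producer names that are not available."""
    return [name for name in requested if name not in available]


# Hypothetical contents for illustration only: the real set of producers
# depends on the CMSSW release and is not reproduced here.
available_in_release = {"PromptCalibProdSiPixel", "PromptCalibProdSiStrip"}
requested = ["PromptCalibProdSiPixel", "PromptCalibProdSiPixelLA"]

missing = find_missing_alcas(requested, available_in_release)
if missing:
    # Same wording as the exception text in the traceback above.
    message = "The following alcas could not be found " + str(missing)
```

Running such a check against the release before requesting a replay would surface a missing producer without waiting for job creation to fail.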

etc/ReplayOfflineConfiguration.py: two review comments (outdated, resolved)
@tvami tvami force-pushed the TestPCL_forSiPixelLA branch 2 times, most recently from 7be1dc4 to b115c88 Compare November 19, 2021 13:30
@tvami
Contributor Author

tvami commented Nov 19, 2021

run replay please

@cmsdmwmbot

Replay testing PR '2018 replay for Pixel LA PCL test'
An automatic replay has been requested by tvami.
Here is a brief description of the replay.
Github PR : #4635
PR author : tvami
Requestor : None
Injected runs : 324841
CMSSW release : CMSSW_12_1_0
Tier0 release : 3.0.1
ppScenario : ppEra_Run2_2018
Tier0 Config : https://cmst0.web.cern.ch/CMST0/tier0/offline_config/ReplayOfflineConfiguration_047.php
Contatiner ID : 1
Jenkins Build : https://cmssdt.cern.ch/dmwm-jenkins/job/DMWM-T0-PR-test-job/364/
Jira Issue : https://its.cern.ch/jira/browse/CMSTZDEV-701

@cmsdmwmbot

There are 17 repack workflows.
There are 5 express workflows.
There are 1286 filesets not closed.
There are 9 paused jobs in the replay.

@cmsdmwmbot

There are 16 repack workflows.
There are 4 express workflows.
There are 2154 filesets not closed.
There are 29 paused jobs in the replay.

@tvami
Contributor Author

tvami commented Feb 11, 2022

I updated this, and we should run it, but #4644 should have priority

@tvami
Contributor Author

tvami commented Feb 13, 2022

run replay please

@tvami
Contributor Author

tvami commented Feb 13, 2022

run replay please

Since the other one converged

@cmsdmwmbot

Replay testing PR '2018 replay for Pixel LA PCL test'
An automatic replay has been requested by tvami.
Here is a brief description of the replay.
Deployment ID: 220213200934
Github PR: #4635
PR author: tvami
Requestor: AlCaDB
Injected runs: 324841
CMSSW release: CMSSW_12_2_1
Tier0 release: 3.0.3
ppScenario: ppEra_Run2_2018
Tier0 Config: https://cmst0.web.cern.ch/CMST0/tier0/offline_config/ReplayOfflineConfiguration_047.php
Contatiner ID: 1
Jenkins Build: https://cmssdt.cern.ch/dmwm-jenkins/job/DMWM-T0-PR-test-job/422/
Jira Issue : https://its.cern.ch/jira/browse/CMSTZDEV-722

@cmsdmwmbot

Replay testing PR '2018 replay for Pixel LA PCL test'
An automatic replay has been requested by tvami.
Here is a brief description of the replay.
Deployment ID: 220213200934
Github PR: #4635
PR author: tvami
Requestor: AlCaDB
Injected runs: 324841
CMSSW release: CMSSW_12_2_1
Tier0 release: 3.0.3
ppScenario: ppEra_Run2_2018
Tier0 Config: https://cmst0.web.cern.ch/CMST0/tier0/offline_config/ReplayOfflineConfiguration_047.php
Contatiner ID: 1
Jenkins Build: https://cmssdt.cern.ch/dmwm-jenkins/job/DMWM-T0-PR-test-job/423/
Jira Issue : https://its.cern.ch/jira/browse/CMSTZDEV-723

@tvami
Contributor Author

tvami commented Mar 11, 2022

run replay please

@tvami
Contributor Author

tvami commented Mar 11, 2022

run replay please

@tvami
Contributor Author

tvami commented Mar 11, 2022

run replay please

@jhonatanamado
Contributor

run replay please

@cmsdmwmbot

Monitoring for replay is closed.
Log Begins ====
Tier0_REPLAY v426 DMWM-T0-PR-test-job on vocms047.cern.ch. 2018 replay for Pixel LA PCL test
JIRA URL : None
Monitoring closed
ORA-00942: table or view does not exist

End Of Log ====

@cmsdmwmbot

Monitoring for replay is closed.
Log Begins ====
Tier0_REPLAY v429 DMWM-T0-PR-test-job on vocms047.cern.ch. 2018 replay for Pixel LA PCL test
JIRA URL : None
Monitoring closed
ORA-00942: table or view does not exist
Return 139 : 954 job(s)
#Error message ====
INFO:root:CMSSW configured for GPU required: forbidden, with these settings: None
INFO:root:Executing CMSSW step
INFO:root:Runing SCRAM
INFO:root:Running PRE scripts
INFO:root: Invoking command:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH
/cvmfs/cms.cern.ch/COMP/slc7_amd64_gcc630/external/python3/3.8.2-comp/bin/python3 -m WMCore.WMRuntime.ScriptInvoke WMTaskSpace.cmsRun2 SetupCMSSWPset

INFO:root:RUNNING SCRAM SCRIPTS
INFO:root:Executing CMSSW. args: ['/bin/bash', '/srv/job/WMTaskSpace/cmsRun2/cmsRun2-main.sh', '', 'slc7_amd64_gcc10', 'scramv1', 'CMSSW', 'CMSSW_12_3_0_pre6', 'FrameworkJobReport.xml', 'cmsRun', 'PSet.py', '', '', '']
CRITICAL:root:Error running cmsRun
{'arguments': ['/bin/bash', '/srv/job/WMTaskSpace/cmsRun2/cmsRun2-main.sh', '', 'slc7_amd64_gcc10', 'scramv1', 'CMSSW', 'CMSSW_12_3_0_pre6', 'FrameworkJobReport.xml', 'cmsRun', 'PSet.py', '', '', '']}
Linux Return code: 139

CRITICAL:root:Error message: None
WARNING:root:Exit code: 139 has been already added to the job report
INFO:root:Steps.Executors.CMSSW.post called
INFO:root:StepName: cmsRun2, StepType: CMSSW, with result: 139
INFO:root:Steps.Executor logging started
INFO:root:Steps.Executors.StageOut.pre called
INFO:root:Steps.Executors.StageOut.execute called
INFO:root:StageOut override is: stageOut1.asyncDest = None
stageOut1.section_('override')
stageOut1.override.previousCmsRunFailure = True

#END====

End Of Log ====
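
The log reports "Linux Return code: 139" for 954 jobs without any CMSSW error message. If that value follows the common shell convention of 128 plus the signal number, it corresponds to signal 11 (SIGSEGV), that is, a crash in `cmsRun` rather than a handled exception. A minimal sketch of that decoding, assuming the convention applies here; `describe_exit_code` is a hypothetical helper:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Describe an exit code, assuming the shell convention 128 + signal."""
    if code > 128:
        try:
            return f"killed by {signal.Signals(code - 128).name}"
        except ValueError:
            # No signal with that number on this platform.
            return f"exit code {code} (no known signal {code - 128})"
    return f"exit code {code}"


description = describe_exit_code(139)
```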

@tvami
Contributor Author

tvami commented Mar 15, 2022

I reverted to CMSSW_12_2_1_patch2, which just came out.

@tvami
Contributor Author

tvami commented Mar 15, 2022

run replay please

tvami and others added 2 commits March 15, 2022 10:48
Ignore all streams but Express for this test. This configuration will avoid all PromptReco and skip the Repack for the ignored streamers.
@jhonatanamado
Contributor

run replay please

@cmsdmwmbot

Monitoring for replay is closed.
Log Begins ====
Tier0_REPLAY v434 DMWM-T0-PR-test-job on vocms047.cern.ch. 2018 replay for Pixel LA PCL test
JIRA URL : None
All repack workflows were processed.
All filesets were closed.
There was NO paused job in the replay.
End.
Replay was succesfull.
End.

End Of Log ====

@tvami
Contributor Author

tvami commented Mar 28, 2022

After today's presentation
https://indico.cern.ch/event/1143221/#21-sipixel-lorentz-angle-pcl-r
we can finally conclude that this is ready for production. I'm closing this PR, which was meant only for the replay.

@tvami tvami closed this Mar 28, 2022
@germanfgv
Contributor

This is being included in #4677
