Run endStream concurrently as part of endJob #35097

Merged: 1 commit into cms-sw:master on Sep 1, 2021


Dr15Jones

PR description:

  • Run the different streams and SubProcesses concurrently at the EndStream transition.
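As a rough illustration of the idea in the description, the sketch below launches one end-of-stream task per stream, waits for all of them, and collects whatever exceptions were thrown. It is not the code from this PR: `EndStreamFn` and `runEndStreamsConcurrently` are hypothetical names, and `std::async` stands in for the framework's own task machinery.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <future>
#include <vector>

// Hypothetical stand-in for one stream's endStream work; not a CMSSW type.
using EndStreamFn = std::function<void(unsigned int streamID)>;

// Launch endStream for every stream concurrently, wait for all of them,
// and return the exceptions that were thrown (one entry per failing stream).
std::vector<std::exception_ptr> runEndStreamsConcurrently(unsigned int nStreams,
                                                          EndStreamFn const& endStream) {
  assert(static_cast<bool>(endStream));  // a callable must be supplied

  std::vector<std::future<void>> pending;
  pending.reserve(nStreams);
  for (unsigned int id = 0; id < nStreams; ++id) {
    // std::launch::async forces each call onto its own thread.
    pending.push_back(std::async(std::launch::async, [&endStream, id] { endStream(id); }));
  }

  std::vector<std::exception_ptr> failures;
  for (auto& f : pending) {
    try {
      f.get();  // rethrows the exception from that stream, if any
    } catch (...) {
      failures.push_back(std::current_exception());
    }
  }
  return failures;
}
```

Collecting the failures instead of stopping at the first one lets every stream finish before anything is reported.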

PR validation:

Code compiles and framework unit tests pass.

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35097/24983

  • This PR adds an extra 32KB to repository

@cmsbuild

A new Pull Request was created by @Dr15Jones (Chris Jones) for master.

It involves the following packages:

  • FWCore/Framework (core)

@makortel, @smuzaffar, @cmsbuild, @Dr15Jones can you please review it and eventually sign? Thanks.
@makortel, @wddgit this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@Dr15Jones

please test

@cmsbuild

-1

Failed Tests: ClangBuild
Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18181/summary.html
COMMIT: ac77993
CMSSW: CMSSW_12_1_X_2021-08-31-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35097/18181/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18181/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18181/git-merge-result

Clang Build

I found a compilation warning while trying to compile with clang. Command used:

USER_CUDA_FLAGS='--expt-relaxed-constexpr' USER_CXXFLAGS='-Wno-register -fsyntax-only' scram build -k -j 32 COMPILER='llvm compile'

See details on the summary page.

@makortel

From visual inspection, the only question I have is whether we should eventually do the same for beginStream.

@cmsbuild

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-35097/24984

  • This PR adds an extra 32KB to repository

@cmsbuild

Pull request #35097 was updated. @makortel, @smuzaffar, @cmsbuild, @Dr15Jones can you please check and sign again.

@Dr15Jones

please test

@cmsbuild

cmsbuild commented Sep 1, 2021

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18182/summary.html
COMMIT: 2601610
CMSSW: CMSSW_12_1_X_2021-08-31-1100/slc7_amd64_gcc900
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/35097/18182/install.sh to create a dev area with all the needed externals and cmssw changes.

The following merge commits were also included on top of IB + this PR after doing git cms-merge-topic:

You can see more details here:
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18182/git-recent-commits.json
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-cfaa9b/18182/git-merge-result

Comparison Summary

The workflows 140.53 have different files in step1_dasquery.log than the ones found in the baseline. You may want to check and retrigger the tests if necessary. You can check it in the "files" directory in the results of the comparisons

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 1303 differences found in the comparisons
  • DQMHistoTests: Total files compared: 39
  • DQMHistoTests: Total histograms compared: 3000404
  • DQMHistoTests: Total failures: 3676
  • DQMHistoTests: Total nulls: 20
  • DQMHistoTests: Total successes: 2996686
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 45.699 KiB( 38 files compared)
  • DQMHistoSizes: changed ( 140.53 ): 44.531 KiB Hcal/DigiRunHarvesting
  • DQMHistoSizes: changed ( 140.53 ): 1.172 KiB RPC/DCSInfo
  • DQMHistoSizes: changed ( 312.0 ): -0.004 KiB MessageLogger/Warnings
  • Checked 165 log files, 37 edm output root files, 39 DQM output files
  • TriggerResults: no differences found

@makortel

makortel commented Sep 1, 2021

+1

@cmsbuild

cmsbuild commented Sep 1, 2021

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@qliphy

qliphy commented Sep 1, 2021

+1

@cmsbuild cmsbuild merged commit 6c471a7 into cms-sw:master Sep 1, 2021
@makortel

makortel commented Sep 1, 2021

Could this PR be causing the many failures at the end of jobs in the IBs, along the lines of the following?

dropped waiting message count 0
----- Begin Fatal Exception 01-Sep-2021 19:40:33 CEST-----------------------
An exception of category 'MultipleExceptions' occurred while
   [0] Calling endJob
Exception Message:
Multiple exceptions were thrown while executing endJob. An exception message follows for each.
1
An exception of category 'NotFound' occurred while
   [0] Calling endStream for module MixingModule/'mix'
Exception Message:
Service no ServiceRegistry has been set for this thread 
2
An exception of category 'NotFound' occurred while
   [0] Calling endStream for module MixingModule/'mix'
Exception Message:
Service no ServiceRegistry has been set for this thread 
3
An exception of category 'NotFound' occurred while
   [0] Calling endStream for module MixingModule/'mix'
Exception Message:
Service no ServiceRegistry has been set for this thread 
----- End Fatal Exception -------------------------------------------------

https://cmssdt.cern.ch/SDT/cgi-bin/logreader/slc7_amd64_gcc900/CMSSW_12_1_X_2021-09-01-1100/pyRelValMatrixLogs/run/300.0_Pyquen_GammaJet_pt20_2760GeV+Pyquen_GammaJet_pt20_2760GeV+DIGIHIMIX+RECOHIMIX+HARVESTHI2018PPRECO/step1_Pyquen_GammaJet_pt20_2760GeV+Pyquen_GammaJet_pt20_2760GeV+DIGIHIMIX+RECOHIMIX+HARVESTHI2018PPRECO.log#/
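The message "no ServiceRegistry has been set for this thread" suggests that the service context is tracked per thread. The sketch below illustrates that general class of problem, assuming a thread-local pointer: work moved onto another thread does not inherit the context unless the task installs it itself. `Registry`, `RegistryGuard`, `currentServiceName` and `lookupOnWorker` are hypothetical names, not CMSSW APIs, and this is not the fix that was applied; that change is not shown in this thread.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <thread>

// Hypothetical stand-in for a service registry; not the CMSSW ServiceRegistry.
struct Registry {
  std::string name;
};

// Each thread has its own "current registry"; a new thread starts with none.
thread_local Registry const* tlsRegistry = nullptr;

// RAII guard: installs a registry on the calling thread, restores the old one on exit.
class RegistryGuard {
public:
  explicit RegistryGuard(Registry const* r) : previous_(tlsRegistry) { tlsRegistry = r; }
  ~RegistryGuard() { tlsRegistry = previous_; }
  RegistryGuard(RegistryGuard const&) = delete;
  RegistryGuard& operator=(RegistryGuard const&) = delete;

private:
  Registry const* previous_;
};

// Throws if the calling thread has no registry installed.
std::string currentServiceName() {
  if (tlsRegistry == nullptr) {
    throw std::runtime_error("no registry has been set for this thread");
  }
  return tlsRegistry->name;
}

// Run a lookup on a fresh thread, optionally installing the registry inside the task.
// Returns true if the lookup succeeded on that thread.
bool lookupOnWorker(Registry const* registry, bool installInTask) {
  assert(registry != nullptr);
  bool ok = false;
  std::thread worker([&ok, registry, installInTask] {
    // With installInTask == false the worker keeps its empty (null) context.
    RegistryGuard guard(installInTask ? registry : nullptr);
    try {
      currentServiceName();
      ok = true;
    } catch (std::runtime_error const&) {
      ok = false;
    }
  });
  worker.join();
  return ok;
}
```

In this toy model, a guard on the launching thread does not help the worker; the lookup only succeeds when the task sets up its own guard before touching any service.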

@qliphy

qliphy commented Sep 2, 2021

@Dr15Jones
I could reproduce the IB issue using "runTheMatrix.py -l 200 --job-reports -t 4 --ibeos":

CMSSW_12_1_X_2021-08-31-2300: works well
CMSSW_12_1_X_2021-08-31-2300 + this PR: shows the issue

"runTheMatrix.py -l 200" with this PR also runs well, which is why the PR test didn't catch this error.

Do you have a quick workaround? Otherwise, should we revert this PR, since quite a few workflows in the IB are failing?

@Dr15Jones

@qliphy I'll have a fix in a few minutes.

@Dr15Jones

@qliphy fixed in #35120
