Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

get rid of more UB in TrackListMerger #37384

Merged
merged 1 commit into from
Apr 26, 2022

Conversation

mmusich
Copy link
Contributor

@mmusich mmusich commented Mar 28, 2022

PR description:

In response to #35036 (comment), follow-up to #37308.

PR validation:

cmssw compiles, also run runTheMatrix.py -l limited -j 8 --ibeos

if this PR is a backport please specify the original PR and why you need to backport that PR:

N/A

@cmsbuild
Copy link
Contributor

+code-checks

Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-37384/29037

  • This PR adds an extra 20KB to repository

@cmsbuild
Copy link
Contributor

A new Pull Request was created by @mmusich (Marco Musich) for master.

It involves the following packages:

  • RecoTracker/FinalTrackSelectors (reconstruction)

@jpata, @cmsbuild, @clacaputo, @slava77 can you please review it and eventually sign? Thanks.
@felicepantaleo, @GiacomoSguazzoni, @JanFSchulte, @rovere, @VinInn, @ebrondol, @gpetruc, @mmusich, @mtosi, @dgulhan this is something you requested to watch as well.
@perrotta, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@mmusich
Copy link
Contributor Author

mmusich commented Mar 28, 2022

@cmsbuild, please test

@mmusich
Copy link
Contributor Author

mmusich commented Mar 29, 2022

seems tests have been stuck here for a while?

@jpata
Copy link
Contributor

jpata commented Mar 30, 2022

I checked UBSAN with this PR and the number of errors is reduced but does not completely disappear (2 errors in 100 events)

joosep@joosep-desktop-work:~/reco/37384/CMSSW_12_4_UBSAN_X_2022-03-28-1100/35434.0_TTbar_14TeV+2026D78+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal$ grep TrackListMerger step3_TTbar_14TeV+2026D78+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HARVESTGlobal.log 
/home/joosep/reco/37384/CMSSW_12_4_UBSAN_X_2022-03-28-1100/src/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc:451:45: runtime error: variable length array bound evaluates to non-positive value 0
/home/joosep/reco/37384/CMSSW_12_4_UBSAN_X_2022-03-28-1100/src/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc:452:20: runtime error: variable length array bound evaluates to non-positive value 0

@jpata
Copy link
Contributor

jpata commented Mar 30, 2022

@smuzaffar looks like the bot may need manual intervention here, could you please check? Or would "please cancel / please test" have an effect here?

@smuzaffar
Copy link
Contributor

please test
as previous run was 2 days ago, so let re-run the full tests

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-946778/23525/summary.html
COMMIT: aad6457
CMSSW: CMSSW_12_4_X_2022-03-30-1100/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37384/23525/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 8 differences found in the comparisons
  • DQMHistoTests: Total files compared: 48
  • DQMHistoTests: Total histograms compared: 3585896
  • DQMHistoTests: Total failures: 14
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 3585860
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 47 files compared)
  • Checked 200 log files, 45 edm output root files, 48 DQM output files
  • TriggerResults: no differences found

@mmusich
Copy link
Contributor Author

mmusich commented Mar 31, 2022

I checked UBSAN with this PR and the number of errors is reduced but does not completely disappear (2 errors in 100 events)

thanks @jpata I am addressing those. Running local tests now.

@jpata
Copy link
Contributor

jpata commented Apr 11, 2022

@mmusich just to check in on this - were you able to reproduce the failing UBSAN even after this PR?

@mmusich
Copy link
Contributor Author

mmusich commented Apr 11, 2022

@jpata

just to check in on this - were you able to reproduce the failing UBSAN even after this PR?

yes, but after applying the same kind of trivial fix that I have applied everywhere (which is to return before the arrays with size ngood = 0 are created)

diff --git a/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc b/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc
index 3be2d7e3db0..98ae408d4f0 100644
--- a/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc
+++ b/RecoTracker/FinalTrackSelectors/plugins/TrackListMerger.cc
@@ -448,6 +448,13 @@ void TrackListMerger::produce(edm::Event& e, const edm::EventSetup& es) {
   typedef std::pair<unsigned int, const TrackingRecHit*> IHit;
   std::vector<std::vector<IHit>> rh1(ngood);  // "not an array" of vectors!
   //const TrackingRecHit*  fh1[ngood];  // first hit...
+
+  if (ngood == 0) {
+    // output empty collections and early return
+    this->returnEmptyCollections(e);
+    return;
+  }
+
   reco::TrackBase::TrackAlgorithm algo[ngood];
   float score[ngood];

I run into an assertion failure (for a single wf in the limited matrix 23234.0):

cmsRun: /data/cmsbld/jenkins/workspace/build-any-ib/w/tmp/BUILDROOT/06ca28086ff42d83fe311221dd768eec/opt/cmssw/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_X_2022-04-10-2300/src/RecoLocalTracker/SubCollectionProducers/src/TrackClusterRemoverPhase2.cc:156: virtual void {anonymous}::TrackClusterRemoverPhase2::produce(edm::StreamID, edm::Event&, const edm::EventSetup&) const: Assertion `s == (*quals).size()' failed.

a quick solution would just be to create the arrays algo and score with size std::max(1,ngood) (I tested that it removes all remaining UBSAN warnings in TrackListMerger), but I am not sure if that's acceptable or not.

@mmusich
Copy link
Contributor Author

mmusich commented Apr 21, 2022

type tracking

@mmusich
Copy link
Contributor Author

mmusich commented Apr 26, 2022

a quick solution would just be to create the arrays algo and score with size std::max(1,ngood) (I tested that it removes all remaining UBSAN warnings in TrackListMerger), but I am not sure if that's acceptable or not.

@cms-sw/reconstruction-l2 any thoughts on this came up in the past 15 days?

@jpata
Copy link
Contributor

jpata commented Apr 26, 2022

Naively, why does the assert in TrackClusterRemoverPhase2.cc:156 get triggered? By a quick look, it's comparing the number of tracks (apparently not 0) to the number of quals (which should be 0, because there were no tracks).

@mmusich
Copy link
Contributor Author

mmusich commented Apr 26, 2022

By a quick look, it's comparing the number of tracks (apparently not 0) to the number of quals (which should be 0, because there were no tracks).

this is quite literally what the assertion message is saying.
I am rather asking if the solution I proposed at #37384 (comment) is acceptable.
If not, I would suggest to merge to the PR as it is in its current state because I won't have time to track down the remaining issue and as far as I understand it is already a step forward as the number of errors is decreased (as per #37384 (comment))

@jpata
Copy link
Contributor

jpata commented Apr 26, 2022

this is quite literally what the assertion message is saying.

Sure, I'm trying to undestand why?

If not, I would suggest to merge to the PR as it is in its current state because I won't have time to track down the remaining issue and as far as I understand it is already a step forward as the number of errors is decreased (as per

Sounds fine to me.

@jpata
Copy link
Contributor

jpata commented Apr 26, 2022

+reconstruction

  • technical, reduces (but does not completely remove) UBSAN warnings in TrackListMerger

@cmsbuild
Copy link
Contributor

This pull request is fully signed and it will be integrated in one of the next master IBs (tests are also fine). This pull request will now be reviewed by the release team before it's merged. @perrotta, @dpiparo, @qliphy (and backports should be raised in the release meeting by the corresponding L2)

@perrotta
Copy link
Contributor

please test
(27 days old tests must be refreshed)

@cmsbuild
Copy link
Contributor

+1

Summary: https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-946778/24218/summary.html
COMMIT: aad6457
CMSSW: CMSSW_12_4_X_2022-04-25-2300/slc7_amd64_gcc10
User test area: For local testing, you can use /cvmfs/cms-ci.cern.ch/week0/cms-sw/cmssw/37384/24218/install.sh to create a dev area with all the needed externals and cmssw changes.

Comparison Summary

@slava77 comparisons for the following workflows were not done due to missing matrix map:

  • /data/cmsbld/jenkins/workspace/compare-root-files-short-matrix/data/PR-946778/39434.75_TTbar_14TeV+2026D88_HLT75e33+TTbar_14TeV_TuneCP5_GenSimHLBeamSpot14+DigiTrigger+RecoGlobal+HLT75e33+HARVESTGlobal

Summary:

  • No significant changes to the logs found
  • Reco comparison results: 6 differences found in the comparisons
  • DQMHistoTests: Total files compared: 49
  • DQMHistoTests: Total histograms compared: 3695434
  • DQMHistoTests: Total failures: 13
  • DQMHistoTests: Total nulls: 1
  • DQMHistoTests: Total successes: 3695398
  • DQMHistoTests: Total skipped: 22
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.004 KiB( 48 files compared)
  • DQMHistoSizes: changed ( 312.0 ): 0.004 KiB MessageLogger/Warnings
  • Checked 205 log files, 45 edm output root files, 49 DQM output files
  • TriggerResults: no differences found

@perrotta
Copy link
Contributor

+1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants