-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
buffer overflow in RecHitsSortedInPhi on NaN #35489
Comments
A new Issue was created by @dan131riley Dan Riley. @Dr15Jones, @perrotta, @dpiparo, @makortel, @smuzaffar, @qliphy can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
assign hlt,reconstruction |
New categories assigned: hlt,reconstruction @slava77,@Martin-Grunewald,@jpata,@missirol you have been requested to review this Pull request/Issue and eventually sign? Thanks |
@dan131riley @makortel how can i reproduce the issue? (forgive my ignorance here..) |
It was seen in step2 of the ASAN builds of WF 11834.0, logs here: To reproduce, you'd need an ASAN build, e.g. CMSSW_12_1_ASAN_X_2021-09-29-2300, and probably the exact step1 input for that relval, as this appears to be a rare condition. Since this is a rare condition, I'd recommend just putting in some protection against NaNs. |
so... it took a lot of time to run in an ASAN_X build but this change: diff --git a/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h b/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
index cadec4e0d1d..b952a873cf4 100644
--- a/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
+++ b/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
@@ -3,6 +3,7 @@
#include "DataFormats/TrackerRecHit2D/interface/BaseTrackerRecHit.h"
#include "TrackingTools/DetLayers/interface/DetLayer.h"
+#include "FWCore/Utilities/interface/isFinite.h"
#include <vector>
#include <array>
@@ -33,7 +34,13 @@ public:
};
struct HitLessPhi {
- bool operator()(const HitWithPhi& a, const HitWithPhi& b) { return a.phi() < b.phi(); }
+ bool operator()(const HitWithPhi& a, const HitWithPhi& b) {
+ if (!edm::isFinite(a.phi()))
+ return false;
+ if (!edm::isFinite(b.phi()))
+ return true;
+ return a.phi() < b.phi();
+ }
};
typedef std::vector<HitWithPhi>::const_iterator HitIter;
typedef std::pair<HitIter, HitIter> Range; seems to be able to remove the address sanitizer issues in step2 of wf. 11834.0. |
or
|
as is , it does not compile.
the interface is:
This, instead: diff --git a/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h b/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
index cadec4e0d1d..e3936af535e 100644
--- a/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
+++ b/RecoTracker/TkHitPairs/interface/RecHitsSortedInPhi.h
@@ -33,7 +33,13 @@ public:
};
struct HitLessPhi {
- bool operator()(const HitWithPhi& a, const HitWithPhi& b) { return a.phi() < b.phi(); }
+ bool operator()(const HitWithPhi& a, const HitWithPhi& b) {
+ int ia, ib;
+ auto phia = a.phi();
+ auto phib = b.phi();
+ memcpy(&ia,&phia,4); memcpy(&ib,&phib,4);
+ return ia<ib;
+ }
};
typedef std::vector<HitWithPhi>::const_iterator HitIter;
typedef std::pair<HitIter, HitIter> Range;
does compile and resolves the issue, though I observe some more warnings that were not before with the solution at #35489 (comment) %MSG-w Muon: L3MuonProducer:hltL3MuonsIterL3OIScoutingNoVtx 01-Oct-2021 13:22:46 CEST Run: 1 Event: 9
Failed to load tracker track trajectory
%MSG
%MSG-w RecoMuon: L3MuonProducer:hltL3MuonsIterL3OIScoutingNoVtx 01-Oct-2021 13:22:46 CEST Run: 1 Event: 9
Failed to load tracker track trajectory
%MSG
%MSG-w GlobalMuon: L3MuonProducer:hltL3MuonsIterL3OIScoutingNoVtx 01-Oct-2021 13:22:46 CEST Run: 1 Event: 9
Failed to load tracker track trajectory
%MSG
%MSG-w GlobalTrajectoryBuilderBase: L3MuonProducer:hltL3MuonsIterL3OIScoutingNoVtx 01-Oct-2021 13:22:46 CEST Run: 1 Event: 9
Failed to load tracker track trajectory
%MSG
I can open the PR with any of these solutions, please @cms-sw/reconstruction-l2 let me know. |
I think we should cure the cause not the symptoms..
|
HLT is full of those warning (even in MC). Again would be useful to understand their origin |
Definitely not, that was @VinInn trolling. Preferences, in order best to least preferred:
Right out: anything with memcpy(). |
On 1 Oct, 2021, at 2:38 PM, Dan Riley ***@***.***> wrote:
•
memcpy(&ia,&phia,4); memcpy(&ib,&phib,4);
•
return ia<ib;
Definitely not, that was @VinInn trolling. Preferences, in order best to least preferred:
Definitively not trolling.
That's the standard way to deal with NaN in comparison
edited
OK SORRY!
I forgot the sign need to be taken into account...
It's a bit more compilcated than that... still doable (I mean I've done elsewhere)
v.
|
I have difficulties to reconcile this what you wrote earlier:
Also I don't understand how
can be OK. I guess they still need to be put somewhere in the list. |
Hits with NaN (or other illness) do not require to be loaded in RecHitsSortedInPhi that is used to build Doublets for seeds. |
I am uneasy with adding two ifs in the less operator... |
back to 1) added
generated this
and see no error report
We can put a warning (or even an error) and see if ever triggers... |
is step2 reproducible?
|
This issue could be the culprit for random failures in |
Pileup mixing is not fully reproducible when run with multiple threads because the "assignment" of input GEN events to edm streams (MinBias files for mixing are processed per stream) is not deterministic. |
ASAN found a heap buffer overflow in
RecHitsSortedInPhi::RecHitsSortedInPhi()
, very likely because phi for the track was NaN, which doesn't sort well.The text was updated successfully, but these errors were encountered: