Adding parameters needed for PUjetID 94X and 102X #28827
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28827/13548
A new Pull Request was created by @alefisico (Alejandro Gomez Espinosa) for master. It involves the following packages: RecoJets/JetProducers. @perrotta, @cmsbuild, @slava77 can you please review it and eventually sign? Thanks. cms-bot commands are listed here
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28827/13551
@cmsbuild please test
The tests are being triggered in jenkins.
+1
Comparison job queued.
The code-checks are being triggered in jenkins.
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-28827/13636
@cmsbuild please test
The tests are being triggered in jenkins.
+1
Comparison job queued.
Comparison is ready.
@alefisico In these tests, the discriminator seems to always peak at -1, regardless of how much pileup is in the sample. Was this also the case in the tests done in 10_2_X? It looks fishy to me.
Did the code change so that jets with pt > 50 GeV return -1? |
Is there any update on the explanation of the outputs? |
#4 Eta Categories 0-2.5 2.5-2.75 2.75-3.0 3.0-5.0
#Tight Id
Pt010_Tight = cms.vdouble( 0.69, -0.35, -0.26, -0.21),
The values appear to be the same.
Do we really need to copy-paste this fairly large number of values?
Simply
full_94x_chs_wp = full_102x_chs_wp.clone()
would work.
The same comment applies to the other points.
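As a rough illustration of why cloning still leaves room for the values to diverge later, the sketch below uses plain Python dictionaries and copy.deepcopy as stand-ins for cms.PSet and its clone() method (an assumption made so the snippet runs outside CMSSW). Only the Pt010_Tight values come from the diff; the retuned number at the end is hypothetical.

```python
import copy

# Plain-dict stand-in for the 102X working-point PSet.
# Only Pt010_Tight is shown; its values are the ones quoted in the diff.
full_102x_chs_wp = {
    "Pt010_Tight": [0.69, -0.35, -0.26, -0.21],
}

# Stand-in for: full_94x_chs_wp = full_102x_chs_wp.clone()
full_94x_chs_wp = copy.deepcopy(full_102x_chs_wp)

# The copy starts out identical to the original.
assert full_94x_chs_wp == full_102x_chs_wp

# It can be retuned later without touching 102X.
# The value 0.50 is hypothetical, purely for illustration.
full_94x_chs_wp["Pt010_Tight"][0] = 0.50

print(full_102x_chs_wp["Pt010_Tight"])  # [0.69, -0.35, -0.26, -0.21]
print(full_94x_chs_wp["Pt010_Tight"])   # [0.5, -0.35, -0.26, -0.21]
```

The point of the sketch is only that a clone is an independent object, so starting from a clone does not prevent per-year values from being introduced later.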
These numbers are the same for now, but that is preliminary: they may change in the near future and may no longer be identical across all years. Perhaps it is better to include the structure now?
Looking back, indeed the values of the working points are rather different for different training cases.
What are the targets in training?
If the development strategy is to derive the working points from some fixed numerical target (like x% efficiency or rejection), then there would very likely be no accidental agreement. This, however, doesn't seem to be the case this time.
If the WPs are weakly defined [this time?] with a replacement based on judgement calls,
it would be OK to copy-paste if the older values can update independently.
However it sounds like the development strategy this time is to first introduce placeholders to be replaced after some later processing.
[I'm judging from the history of the past updates and guessing from the symptoms of potentially broken WPs visible in the validation results].
If you feel like a shorter new_wp = old_wp.clone() #TEMPORARY
is less acceptable,
please at least add an explicit comment inline that these are to be replaced soon.
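To make the "fixed numerical target" remark above concrete: if a working point is defined as the cut that keeps a given fraction of real jets, two different trainings will rarely land on the same number. A minimal sketch, using made-up discriminator scores and a hypothetical helper cut_for_efficiency (neither comes from this PR):

```python
def cut_for_efficiency(scores, target_eff):
    """Return the cut such that about target_eff of the scores satisfy score >= cut."""
    ranked = sorted(scores, reverse=True)
    # Number of jets that must pass, rounded to the nearest whole jet (at least one).
    n_pass = max(1, round(target_eff * len(ranked)))
    return ranked[n_pass - 1]

# Toy discriminator outputs for real (non-pileup) jets from two hypothetical trainings.
scores_training_a = [0.95, 0.90, 0.82, 0.75, 0.60, 0.41, 0.33, 0.10, -0.20, -0.55]
scores_training_b = [0.97, 0.88, 0.79, 0.66, 0.52, 0.37, 0.21, -0.05, -0.31, -0.62]

# Same 80% efficiency target applied to both trainings.
cut_a = cut_for_efficiency(scores_training_a, 0.80)
cut_b = cut_for_efficiency(scores_training_b, 0.80)

print(cut_a, cut_b)  # 0.1 -0.05
```

With the same target, the two toy trainings give different cuts, which is why identical working points across trainings suggest placeholders rather than derived values.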
tmvaWeights = cms.FileInPath("RecoJets/JetProducers/data/pileupJetId_102X_Eta2p5To2p75_chs_BDT.weights.xml.gz"),
tmvaVariables = cms.vstring(
"nvtx" ,
"dR2Mean" ,
It looks like the same vector of variable names repeats for the first 3 eta ranges. These can be initialized from a separately defined (temporary) vector.
True, I will change it
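The suggestion about a separately defined vector could look roughly like the sketch below. It uses plain Python lists and dictionaries as stand-ins for cms.vstring and cms.PSet so that it runs outside CMSSW. Only the two variable names and the Eta2p5To2p75 file name are taken from the diff; the other two eta tags are assumed to follow the same naming pattern.

```python
# Variable names shared by the first three eta ranges, defined once.
# Only the first two names are visible in the diff excerpt; the full list is longer.
_shared_tmva_variables = ["nvtx", "dR2Mean"]

# Eta tags for the first three ranges. "Eta2p5To2p75" appears in the diff;
# the other two follow the same naming pattern and are assumptions here.
_eta_tags = ["Eta0p0To2p5", "Eta2p5To2p75", "Eta2p75To3p0"]

trainings_102x = [
    {
        "tmvaWeights": "RecoJets/JetProducers/data/pileupJetId_102X_%s_chs_BDT.weights.xml.gz" % tag,
        # Each training gets its own copy, so editing one does not alter the others.
        "tmvaVariables": list(_shared_tmva_variables),
    }
    for tag in _eta_tags
]

for training in trainings_102x:
    print(training["tmvaWeights"], training["tmvaVariables"])
```

In the actual configuration the shared list would presumably be wrapped in cms.vstring for each training; the copy per entry mirrors the fact that each training PSet owns its own parameter.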
###################################################################################################################
full_94x_chs = cms.PSet(
impactParTkThreshold = cms.double(1.),
Is anything changing here other than the training file names and the working point PSet?
If not, a more compact and perhaps more maintainable way would be to use
full_94x_chs = full_102x_chs.clone(JetIdParams = full_94x_chs_wp)
full_94x_chs.trainings[0].tmvaWeights = "RecoJets/JetProducers/data/pileupJetId_94X_Eta0p0To2p5_chs_BDT.weights.xml.gz"
...
Listing the older IDs first may also be more maintainable when new ones are added in the future.
Same as the previous comment: we are currently using the same WP values for these years, but they may change in the near future. What is your suggestion?
full_94x_chs = full_102x_chs.clone(JetIdParams = full_94x_chs_wp)
already applies a separate working-point PSet for 94X vs 102X. The suggestion in the earlier comment simply avoids repeating what is known to be the same.
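The clone-and-override pattern discussed in this thread can be sketched as follows. The clone helper here is a hypothetical stand-in for cms.PSet.clone(), written with plain dictionaries so the example runs outside CMSSW. The 94X file name and the impactParTkThreshold value are from the diff; the 102X Eta0p0To2p5 file name is assumed from the naming pattern.

```python
import copy

def clone(pset, **overrides):
    """Hypothetical stand-in for cms.PSet.clone(): deep-copy, then replace named parameters."""
    new = copy.deepcopy(pset)
    new.update(overrides)
    return new

# Working points: 94X starts as a copy of 102X but is a separate object.
full_102x_chs_wp = {"Pt010_Tight": [0.69, -0.35, -0.26, -0.21]}
full_94x_chs_wp = clone(full_102x_chs_wp)

full_102x_chs = {
    "impactParTkThreshold": 1.0,
    "JetIdParams": full_102x_chs_wp,
    "trainings": [
        # File name assumed from the naming pattern seen elsewhere in the diff.
        {"tmvaWeights": "RecoJets/JetProducers/data/pileupJetId_102X_Eta0p0To2p5_chs_BDT.weights.xml.gz"},
    ],
}

# Everything is inherited from 102X except what is explicitly overridden.
full_94x_chs = clone(full_102x_chs, JetIdParams=full_94x_chs_wp)
full_94x_chs["trainings"][0]["tmvaWeights"] = (
    "RecoJets/JetProducers/data/pileupJetId_94X_Eta0p0To2p5_chs_BDT.weights.xml.gz"
)

print(full_94x_chs["impactParTkThreshold"])            # inherited: 1.0
print(full_94x_chs["trainings"][0]["tmvaWeights"])     # 94X weights file
print(full_102x_chs["trainings"][0]["tmvaWeights"])    # 102X entry is unchanged
```

Only the parameters that differ between the two IDs are spelled out, while the 94X working points remain a distinct object that can be updated independently.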
Hi all,
PR description:
PR validation:
The runTheMatrix tests ran successfully; the output is here: https://cernbox.cern.ch/index.php/s/XNTEPDN5VU01O84