-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RF] Consider relative number of events in sub-ranges for multirange fit #7719
Conversation
Starting build on |
Build failed on ROOT-fedora31/noimt. Failing tests: |
Build failed on ROOT-performance-centos8-multicore/default. Failing tests: |
Build failed on ROOT-ubuntu16/nortcxxmod. Failing tests: |
Build failed on ROOT-fedora30/cxx14. Failing tests: |
Build failed on mac1014/python3. Failing tests: |
Build failed on windows10/cxx14. Failing tests: |
The failing tests are expected here because the reference results might be affected by the bug. I will investigate this further next week. |
Build failed on mac11.0/cxx17. Failing tests: |
Build failed on ROOT-debian10-i386/cxx14. Failing tests: |
b4c2e55
to
fa314d1
Compare
Starting build on |
Build failed on ROOT-fedora31/noimt. Failing tests: |
Build failed on mac1014/python3. Failing tests: |
Build failed on ROOT-ubuntu16/nortcxxmod. Failing tests: |
Build failed on ROOT-fedora30/cxx14. Failing tests: |
Build failed on ROOT-performance-centos8-multicore/default. Failing tests: |
Starting build on |
021d576
to
819bbac
Compare
Starting build on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice improvement.
This fixes outstanding issues open since a long time,
see https://sft.its.cern.ch/jira/browse/ROOT-8440
A small comment: it would be nice to document (probably in RooAbsPdf::fitTo) than when doing a multi-range fit the pdf's are normalised in the union of the ranges and not in separate ranges.
For doing this second case (much less common), one would have to create two separate RooNLL objects
This problem affects multi-range fits with `RooAbsPdf::fitTo` and was reported in Jira issue ROOT-9530. Each RooNLLVar was created with the normalization set corresponding to the subrange, not the union range like it should be. This commit adds an extra term to the final likelihood function to cancel this normalization problem. The extra term doesn't have to be added for the extended fit, because adding extension terms for the likelihoods in each range as currently done is already equivalent to applying the normalization correction term plus adding a global extension term.
The multirange fit now considers the number of relative events in subranges even when there is no exteded term. This has to be accounted for in the reference results.
819bbac
to
4bbe3bd
Compare
Starting build on |
The PR root-project#7719 changed the behavior of multi-range fits in RooFit, and from then one the relative yields in the multiple regions were also considered in the fit. The tutorial `rf204b_extendedLikelihood_rangedFit` was explaining the old caveat that relative yields are not considered in non-extended fits, but since that is not a problem anymore, a big chunk of explanation in the tutorial should be removed. A Python translation of the tutorial was also added. Closes root-project#8808.
The PR root-project#7719 changed the behavior of multi-range fits in RooFit, and from then one the relative yields in the multiple regions were also considered in the fit. The tutorial `rf204b_extendedLikelihood_rangedFit` was explaining the old caveat that relative yields are not considered in non-extended fits, but since that is not a problem anymore, a big chunk of explanation in the tutorial should be removed. A Python translation of the tutorial was also added. Closes root-project#8808.
The PR root-project#7719 changed the behavior of multi-range fits in RooFit, and from then one the relative yields in the multiple regions were also considered in the fit. The tutorial `rf204b_extendedLikelihood_rangedFit` was explaining the old caveat that relative yields are not considered in non-extended fits, but since that is not a problem anymore, a big chunk of explanation in the tutorial should be removed. A Python translation of the tutorial was also added. Closes root-project#8808.
The PR #7719 changed the behavior of multi-range fits in RooFit, and from then one the relative yields in the multiple regions were also considered in the fit. The tutorial `rf204b_extendedLikelihood_rangedFit` was explaining the old caveat that relative yields are not considered in non-extended fits, but since that is not a problem anymore, a big chunk of explanation in the tutorial should be removed. A Python translation of the tutorial was also added. Closes #8808.
The PR root-project#7719 changed the behavior of multi-range fits in RooFit, and from then one the relative yields in the multiple regions were also considered in the fit. The tutorial `rf204b_extendedLikelihood_rangedFit` was explaining the old caveat that relative yields are not considered in non-extended fits, but since that is not a problem anymore, a big chunk of explanation in the tutorial should be removed. A Python translation of the tutorial was also added. Closes root-project#8808.
This problem affects multi-range fits with
RooAbsPdf::fitTo
and wasreported in Jira issue ROOT-9530.
Each RooNLLVar was created with the normalization set corresponding to
the subrange, not the union range like it should be. This commit adds an
extra term to the final likelihood function to cancel this normalization
problem.
The extra term doesn't have to be added for the extended fit, because
adding extension terms for the likelihoods in each range as currently
done is mathematically equivalent to applying the normalization correction term
plus adding a global extension term.
I'm not sure if I should call this a bug or not. On the one hand, the result that was computed previously was not wrong, it just didn't use all the information (relative number of events in each subrange was not considered). On the other hand, the result was not the best possible estimate given the information in all sub-ranges, and users expects the maximum likelihood estimate to be the best estimate available given the information.
There is also a PDF document that explains the math behind this pull request, in particular why the normalization correction doesn't have to be applied for the extended fit.