-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RF] Re-implement multi-range fits #11455
Conversation
6d09731
to
72d1028
Compare
c3a3c35
to
66175f9
Compare
Starting build on |
The function that created the RooAddPdf cache elements was completely broken once either the AddPdf or one of its components had a custom normalization set defined via `RooAbsPdf::setNormSet()`. To make this work, I was rethinking and rewriting the cache creation logic to hopefully cover more cases correctly. There are some PRs coming up where I will use `RooAbsPdf::setNormSet()` for all ranged fits, so the changes in this commit will get a lot of test coverage from that.
In the `RooBinSamplingPdf`, the underlying PDF is evaluated with a call to `getVal()` without normalization set. This is not a good idea, becaues it implicitly assumes that the PDF was evaluated with the right normalization range the last time and breaks the case where you set the normalization range of the PDF with `setNormRange()`. It is better to evaluate the PDF explicitly normalized and then declate the RooBinSamplingPdf as self normalized. Some unnecessary printouts in the corresponding unit tests are also removed.
Now that defining the normalization range via `RooAbsPdf::setNormRange` works, we can directly use it to create the right normalization integral for the top-level PDF for the new BatchMode, and we don't need to create any further correction integrals.
Multi-range fits in RooFit are more complicated than they should be. In principle, all that is required is to change the normalization range of the PDF to the union of the ranges. There is a RooAbdPdf interface suggesting that this could be done easily like this: ```C++ pdf.setNormRange("range1,range2") ``` But this didn't wor for RooAddPdfs, which is probably why it was chosen to implement mulit-range fits as a sum of separate RooNLLVars. In the old test statistics framework. But in this case, the PDFs are normalized separately, and extra terms need to be introduced to correct for that. This resulted in lots of complicated code, and still there are issues like root-project#11447, i.e. is still doesn't work for simultaneous fits. Now that in the previous commits the `RooAddPdf` behavior for `setNormRange()` was fixed, one can actually use comma-separated normalization ranges for a single PDF in a multi-range fit. With this commit, this is done in the old RooFit test statistics classes. This has sevaral advantages: 1. Fixes issues with multi-range simultaneous fits 2. It's now in harmony with the logic in the new BatchMode 3. Some speedup because there are less nodes in the computation graph 4. Less code required 5. Logic is easier to understand Maybe there are also new bugs now, but they can be fixed later. I am sure that already now the commit fixes more issues that it creates.
With the recent refactoring of ranged fits, the RooAddPdf configuration when you do a ranged fits changed in the construction of the test statistic. Before: reset default range of the observables to the fitting range and create named range for the coefficient normalization. After: keep default range of the observables and coefficient normalization, and only change the normalization range of the PDF. With the refactoring, there is no test coverage for the first case anymore. That's why this commit suggests a new unit test to cover the RooAddPdf configurations that were covered by the test statistics formally.
Before, the `checkIfRangesOverlap` function was used in `RooAbsPdf::createNLL`, which is why it had some command arguments that accomodated specifically for that usecase. This is not necessary anymore.
The new implementation of multi-ranged fits is not relying on the `RooAbsData::reduce()` function to work with multiple comma-separated ranges. Hence, this is implemented now.
Now that the multi-ranged fits for RooSimultaneous got re-implemented in a better way with hopefully less bugs, the closure check for multi-range fits in `testRooSimultaneous` succeeds, so it should be done in the test. Closes root-project#11447.
Starting build on |
Just pushed again with the fixup commits squashed |
Build failed on ROOT-ubuntu2004/python3. Errors:
|
Build failed on mac1015/cxx17. Failing tests: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice improvements! Thank you Jonas!
Almost a year ago, I fixed the support for comma-separated normalization ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also for RooAddPdfs. As a result, the logic for multi-range likelihood fits was removed from `createNLL()`, because the multi-range fit didn't have to be treated as a special case anymore. The same applies also to chi-square fits. In fact, the reason why the `RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a `RooAbsPdf` overload was *only* this workaround! For regular RooAbsReals, the workaround was not necessary, because there is no normalization. Therefore, quite a few functions were removed in this commit. This is a follow-up to PR root-project#11455 (commit fa10523 in particular), where the same change was already make for regular likelihoods.
Almost a year ago, I fixed the support for comma-separated normalization ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also for RooAddPdfs. As a result, the logic for multi-range likelihood fits was removed from `createNLL()`, because the multi-range fit didn't have to be treated as a special case anymore. The same applies also to chi-square fits. In fact, the reason why the `RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a `RooAbsPdf` overload was *only* this workaround! For regular RooAbsReals, the workaround was not necessary, because there is no normalization. Therefore, quite a few functions were removed in this commit. The multi-range chi2 fit is now also validated by the multi-range fit unit test in `testRooAbsPdf`. This is a follow-up to PR root-project#11455 (commit fa10523 in particular), where the same change was already make for regular likelihoods.
Almost a year ago, I fixed the support for comma-separated normalization ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also for RooAddPdfs. As a result, the logic for multi-range likelihood fits was removed from `createNLL()`, because the multi-range fit didn't have to be treated as a special case anymore. The same applies also to chi-square fits. In fact, the reason why the `RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a `RooAbsPdf` overload was *only* this workaround! For regular RooAbsReals, the workaround was not necessary, because there is no normalization. Therefore, quite a few functions were removed in this commit. The multi-range chi2 fit is now also validated by the multi-range fit unit test in `testRooAbsPdf`. This is a follow-up to PR root-project#11455 (commit fa10523 in particular), where the same change was already make for regular likelihoods.
Almost a year ago, I fixed the support for comma-separated normalization ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also for RooAddPdfs. As a result, the logic for multi-range likelihood fits was removed from `createNLL()`, because the multi-range fit didn't have to be treated as a special case anymore. The same applies also to chi-square fits. In fact, the reason why the `RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a `RooAbsPdf` overload was *only* this workaround! For regular RooAbsReals, the workaround was not necessary, because there is no normalization. Therefore, quite a few functions were removed in this commit. The multi-range chi2 fit is now also validated by the multi-range fit unit test in `testRooAbsPdf`. This is a follow-up to PR #11455 (commit fa10523 in particular), where the same change was already make for regular likelihoods.
Almost a year ago, I fixed the support for comma-separated normalization ranges for pdfs, e.g. `pdf.setNormRange("range1,range2")` was fixed also for RooAddPdfs. As a result, the logic for multi-range likelihood fits was removed from `createNLL()`, because the multi-range fit didn't have to be treated as a special case anymore. The same applies also to chi-square fits. In fact, the reason why the `RooChi2Var` constructor and `RooAbsReal::createChi2()` methods had a `RooAbsPdf` overload was *only* this workaround! For regular RooAbsReals, the workaround was not necessary, because there is no normalization. Therefore, quite a few functions were removed in this commit. The multi-range chi2 fit is now also validated by the multi-range fit unit test in `testRooAbsPdf`. This is a follow-up to PR root-project#11455 (commit fa10523 in particular), where the same change was already make for regular likelihoods.
Multi-range fits in RooFit are more complicated than they should be.
In principle, all that is required is to change the normalization range of the PDF to the union of the ranges.
There is a RooAbdPdf interface to suggest that this could be done easily like this:
pdf.setNormRange("range1,range2")
But this doesn't work well for RooAddPdfs, which is probably why it was chosen to implement mulit-range fits as a sum of separate RooNLLVars. But in this case, the PDFs are normalized separately, and extra terms need to be introduced to correct for that. This resulted in lots of complicated code, and still there are issues like #11447, i.e. is still doesn't work for simultaneous fits.
That's why I decided to fix the
setNormRange()
for RooAddPdfs, and then starting from that re-implement multi-ranged fits in both the new BatchMode and the old RooFit based on that.Closes #11447.