fix: Use MLEs of NPs to create sampling distributions in ToyCalculator #1610
Conversation
Thanks for the PR @masonproffitt. This looks good, but I'll review it in full tomorrow (Saturday). There was already some discussion on the IRIS-HEP Slack that I want to follow up on, but I'll probably want to summarize some of those points in Issue #1426 so that things will be public and searchable.
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #1610      +/-   ##
==========================================
- Coverage   97.70%   97.70%   -0.01%
==========================================
  Files          63       63
  Lines        4050     4048       -2
  Branches      576      576
==========================================
- Hits         3957     3955       -2
  Misses         54       54
  Partials       39       39
```

Flags with carried forward coverage won't be shown.
Just a note to self. I'll push this in.
Yeah, for the following (@masonproffitt's script with some edits):

```python
# issue_1426.py
import argparse

import numpy as np

import pyhf

if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Determine number of toys.")
    parser.add_argument(
        "--ntoys",
        dest="ntoys",
        type=int,
        default=2000,
        help="Number of pseudo-experiments to perform.",
    )
    args = parser.parse_args()

    spec = {
        "channels": [
            {
                "name": "A",
                "samples": [
                    {
                        "name": "signal",
                        "data": [0.9],
                        "modifiers": [
                            {"name": "mu", "type": "normfactor", "data": None}
                        ],
                    },
                    {
                        "name": "background",
                        "data": [0.1],
                        "modifiers": [
                            {
                                "name": "background_normalization",
                                "type": "normfactor",
                                "data": None,
                            }
                        ],
                    },
                ],
            },
            {
                "name": "B",
                "samples": [
                    {
                        "name": "signal",
                        "data": [0.1],
                        "modifiers": [
                            {"name": "mu", "type": "normfactor", "data": None}
                        ],
                    },
                    {
                        "name": "background",
                        "data": [0.9],
                        "modifiers": [
                            {
                                "name": "background_normalization",
                                "type": "normfactor",
                                "data": None,
                            }
                        ],
                    },
                ],
            },
        ]
    }
    model = pyhf.Model(spec)
    data = [15, 90]
    test_mu = 7.0

    CLs_obs_asymptotics, CLs_exp_asymptotics = pyhf.infer.hypotest(
        test_mu,
        data,
        model,
        par_bounds=[(0, 30), (0, 300)],
        test_stat="qtilde",
        return_expected=True,
        calctype="asymptotics",
    )
    print(
        f"Asymptotics: Observed: {CLs_obs_asymptotics}, Expected: {CLs_exp_asymptotics}"
    )

    np.random.seed(0)
    CLs_obs_toybased, CLs_exp_toybased = pyhf.infer.hypotest(
        test_mu,
        data,
        model,
        par_bounds=[(0, 30), (0, 300)],
        test_stat="qtilde",
        return_expected=True,
        calctype="toybased",
        ntoys=args.ntoys,
    )
    print(f"Toybased: Observed: {CLs_obs_toybased}, Expected: {CLs_exp_toybased}")
```

we see this effect more prominently as the number of pseudo-experiments ramps up:

```console
# On master
$ python issue_1426.py --ntoys 2_000
Asymptotics: Observed: 0.42483777279816676, Expected: 0.11233914460989139
Toybased: Observed: 0.444, Expected: 0.0028964518464880524

# PR 1610
$ python issue_1426.py --ntoys 2_000
Asymptotics: Observed: 0.42483777279816676, Expected: 0.11233914460989139
Toybased: Observed: 0.4443237370994025, Expected: 0.13147410358565736
```

```console
# On master
$ python issue_1426.py --ntoys 10_000
Asymptotics: Observed: 0.42483777279816676, Expected: 0.11233914460989139
Toybased: Observed: 0.4251, Expected: 0.0015866147410933218

# PR 1610
$ python issue_1426.py --ntoys 10_000
Asymptotics: Observed: 0.42483777279816676, Expected: 0.11233914460989139
Toybased: Observed: 0.43241200604621033, Expected: 0.11898498909379338
```
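The size of the bias can be illustrated with a standalone NumPy sketch (illustrative only, with made-up single-bin numbers; this is not pyhf's implementation): when the MLE of a nuisance parameter sits far from its initial value, toys generated at the initial parameters are centered nowhere near the data, so the sampling distribution built from them is badly misplaced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Single-bin counting model: expected count = b_norm * b0 (background-only)
b0 = 10.0
n_obs = 90  # observed data, far from the initial expectation

# Initial vs profiled (MLE) value of the background normalization
b_norm_init = 1.0
b_norm_mle = n_obs / b0  # analytic MLE for one Poisson bin: 9.0

# Toys thrown at the initial parameters vs the fitted (MLE) parameters
toys_init = rng.poisson(b_norm_init * b0, size=10_000)
toys_mle = rng.poisson(b_norm_mle * b0, size=10_000)

# The two sampling distributions are centered in very different places
print(toys_init.mean())  # ~10: nowhere near the observed count of 90
print(toys_mle.mean())   # ~90: consistent with the data
```

Any test statistic evaluated on the first set of toys describes an experiment that never resembled the observed data, which is why the expected CLs values on `master` come out so far from the asymptotic ones.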
Thanks for the PR @masonproffitt. 👍 I'm going to approve this (thanks for the nice example, by the way). However, I'll need @lukasheinrich or @kratsg to review it as well to make sure I'm not missing something else that should get patched (it is late at night for me as I review). This is also a big enough change that it will probably warrant an immediate patch release.
Description

Resolves #1426.

In short, the problem is a significant divergence between `AsymptoticCalculator` and `ToyCalculator` for models whose NP MLEs are not very close to the initial parameter values. This is caused by the fact that `ToyCalculator` generates MC toys based on the initial parameter values rather than the profiled MLE values. This behavior is inconsistent with the default RooStats configuration and with the method that `ToyCalculator` itself uses to calculate the test-statistic values of these MC toys. It invariably leads to inaccurate test-statistic distributions for both the background-only hypothesis toys and the signal + background hypothesis toys.

This can easily be demonstrated by a model with a `normfactor` (inspired by a simplified version of the ABCD method that I'm using in my analysis). Using the current `master` (02b1951) and running the reproduction script above, the observed values are close enough, but the expected values are very different. This would lead to completely different expected upper limits on `mu`.

After applying the changes in this PR, the expected values are at least as compatible as the observed values.

Note that if I do actually want expected CLs values assuming the initial parameter values are the true ones, I can still get them using `model.expected_data(model.config.suggested_init())`, which gives results in line with the first (expected) result.
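That workaround can be sketched conceptually in plain NumPy (hypothetical single-bin numbers, not the model above, and not pyhf's actual code path): taking the expectation at the initial parameters as Asimov-style pseudo-data yields an "expected" result computed under those initial values.

```python
import numpy as np

rng = np.random.default_rng(0)

# Background expectation at the suggested initial parameters (hypothetical value)
b0 = 10.0

# Asimov-style pseudo-data: use the expectation itself as the "observed" count,
# analogous in spirit to model.expected_data(model.config.suggested_init())
expected_count = b0

# Background-only toys thrown at those same initial parameters
toys = rng.poisson(b0, size=100_000)

# Tail probability of fluctuating at least as high as the expected count
p_expected = np.mean(toys >= expected_count)
print(p_expected)  # ~0.54 for Poisson(10): P(X >= 10)
```

Because the pseudo-data and the toys here come from the same parameter point, the result is self-consistent by construction, which is the behavior one opts into when deliberately assuming the initial values are the true ones.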
Checklist Before Requesting Reviewer
Before Merging
For the PR Assignees: