Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Anomalous couplings histograms merged #644

Open
wants to merge 56 commits into
base: 102x
Choose a base branch
from

Conversation

skyriacoCMS
Copy link

Addition of physics models for HIG-19-009 anomalous coupling analysis.
This includes the models for:

  • NonSMEFT fai
  • SMEFT fai
  • and Lagrangian couplings scans.

hroskes and others added 30 commits June 4, 2018 15:58
Merging fix for compilation error originally targetted at 94x
- compiler warning suggests its redundant to check for NULL in these
  cases
…ombinedLimit into anomalous-couplings-histograms
…lysis-CombinedLimit into anomalous-couplings-histograms
…ombinedLimit into anomalous-couplings-histograms
hroskes and others added 16 commits August 18, 2019 12:55
… case that's needed

(it is for my case because it is very easy to get negative probability if you're not careful)
Revert "class to put systematics in the datacard with linear interpolation in case that's needed"

This reverts commit d8862da.
…imit into anomalous-couplings-histograms

Conflicts:
	python/SpinZeroStructure.py
…Limit into anomalous-couplings-histograms

Conflicts:
	python/ModelTools.py
@usarica
Copy link
Contributor

usarica commented Feb 11, 2021

Hello @skyriacoCMS

Thanks for committing the changes in the physics model. However, since the changes are not tested in the off-shell width formalism, is it possible to commit the SpinZeroStructure.py as a separate file? Once the model is also tested on off-shell models, which use the same file, we can make a separate PR merging the two files.

@usarica
Copy link
Contributor

usarica commented Feb 11, 2021

(Note that committing it as a separate file doesn't change any operation other than specifying the new file name in the t2ws command, the object could even have the same name.)

Copy link
Collaborator

@adewit adewit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the changes to the HZZ classes - is this the only analysis that uses these or are there others (if so, are we sure that the changes don't affect those analyses? I think the only changes in the classes are additions of some methods, but easy to miss things when just reading the code)

Also - shouldn't the ClassDef of the existing classes be incremented? (I'm not 100% sure if this is the case, and couldn't quickly find an example of what's needed in the specific situation where the inheritance structure changes)

Finally a few cosmetic things: one of files that is showing as modified is test/diffNuisances.py, but the diff shows Empty file for it. Do you know what happened?
I'm also wondering if (some of) the commits could be squashed, especially since most of them relate to the physics model itself and Ulascan has already asked if the physics model could be committed as a new file.

src/utils.cc Show resolved Hide resolved
@usarica
Copy link
Contributor

usarica commented Feb 12, 2021

@adewit I actually second the ClassDef version comment; I missed this detail when I commented. The classes are also used in the previous HIG-18-002 analysis, and since their inheritance changes, the version should be incremented to allow compatibility with PDFs imported already in previous analyses.

@skyriacoCMS
Copy link
Author

Hello @adewit and @usarica,

Thanks for your comments. I am working on addressing the points you bring up.

@skyriacoCMS
Copy link
Author

Dear @adewit,

About the changes to the HZZ classes - is this the only analysis that uses these or are there others (if so, are we sure that the >changes don't affect those analyses? I think the only changes in the classes are additions of some methods, but easy to miss >things when just reading the code)

We tested that our changes do not affects anything else and they are indeed simply additions. Originally we were thinking about using a class like that instead of histograms, but it got too complicated.

Also - shouldn't the ClassDef of the existing classes be incremented? (I'm not 100% sure if this is the case, and couldn't quickly > find an example of what's needed in the specific situation where the inheritance structure changes)

I think what you say is correct.

Finally a few cosmetic things: one of files that is showing as modified is test/diffNuisances.py, but the diff shows Empty file for it. > Do you know what happened?

We fixed some bugs but it seems the same bug fixes happened independently in the branch and when me merged we removed our changes but probably it registered as a change.

I'm also wondering if (some of) the commits could be squashed, especially since most of them relate to the physics model itself > and Ulascan has already asked if the physics model could be committed as a new file.

What do you mean by squashed? I can put the changes to the physics model as new dependent file if this is what you suggest.

About the code comment in src/utils.cc:
I'm probably missing something, but why do we need this? Why would we declare a RooCategory parameter as a flatParam?

This is actually important, we do declare a RooCategory as a flatParam in our analysis. Normally (using the convention where we fold the sign into the definition of fai, and fa1 is always positive) we float each fai and then define fa1 = 1 - sum(|other fai|). However Heshy found sometimes that he was able to get a better minimization in some fai regions if the order of the fai definition was changed , so that we could float fa1 between 0 and 1 and define for example |fL1| = 1 - sum(|other fai|). In this case, even though |fL1| is a RooFormulaVar defined in terms of the other parameters, we still needed to float the sign of fL1. This was done with a RooCategory. The relevant lines are here:
https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/644/files#diff-ab7e678783749b2173d108f426769263af625502f70474ad997edc89fb17ab84R208-R221

Heshy used this multiple times and it worked great, combine already knows what to do with it. It sets the sign to positive and minimizes wrt everything else, then sets the sign to negative and minimizes wrt everything else, and takes the min of the two. We would be happy to dig up the output and share with you if you would like to see it.

Thank you,
Savvas.

@usarica
Copy link
Contributor

usarica commented Feb 16, 2021

Hi @skyriacoCMS

Re. how fa1 is obtained, how is it ensured that fa1>=0 is always the case? The fai as defined here would need to apply such constraint. Is there any redefinition of the parameter space or a constraint in normalizations to ensure this? Separately (but you might be using the same workflow as well depending on whether you do it in dedicated classes or in the physics model), how do you ensure this is the case for pdfs?

@nucleosynthesis
Copy link
Contributor

nucleosynthesis commented Feb 16, 2021

We're mixing up a few points here so it might be better to discuss some of this on the combination hn, rather than in this PR thread.

For what is contained in the PR itself, @skyriacoCMS I think Adinda has picked out the main points and you have answered, and hopefully its clear what to do. Savvas, I think when Adinda says "squash" she just means to essentially turn N commits into M commits (M<N). We can also do it in the merge but then we really turn it into 1 commit, while it could be cleaner to keep separate pieces as separate commits in case we need to unpick things later. I think the main one here is to split out the physics model as a separate one (as @usarica suggested)

Now, turning to the behaviour of combine with the RooCategory, what you say about the behaviour of combine is news to me. it's true that combine will do separate fits and then find the one that gives the lowest NLL but for this one has to declare the parameter as a discrete rather than flatParam - at least thats the behaviour I am aware of

@usarica
Copy link
Contributor

usarica commented Feb 17, 2021

Hi @nucleosynthesis, all
Sorry for the confusion, my question was actually only based on curiosity and to learn a bit more about how the changes in the physics model interact. We could also iterate with the authors privately on this question, no need to keep the discussion here. I am not sure I understand the purpose of the change about RooCategory, but probably I don't need to.

Copy link
Collaborator

@ajgilbert ajgilbert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From my side, I think two points to address before merging:

  • Unclear why the ModelTools.py and PhysicsModel.py changes are needed
  • Issue with RooCategory/flatParam raised by @adewit should be addressed. I also don't understand why this is needed.

@@ -106,6 +106,9 @@ def __init__(self,datacard,options):
def setPhysics(self,physicsModel):
self.physics = physicsModel
self.physics.setModelBuilder(self)
for b in self.DC.bins:
for p in self.DC.exp[b].keys():
self.physics.tellAboutProcess(b, p)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me why this is needed - the physics model will already have access to the DC object after L108 here, so if you need to loop through bins and processes in your physics model you can already do this before you reach getYieldScale. Or am I missing something?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajgilbert

Hello Andrew,

About the issue with the RooCategory/flatparameter.
You can find in these paths, the workspace, log and output:
/work-zfs/lhc/heshy/anomalouscouplings/scans/cards_fa3fa2fL1fL1Zg_morecategories_finalforthesis/workspace_lumi137.10_scanfa2_fa2,fL1Zg,fa3,fa1,fL1.root
/work-zfs/lhc/heshy/anomalouscouplings/scans/cards_fa3fa2fL1fL1Zg_morecategories_finalforthesis/log_MultiDimFit_obs_lumi137.10_scanfa2_fa2,fL1Zg,fa3,fa1,fL1.txt
/work-zfs/lhc/heshy/anomalouscouplings/scans/cards_fa3fa2fL1fL1Zg_morecategories_finalforthesis/higgsCombine_obs_lumi137.10_scanfa2_fa2,fL1Zg,fa3,fa1,fL1.MultiDimFit.mH125.root
(the RooCategory in question is sgnCMS_zz4l_fai3, which is the sign of fL1).

About the changes the Physics Model
Basically we need to determine, before we call getYieldScale, whether or not there are anomalous Hff couplings. I accomplished that using this function:
https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/pull/644/files#diff-cb18764af2f0bc5de53451336a743a02148824b812b911c4ba119cc49a994b7eR628-R631
Andrew I can do the modification you suggest here and make a new commit, if you think this is cleaner.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello Andrew,

The paths above for the files need to be updated. Let me do this quickly.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I don't really understand what I'm supposed to checking in this link. To confirm:

  • You will add a new commit to revert the changes in PhysicsModel.py and ModelTools.py?
  • RooCategory/flatParam: as Nick wrote above

Now, turning to the behaviour of combine with the RooCategory, what you say about the behaviour of combine is news to me. it's true that combine will do separate fits and then find the one that gives the lowest NLL but for this one has to declare the parameter as a discrete rather than flatParam - at least thats the behaviour I am aware of

I second this, it is not clear to me that flatParam is needed for floating a discrete parameter. Can you explain more clearly why you think it is, ideally with links to the code?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In reference to the RooCategory/flatParam thing, there is a way that something not declared in the list of discretes can be picked up, by including the runtime definition ADD_DISCRETE_FALLBACK. Then the current version of the code will pick up anything with pdfindex in the name : https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/102x/src/Combine.cc#L1226-L1240

Now the intention was never to then allow the behaviour to pick up any category, but really for debugging certain workspaces (in particular for the diphoton analyses). I think the best would be to just declare sgnCMS_zz4l_fai3 (which is clearly picked up as a discrete parameter in the output log from @skyriacoCMS) as a discrete in the datacard where its needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the intention was never to then allow the behaviour to pick up any category, but really for debugging certain workspaces (in particular for the diphoton analyses). I think the best would be to just declare sgnCMS_zz4l_fai3 (which is clearly picked up as a discrete parameter in the output log from @skyriacoCMS) as a discrete in the datacard where its needed.

That would be not so desirable for us. The point is that you can use the same datacard to run for multiple configurations (in particular the order of defining the anomalous couplings). All couplings get floated, except that the last one is defined as +/-(1 - sum of the others), where only the sign gets floated. That's decided at the time of text2workspace. Of course you can write it in the datacard and make a separate datacard for every configuration, but it's ugly.

Copy link
Contributor

@nucleosynthesis nucleosynthesis Mar 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then why can you not use multiple "sign" categories in the datacard and freeze/float the one that are needed for a specific configuration? Then you only need one workspace.

Or an alternative, to avoid writing lines in the datacard, would be to declare the RooCategory for the sign in the model and then add it to the list of discreteParams in the workspace (since that is what is checked in the end when combine is run). This is where they are created when the card is read :

discparams = ROOT.RooArgSet("discreteParams")

, and here is the lines that will pick them up :
RooArgSet *discreteParameters = (RooArgSet*) w->genobj("discreteParams");
CascadeMinimizerGlobalConfigs::O().pdfCategories = RooArgList();
CascadeMinimizerGlobalConfigs::O().allRooMultiPdfParams = RooArgList();
if (discreteParameters != 0) {
TIterator *dp = discreteParameters->createIterator();
while (RooAbsArg *arg = (RooAbsArg*)dp->Next()) {
RooCategory *cat = dynamic_cast<RooCategory*>(arg);

It should be easy enough in the model to do something similar to the first code block such that the second block will spot them as though they were in the card.

Either of these options would avoid changing the behaviour of combine and would be more desirable

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ajgilbert

Hello Andrew, To answer your request on the python files.
I will make a commit that will change the place of the modifications in the ModelTools.py and add them in the new file SpinZeroStructureAC.py
I will see what we can do about PhysicsModel.py.

Thanks for the feedback.

@nsmith- nsmith- mentioned this pull request Apr 5, 2022
Copy link
Contributor

@usarica usarica left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While not formally a reviewer, I think it is wise if I make a comment based on the mention from Nick Smith in case this PR gets merged. Changes to SpinZeroStructure.py should be migrated to the new SpinZeroStructureAC.py, and SpinZeroStructure.py should be left unchanged. There are substantial changes to SpinZeroStructure.py, which are not discussed with other users of this model.

@nsmith-
Copy link
Collaborator

nsmith- commented Apr 6, 2022

Based on the review comments left so far, I understand that the only python-related changes that will remain after revision will be the new file SpinZeroStructureAC.py. @skyriaco do you agree?
If so, we leave this open and not mark it as a prerequisite for python3 migration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants