Question about mca parameter passing #1731
Comments
Don't look at me - I haven't looked at the ompi schizo component in a very long time. Command line parsing starts with your |
It looks like it's because the option becomes |
So it turns out that --mca mca_base_param_files is not ignored, it was just that pmixmca prefixes were all ignored. The relevant code is all in place |
Actually, let me ask a clarifying question: should the parameter be prefixed with pmixmca, and should it be read by the ompi schizo component? (This prefixing seems to happen somewhere before parse_cli is called.)
You have a problem. PRRTE and PMIx share a common MCA "base", and so any generic MCA param on the cmd line gets interpreted as belonging to PMIx. Those get converted at the very beginning of Problem you have is that this means all generic MCA params that begin with |
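To make the conversion described above concrete, here is a minimal sketch of the observable behavior. It is a model, not the PRRTE code: the function name `convert_generic_mca` is hypothetical, and the `mca_` match prefix is an assumption based on the `mca_garbage_param` report elsewhere in this thread.

```python
# Hypothetical model of the command-line rewrite discussed in this thread.
# Assumption: a generic "--mca <name> <value>" triple whose name starts with
# "mca_" is re-tagged as "--pmixmca"; everything else passes through as-is.
GENERIC_PREFIX = "mca_"


def convert_generic_mca(argv, prefix=GENERIC_PREFIX):
    """Return a copy of argv with generic MCA base params re-tagged."""
    out = []
    i = 0
    while i < len(argv):
        # Only rewrite a complete "--mca name value" triple.
        if argv[i] == "--mca" and i + 2 < len(argv):
            name, value = argv[i + 1], argv[i + 2]
            flag = "--pmixmca" if name.startswith(prefix) else "--mca"
            out.extend([flag, name, value])
            i += 3
        else:
            out.append(argv[i])
            i += 1
    return out


if __name__ == "__main__":
    cmd = ["mpirun", "--mca", "mca_base_param_files", "my.conf",
           "--mca", "btl", "tcp", "./a.out"]
    print(convert_generic_mca(cmd))
```

Under this model, a component that only looks at `--mca` entries never sees the re-tagged parameter, which would match the "dropped" symptom reported in the original post.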
Tentative proposal: we reject all generic Alternative option: apply that only to params that start with |
@jsquyres @bwbarrett Any thoughts? You two generally care the most about such things. |
In the absence of any feedback, I'm going to implement this option. Please note that this applies to more than just the |
Yes, I've been trying to get @jsquyres or @bwbarrett to chime in, but they've been busy. I've seen pushback from George and Edgar about the change, as it would break lots of user scripts. George proposed that everyone just check every argument, and there was some debate about why PMIx re-prefixes an --mca mca_ parameter as --pmixmca; we could leave it generic and have each component check whether it can handle it. I didn't want to make a decision either way without more consensus from our community, but I have not forgotten about this issue.
I'm afraid that isn't possible. We only change it for the MCA base params, and you cannot have those generally apply as the various layers can't discern which layer is the intended target. It winds up generating a great deal of confusion. I'm probably not available this coming Tues, but I could participate on the OMPI call the week after if it would help. |
Sorry for the huge delay in replying to this; my bandwidth has been sucked up elsewhere. 😦 Hey @rhc54 Could we just add prrte/src/mca/schizo/ompi/schizo_ompi.c Lines 1448 to 1481 in aa498b9
Having OMPI add However, I notice that by the time OMPI schizo is invoked, any prrte/src/mca/schizo/ompi/schizo_ompi.c Line 1617 in aa498b9
The problem is that PMIx has a duplicate of every MCA base parameter, so you cannot disambiguate them. Root problem is that the MCA base doesn't prefix its params by project, and we now have multiple projects that contain an MCA base. We do look for frameworks and convert them - no ambiguity there as we already ensure that the params are properly prefixed with the framework/component name. It is just the MCA base that is the problem, and only when the user doesn't prefix the One possible solution is to say that non-prefixed Probably no ideal solution 🤷♂️ |
Depends on what perspective you are looking from. From a user's perspective, what is not ideal is being forced to learn the internal naming scheme of the software I am using. I am passing an argument to OMPI, and everything behind it should obey it. If PRRTE wants to have their own parameter naming other than |
So this problem has two parts:
Another issue, I suppose, is that we lost the "source" tag for a param - i.e., all our params get converted to envars, and so the MCA system really doesn't know how the envar got set. I'm not sure there is a simple way to resolve that one, but it might merit some thought as it can be useful to know how a param got set. |
There is a fourth problem: we don't currently intercept and translate envar params. For example, if someone sets |
We chatted on the OMPI webex about this today and came up with plans for this. I posted the info on the wiki: https://github.com/open-mpi/ompi/wiki/WeeklyTelcon_20230425#pmix-mca-parameter-issues @rhc54 Let me know if my text accurately reflects what we discussed. |
Looks accurate, but incomplete - you missed the fourth problem, that we don't currently translate the environment variables. However, the fix for the param file problem will solve the envar one as well, so the text is probably fine as-is. |
I've been reviewing the code, in order to address this issue. I see that the fix that @rhc54 merged into PMIx is exposing parameters that start with "mca_base_" and not parameters that start with just "mca_". I believe that the intention was to expose the latter (which would include the former). @jsquyres - Is that a correct reading of your notes on the wiki? |
Looking into the tuned file aspect of it now |
Sigh - I feel like this has become the never ending treadmill of explanations. Let's try one more time. The MCA component parameters are already dealt with - there is no ambiguity there as we know what framework they belong to, and which project includes that framework. So we can trivially expose those and already did so. The same is not true for the "mca_base_" params as they all refer to the MCA base, and we have multiple projects with their own MCA bases. So we needed a way of resolving them, which is what I provided in the referenced fix. The tuned file support already had that implementation (minus the "mca_base_" problem), so it didn't need changing. |
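The distinction drawn above can be illustrated with a short sketch. The framework table is a small illustrative subset chosen for the example (an assumption, not a complete or authoritative list), and `owning_projects` is a hypothetical helper, not a function in OMPI, PRRTE, or PMIx.

```python
# Illustrative subset of framework names per project (assumption for the
# example; the real projects contain many more frameworks).
FRAMEWORKS = {
    "ompi": {"btl", "pml", "coll"},
    "prte": {"plm", "ras", "rmaps", "schizo"},
    "pmix": {"gds", "psec", "ptl"},
}


def owning_projects(param):
    """Return the projects that could own an MCA parameter name."""
    if param.startswith("mca_base_"):
        # Every project carries its own MCA base, so the name alone
        # cannot identify the intended target.
        return sorted(FRAMEWORKS)
    # Component params lead with the framework name, which maps to
    # exactly one project in this table.
    framework = param.split("_", 1)[0]
    return sorted(p for p, fws in FRAMEWORKS.items() if framework in fws)


if __name__ == "__main__":
    for name in ("btl_tcp_if_include", "plm_base_verbose", "mca_base_param_files"):
        print(name, "->", owning_projects(name))
```

A framework-prefixed name resolves to a single project, while an `mca_base_` name matches all three, which is the ambiguity the referenced fix had to resolve.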
So, just the tuned files need updating? |
Or, is this completely finished now, and I can close the issue? |
This is the current behavior, for command line parameters:
The first MCA parameter ("mca_garbage_param") is getting the "pmixmca" component prefix, but the others are not. Is this correct behavior? |
Also, is there a document that describes the overall (intended) behavior? |
What @jsquyres wrote was an accurate description of what they wanted to do - much of it was already implemented. The referenced PR addressed the remaining pieces. The only piece not yet addressed is what to do with envars set by the user prior to invoking an executable. |
@qkoziol What the blazes are you talking about? Are you quoting the current behavior on OMPI v5??? If so, try updating the submodule pointers before generating outdated reports. |
That output is from a fresh checkout. |
OF WHAT BRANCH? Truly, I'm going to close this issue as it has passed beyond being useful and is just an irritation. |
Using 20ee752 for PRRTE.
I've been messing around with the ompi schizo component in order to add some warnings/aborts around the old --mca mca_base_param_files parameters. However, it looks like the argument is dropped before we get to the MCA parameter parsing. See my garbage parameters:
It looks like parameters prefixed with "mca" are being dropped. Do you have any idea what's happening here or if this is intended behavior?