-
Notifications
You must be signed in to change notification settings - Fork 876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
When using btls always consider the self component #1482
Conversation
enum { | ||
/** Always consider this component for selection. For this flag to | ||
* work properly the component must always be built statically. */ | ||
MCA_BASE_COMPONENT_FLAG_ALLWAYS_CONSIDER = 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ALWAYS
has one L
, not 2. 🙊
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hah, misspelled it once and emacs was happy to complete it. fixing.
This commit adds a new flag to the mca_base_component_t structure that specifies component flags. The only flag currently support is MCA_BASE_COMPONENT_FLAG_ALLWAYS_CONSIDER which forces the selection logic in the MCA base to always consider a component whether or not it was specified. In order to use this support a component must be built statically and not dynamically. For example, if btl:self is built statically and sets this flag in its component then -mca btl vader,openib would provide vader, openib, and self. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit makes it so the btl/self component will always be considered for selection in the btl framework. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Just to clarify, with this commit the user need only: What happens if the user does: From my reading of the PR it look like this will be silently ignored by the MCA system. Correct? |
@jjhursey That is correct and I agree we should probably print an error is they try to exclude a component that can not be excluded. Will update the PR shortly. |
This commit adds a show_help() message when a user explicitly attempts to disable a component that can not be disabled. After printing the show_help message the component is enabled and execution continues. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Ok, the show help is now in there as a warning:
|
As no other BTL takes care of local communications, how it is possible that the initialization stage was successfully completed ? |
Initialization completed because the self btl is forced. That is the idea behind the PR. Allow components to be forced on. |
[framework-param:exclude-always-component] | ||
A request was made to exclude a component from consideration that | ||
must always be considered. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make this super clear for the user (per George's comment) that this component will be considered anyway.
Maybe change to:
Warning: A request was made to exclude a component from consideration that
must always be considered. This component (noted below) will -not- be excluded
from consideration. The program will continue running with this component being
made available for selection.
Framework: %s
Component: %s
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My $0.02: if someone tries to disable it and it can't be disabled, we should abort (in line with the philosophy of "a human asked for X, and OMPI can't provide X, so OMPI needs to abort and let a human figure it out").
I understand that having to add self might became an annoyance. However, it is unexpected, that despite the user explicit request not to select self, we forced it's inclusion. We are limiting the flexibility of our MCA components. With this patch, even if someone implements a different self component, she will not be able to test/run it on an installed Open MPI. If instead of forcing self indiscriminately, we strictly abide by the meaning of the inclusion/exclusion parameters, we do not prevent such developments. Would it be enough to add "self" to the mca_btl_base_include variable to obtain a similar effect ? |
@bosilca I agree that it might limit the usefulness of the MCA system for developers. The idea is to make it so users do not have to worry about including a component that should always be there. I can add yet another MCA parameter to disable this behavior so developers who know what they are doing can disable forced selection. |
@hjelmn Having a parameter to override the forced loading seems a bit odd to me in terms of semantics. It also causes a problem for my eventual usecase - where I want to force a component to be there always with no means for the user to exclude it. However, I could propose to add that back end when the time comes. |
@jjhursey Good point. Also, right now there is a way to override self if a replacement is being tested. A developer can simply set btl_self_exclusivity to 0. |
I just noticed this is breaking the backward compatibility with the other version by decreasing the length of the component name. Not sure the expected benefit outweights the drawback. So let me reask my question: Would it be enough to add "self" to the mca_btl_base_include variable (during the MCA parameter registration) to obtain a similar effect (improving our user friendliness), but without breaking backward compatibility nor limiting the scope of the MCA ? |
The component name has not been shortened. I took 4 bytes out of the reserved to handle the flags so this is a backwards compatible change as reserved is always 0. As for adding self to the mca_btl_base_include parameter, there is currently no such parameter. The btl parameter is the only selection parameter and if we set it before registration then it can get overridden by the environment, a file, or the command line. That is what we have now. We can add it after registration which will have more or less the same effect as this PR but moves the responsibility down to the base. I have no objections to handling this functionality in that way but I think this PR provides a cleaner solution. If a component is always required by not also require it to be static. |
Also, if we use the selection parameter it will need to be added in both the framework register an open functions because of MPI_T. |
@hjelmn Does this summarize the current status of this PR:
|
Test passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a Signed-off-by line to this PR's commit.
This PR also contains the necessary infrastructure to support this new feature. This PR reflects the reality that btl/self is always needed when using btls for communication.