Issue 175 chapter 2 #235

Closed
wants to merge 5 commits into from

dsolt
Contributor

@dsolt dsolt commented Jan 23, 2020

Accompanying slides to be added shortly

@jjhursey
Member

We should define:

  • peer as "a process within the same namespace".
  • WLM as the scheduler (see notes from the discussion from chapter 1)

Should we define more crisply here what the following mean:

  • tool
    • Note from chapter 1 "tools do not have peers"
  • client
  • server

@jjhursey jjhursey added this to the PMIx v5 Standard milestone Feb 11, 2020
@jjhursey
Member

See note here on the universe definition.

The "universe" was meant to be the collection of sessions under the control of a common scheduler or SMS - usually that will mean "everything on the cluster", but could be broader when you extend to the grid environment. IIRC, there was a sentence or two somewhere that tried to capture that definition (perhaps in one of the papers, if not in this doc?).

@dsolt dsolt force-pushed the issue_175_chapter_2 branch 2 times, most recently from 6510014 to 8b56a2a Compare March 19, 2020 16:30
@dsolt
Contributor Author

dsolt commented Mar 19, 2020

We would like feedback on whether the changes outlined in this PR can be presented for a first reading at the next quarterly meeting, or whether you think a preliminary discussion is necessary first because you foresee possible issues with these changes. If you are OK with us moving this forward to a first reading, please react with a thumbs-up emoji. If you prefer that the working group present this for discussion first, please react with a thumbs-down emoji.

Please use emoji reactions ON THIS COMMENT to indicate your position on this proposal.

  • You do not need to vote on every proposal
  • If you have no opinion, don't vote - that is also useful data
  • If you've already commented on this issue, please still vote so

@jjhursey
Member

I think the red/green cleanup may need some more work. The macros for the PRI (red) vs PRRTE (green) are still in place:

  • pmix-standard/pmix.sty

    Lines 392 to 426 in 8b56a2a

    \newcommand{\refPRIAttributeItem}[1]{\index{#1} \hyperref[attr:#1]{\color{red}\code{#1}} }
    \newcommand{\pastePRIAttributeItemBegin}[1]{
    \refPRIAttributeItem{#1} ~~\StdPaste{str:#1}~~(\StdPaste{attr:#1})
    \vspace{-1.3ex}
    \expandafter
    \begin{adjustwidth}{.95cm}{}
    \StdPaste{#1}
    }
    \newcommand{\pastePRIAttributeItemEnd}{
    \end{adjustwidth}
    }
    \newcommand{\pastePRIAttributeItem}[1]{
    \pastePRIAttributeItemBegin{#1}
    \pastePRIAttributeItemEnd{}
    }
    \newcommand{\refPRRTEAttributeItem}[1]{\index{#1} \hyperref[attr:#1]{\color{green!60!black}\code{#1}} }
    \newcommand{\pastePRRTEAttributeItemBegin}[1]{
    \refPRRTEAttributeItem{#1} ~~\StdPaste{str:#1}~~(\StdPaste{attr:#1})
    \vspace{-1.3ex}
    \expandafter
    \begin{adjustwidth}{.95cm}{}
    \StdPaste{#1}
    }
    \newcommand{\pastePRRTEAttributeItemEnd}{
    \end{adjustwidth}
    }
    \newcommand{\pastePRRTEAttributeItem}[1]{
    \pastePRRTEAttributeItemBegin{#1}
    \pastePRRTEAttributeItemEnd{}
    }

I would replace the old macros with aliases to the generic form. We can go through and clean out the old versions as a separate cleanup effort.

- \newcommand{\refPRIAttributeItem}[1]{\index{#1} \hyperref[attr:#1]{\color{red}\code{#1}} }
+ \newcommand{\refPRIAttributeItem}[1]{ \refAttributeItem{#1} }
- \newcommand{\pastePRIAttributeItemBegin}[1]{
-   \refPRIAttributeItem{#1} ~~\StdPaste{str:#1}~~(\StdPaste{attr:#1})
-   \vspace{-1.3ex}
-    \expandafter
-    \begin{adjustwidth}{.95cm}{}
-     \StdPaste{#1}
- }
- \newcommand{\pastePRIAttributeItemEnd}{
-    \end{adjustwidth}
- }
+ \newcommand{\pastePRIAttributeItemBegin}[1]{ \pasteAttributeItemBegin{#1} }
+ \newcommand{\pastePRIAttributeItemEnd}{ \pasteAttributeItemEnd{} }

@jjhursey
Member

jjhursey commented Apr 7, 2020

The changes look good now.
It looks like this PR needs two things:

  • Newer commits to be signed-off
  • Rebased onto master to resolve a conflict.
    • Alternatively, you could leave the declareNewAttribute and declareDepAttribute cleanup for a separate PR by adding them back in. I'm fine with taking them out in this PR if you want.

@jjhursey
Member

Can you post a PDF of the latest version of the text to post with the agenda?

@jjhursey
Member

PMIx ASC 2Q 2020 Meeting:

@jjhursey jjhursey added the Eligible Eligible for consideration by ASC label Apr 15, 2020
@dsolt
Contributor Author

dsolt commented May 8, 2020

We would like feedback on whether the changes in the last commit are sufficiently trivial to avoid moving this ticket back to a first reading. If you are OK with us moving this forward to a second reading, please react with a thumbs-up emoji. If you prefer that the working group present this again as a first reading because you feel the changes are significant, please react with a thumbs-down emoji.

Please use emoji reactions ON THIS COMMENT to indicate your position on this proposal.

  • You do not need to vote on every proposal
  • If you have no opinion, don't vote - that is also useful data
  • If you've already commented on this issue, please still vote so

@rhc54
Member

rhc54 commented May 8, 2020

@dsolt I fixed a merge conflict for you - just two conflicting definitions that needed resolution. Feel free to either squash or delete the commit.

@dsolt dsolt force-pushed the issue_175_chapter_2 branch from 8870395 to 0085def Compare June 9, 2020 15:09
Pophale and others added 5 commits June 9, 2020 10:13
…xt to reflect

recent changes in standardization process (ASC, etc).
Signed-off-by: dsolt@us.ibm.com
Signed-off-by: dsolt@us.ibm.com
Signed-off-by: dsolt@us.ibm.com
Signed-off-by: dsolt@us.ibm.com
@dsolt dsolt force-pushed the issue_175_chapter_2 branch from 0085def to a1370b8 Compare June 9, 2020 15:15
@rhc54
Member

rhc54 commented Jun 23, 2020

Given the status of this PR and that we are going to just continue to create conflicts as we prep for release of v4, I would like to commit this into the master branch. Can we discuss and perhaps get agreement to do so at the July meeting given that there doesn't appear to be any opposition?

@schulzm schulzm left a comment

I personally (and based on experience having just gone through this with MPI) find some of the definitions very vague and in some cases ambiguous. Also, some definitions clearly limit the implementation choices and restrict the use to "currently known" systems.

\item \declareterm{session}\emph{session} refers to an allocated set of resources assigned to a particular user by the system \ac{WLM}. Historically, \ac{HPC} sessions have consisted of a static allocation of resources - i.e., a block of resources are assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to the current block of assigned resources and is a potentially dynamic entity.
\item \declareterm{slot}\declareterm{slots}\emph{slot} refers to an allocated entry for a process. \acp{WLM} frequently allocate entire nodes to a \emph{session}, but can also be configured to define the maximum number of processes that can simultaneously be executed on each node. This often corresponds to the number of hardware \acp{PU} (typically cores, but can also be defined as hardware threads) on the node. However, the correlation between hardware \acp{PU} and slot allocations strictly depends upon system configuration.
\item \declareterm{job}\emph{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session. For example, ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' is considered a single \ac{MPMD} job containing two applications.
\item \declareterm{session}\emph{session} refers to an allocated set of resources assigned to a particular user by the system \ac{WLM}. Historically, \ac{HPC} \emph{sessions} have consisted of a static allocation of resources - i.e., a block of resources are assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to the current block of assigned resources and is a potentially dynamic entity.
Copy link

assigned to a particular user by the system -> this seems to indicate that if a user submits multiple allocation requests independently, they are part of the same session, which is IMHO very odd.

Member

It's odd if we look at the allocations as independent entities, in which case the RM could create a unique session for each WLM allocation. However, for a dynamic environment where the user requests that their allocation grow or shrink (possibly with the WLM creating an allocation for the new addition, but making it dependent on the requesting allocation), we needed wording to capture that.

We're open to suggestions on how to better capture that.

Collaborator

I think what Martin's getting at is not the dynamic increase/decrease of resources in a set, but completely independent allocations that just happen to be concurrent.

If I am right about what Martin means, perhaps rewording slightly: 'session refers to a set of resources allocated as a logical unit and assigned to a particular user by the system WLM, sometimes referred to as a "job" or "allocation".'

Copy link

What is the difference from a job then? Should we just drop "Session"? What is the difference between a session, a job, and an application? What is the intention if a user submits two allocations - are those one session or two? I am not suggesting a change; I am just trying to understand the intention behind these terms.

Collaborator

Oh, oops. I shouldn't have suggested the word "job" as a synonym since in the rest of the document a "job" refers to the execution of applications as a single unit (aka a "job step" in SLURM). I was thinking colloquially and not how we have things defined in the document.

I do see the word "allocation" seemingly used interchangeably with session in the document. Sometimes it is used as part of the definition of a SESSION attribute, e.g. 3.2.11 ("Data Persistence Structure").

Possibly confusingly, in 3.2.20 and 3.4.27 (I think that is right, the numbering is messed up in my build) the text talks about allocations with respect to jobs. In these cases (and there may be others in the doc), I think "jobs" should be changed to "sessions", since in our definition a "session" refers to allocated resources while a "job" does not. I don't suggest that as part of this PR, but if we do mean "sessions" where it says "jobs" in these cases, we should fix that in a new PR.

Contributor Author

application: A collection of processes that share the same expression of executable, environment and other associated parameters within a job.

Contributor Author

Todo: Let's do a review of use in the standard and make sure this definition is compatible with current usage

Contributor Author

I will post a PR soon with this text, though I do find it unfortunate that the term application appears to be used in multiple ways in the document. At times it is used very generically to refer to the launching of possibly multiple job steps that coordinate to solve a problem, and at other times it is used in this very narrow context to refer to the expression of an executable.

Member

So why not just broaden the definition to be: generically to refer to the launching of possibly multiple job steps that coordinate to solve a problem, or refer to the expression of an executable? The context makes it clear which definition is being referenced. No need to overthink this 😄

\item \declareterm{slot}\declareterm{slots}\emph{slot} refers to an allocated set of resources assigned to a process. \acp{WLM} frequently allocate entire nodes to a \emph{session}, but can also be configured to define the maximum number of processes that can simultaneously be executed on each node. This often corresponds to the number of hardware \acp{PU} (typically cores, but can also be defined as hardware threads) on the node. However, the correlation between hardware \acp{PU} and slot allocations strictly depends upon system configuration.
\item \declareterm{application}\emph{application} refers to a single executable (binary, script, etc.) member of a \emph{job}. Applications consist of one or more \emph{processes}, either operating independently or cooperatively at any given time during their execution.
Copy link

Processes -> do you mean OS processes here? Seems implementation specific and too narrowing to me. What if someone starts designing a system that works only on a thread basis (e.g., as has happened with MPI and MPC)? I don't think we want to prevent that.

Collaborator

@schulzm I agree that this section would benefit from a definition of what we mean by "process". I also agree (I think this is in a different comment somewhere below) that keeping the definition abstract (and not tied to an OS process) will keep the door open for different kinds of implementations in the future (e.g. threaded)

Member

We use the term process extensively in the standard, so if we want to broaden its definition to mean "execution unit" to allow for a thread interpretation, then we would definitely need to define it here. However, that would be a new perspective on the interface and might reach much further than just Chapter 2.

My suggestion would be to take the (maybe implied) definition of a unix process in this PR, then follow up with another that expands that definition so that it can be debated and refined on its own.

Collaborator

From the discussion in the ASC meeting, I agree with @jjhursey that broadening the scope beyond OS processes would be a much more extensive change to the document (and thus a separate issue if needed). I am not sure if the meaning of OS processes is 100% clear from this writing. However, I suppose without explicitly saying that 'processes' means something different than OS processes, one should assume OS processes. If we did clarify it, I am not even sure what terminology would be most appropriate.

Copy link

Is it really an OS process, though? This is really intended as a question: since an application is a set of processes, and a job is a set of applications, what does this mean for heterogeneous systems? How do you represent parts of the applications that run on accelerators (both within nodes and on self-hosted boosters)? Also, what does this mean for MPC applications? Does the terminology change for me, as a user, when I want to run a 20 MPI process MPC code vs. a 20 MPI process MPICH code?

Contributor

Created an issue to track this: #262

Member

(my) Notes from Aug. 24, 2020 meeting on this thread:

  • v4 draft document added the following items:

process refers to an operating system process, also commonly referred to as a heavyweight process. A process is often comprised of multiple lightweight threads, commonly known as simply threads.

node refers to a single operating system instance. Note that this may encompass one or more physical objects.

  • For a BlueGene system, there is no process separate from the OS. We should make sure the definition encompasses that type of environment.
  • Suppose you PMIx_Spawn a program, that program then calls fork(), and both the parent and the child process call PMIx_Init. Is this a supported scenario? (The current implementation would not allow this; the question is whether we should.) If so, then from the server's perspective it created one process, but now two processes are connecting back. Should it give them two different ranks? This gets tricky. Maybe something to think about for the future.

Member

Just clarifying: the current implementation does indeed support the case where a parent forks a child and both call PMIx_Init.

Note that the PMIx server does not assign ranks - that is left to the host environment. The PMIx server is required to detect that multiple clients have connected using the same rank and track them. OpenPMIx does this today, correctly declaring that the "client has disconnected" once all "clones" of that rank have done so. I can update the client and server chapters to make that clear.
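
The clone tracking described above can be pictured as a reference count per (namespace, rank) identity. The following is a minimal sketch of that idea only, not OpenPMIx code; the class and method names (`CloneTracker`, `connect`, `disconnect`) are hypothetical.

```python
from collections import Counter


class CloneTracker:
    """Toy model of a server tracking clients that share one (namespace, rank).

    Hypothetical names; this is not the OpenPMIx implementation.
    """

    def __init__(self):
        # Number of live connections per (namespace, rank) identity.
        self._live = Counter()

    def connect(self, nspace, rank):
        """Record one more client connecting under this identity."""
        self._live[(nspace, rank)] += 1

    def disconnect(self, nspace, rank):
        """Drop one connection; return True only when the last clone is gone."""
        key = (nspace, rank)
        if self._live[key] == 0:
            raise ValueError(f"no live connection for {key}")
        self._live[key] -= 1
        if self._live[key] == 0:
            del self._live[key]
            return True  # the "client has disconnected" event would fire here
        return False


tracker = CloneTracker()
tracker.connect("job1", 0)  # parent calls PMIx_Init
tracker.connect("job1", 0)  # forked child calls PMIx_Init under the same rank
first = tracker.disconnect("job1", 0)
last = tracker.disconnect("job1", 0)
print(first, last)
```

In this model only the second `disconnect` reports that the client is gone, which matches the "once all clones of that rank have done so" behavior described in the comment.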

\item \declareterm{application}\emph{application} refers to a single executable (binary, script, etc.) member of a \emph{job}. Applications consist of one or more \emph{processes}, either operating independently or cooperatively at any given time during their execution.
Copy link

What is a "single executable"? What if you have multiple identical binaries copied across nodes? What about if you have a heterogeneous system where you need different versions for different ISAs from the same source? How does the use of accelerators, who have separate binaries for the different HW pieces, affect this statement? I think this is too imprecise.

Collaborator

@schulzm I agree here too. Also, sometimes applications have multiple binaries acting in a loosely synchronous manner (e.g. a climate model with ocean and atmosphere binaries). Perhaps we can rephrase as:
"application refers to the executable(s) that comprise a job."

Member

I was thinking about this last night when preparing my talk for the ASC today. If we consider PMIx groups then it is possible that the only aspect that distinguishes the two groups of processes is the process set identifier.

For example (a modification of the example in Ch. 13.1):

prun -n 4 --pset ocean myapp -ocean : -n 3 --pset ice myapp -ice

There are two applications in this one job. They are distinct in their arguments, but not in their binaries. If we remove the arguments and assume that myapp will read its process set to determine its mode of operation, then the distinguishing aspect is the pset.

How would you all suggest we adjust the wording here to capture the full meaning of application?
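
To make the two-application reading of the command line above concrete, here is a minimal sketch that splits it at the `:` separators and reports each application's process count, process set, and arguments. The helper `split_apps` is hypothetical and understands only the `-n` and `--pset` options used in this example (defaulting to one process when `-n` is absent, which is an assumption); it does not reflect how `prun` actually parses its command line.

```python
import shlex


def split_apps(cmdline):
    """Split an MPMD command line into per-application descriptions.

    Hypothetical helper: handles only `-n` and `--pset`, as in the example.
    """
    tokens = shlex.split(cmdline)[1:]  # drop the launcher name
    # Group tokens into one list per application, separated by ":".
    groups, current = [], []
    for tok in tokens:
        if tok == ":":
            groups.append(current)
            current = []
        else:
            current.append(tok)
    groups.append(current)

    apps = []
    for group in groups:
        app = {"nprocs": 1, "pset": None, "argv": []}
        i = 0
        while i < len(group):
            if group[i] == "-n":
                app["nprocs"] = int(group[i + 1])
                i += 2
            elif group[i] == "--pset":
                app["pset"] = group[i + 1]
                i += 2
            else:
                # First non-option token starts the executable and its arguments.
                app["argv"] = group[i:]
                break
        apps.append(app)
    return apps


apps = split_apps("prun -n 4 --pset ocean myapp -ocean : -n 3 --pset ice myapp -ice")
for app in apps:
    print(app)
```

Both entries name the same executable, so in this sketch the only things that tell the two applications apart are the process set and the arguments.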

Collaborator

Aside but possibly relevant: Is using PRRTE the only way to define a process set? How would you do it otherwise, since there is no API for defining a process set? Does this launch the two different myapp instances as separate MPI_WORLDs?

Scanning through the document, it seems that what is meant by application is: "executables that run concurrently as part of a logical unit to compute a task."

I think we can get away with saying "executables" and not specifying if they are the same or different, since they could be either and still be part of the same application.

Copy link

Why do we need to define this in the first place? Once an RM allocates the resources, it is up to the user to decide what to do: running a set of scripts that subdivide the allocation or running one application binary looks the same to the RM, doesn't it? For the RM, what is the difference from a job or session?

Member

@jjhursey jjhursey Aug 17, 2020

Per working group meeting Aug. 17, 2020

  • The term application is easily confused in HPC as the whole workflow (application = workflow) or a job (application = Weather code like WRF) or a grouping within the job (application = ocean subset of the job).
  • In PMIx we mean the latter (a subset of the job). Don't change the term, but let's clarify the definition.
  • Try this definition:

A collection of processes that share the same expression of executable, environment, and other associated parameters within a job.

  • Hierarchy:
    • Process is smallest unit managed by PMIx
    • Application is a collection of processes in a job
    • Job is a collection of applications
    • Session is a collection of jobs.
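
The containment hierarchy in these notes can be sketched with a few plain data classes. This is only an illustration of the nesting; the class and field names are hypothetical and are not PMIx data structures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Application:
    """Processes sharing one executable/environment/parameter expression."""
    executable: str
    nprocs: int


@dataclass
class Job:
    """A collection of applications launched as a single invocation."""
    applications: List[Application] = field(default_factory=list)

    @property
    def nprocs(self):
        return sum(app.nprocs for app in self.applications)


@dataclass
class Session:
    """A collection of jobs running within one allocation."""
    jobs: List[Job] = field(default_factory=list)

    @property
    def nprocs(self):
        return sum(job.nprocs for job in self.jobs)


# "mpiexec -n 1 app1 : -n 2 app2" is one job containing two applications.
job = Job([Application("app1", 1), Application("app2", 2)])
session = Session([job])
print(len(job.applications), job.nprocs, session.nprocs)
```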

Member

Just FWIW: I'm not a big fan of reinventing the wheel, so I simply looked up the definition of "application" in a few dictionaries:

  • Websters: a sequence of coded instructions (such as a word processor or a spreadsheet) that can be inserted into a mechanism to perform a particular task or set of tasks
  • Webopedia: An organized list of instructions that, when executed, causes the computer to behave in a predetermined manner.
  • Dictionary.com: a precise sequence of instructions enabling a computer to perform a task;
  • Oxford Dictionary of Computer Science: “A particular role or task to which a computer system can be applied, or, more usually, the software used for such a purpose”. “Any program that is specific to a particular role and makes a direct contribution to performing that role. For example, where a computer handles a company’s finances a payroll program would be an applications program. By contrast, an *operating system or a *software tool may both be essential to the effective use of the computer system, but neither makes a direct contribution to meeting the end-user’s eventual needs.”

Process is smallest unit managed by PMIx

Please note that PMIx does not manage anything. Other than that, your hierarchy looks correct to me.

Contributor Author

So, whether PMIx manages anything is something we were pondering in our last meeting. The Job Control related functions do some things that look a little bit like managing, and I wonder if we need to be clear about how the "things" controlled by Job Control map to the PMIx clients (or the "things" that were created by PMIx, but have not called PMIx_Init). You can pause, resume, kill, signal, or terminate based on a namespace/rank specification. What does that namespace/rank designate? Is it only the client that called PMIx_Init, or is it the specific thing that PMIx started? I think some of this will have to be system/implementation specific, but we may want to say something about the distinction.

\item \declareterm{application}\emph{application} refers to a single executable (binary, script, etc.) member of a \emph{job}. Applications consist of one or more \emph{processes}, either operating independently or cooperatively at any given time during their execution.
Copy link

The statements "applications refer to binaries" and "applications consist of ... processes" don't make sense to me together: the first is static, the second dynamic. Does this refer to the actual code or the execution of such code?

Member

It's the grouping of processes in the MPMD job. The binary name could be the distinguishing factor. I see your point about mixing binary and process here - it could be clearer.

Copy link

Binary name is yet another term that has not been introduced, and it could lead to further complications. Also, some binaries may be created or modified on the fly - what does this mean for the term application? Two parts may run independently using the MPI Sessions model in their subsets (i.e., multiple applications) and later during the execution decide to create a super session (i.e., turning into one application). You could have a tool with a separate binary name, but linked against the same MPI library, that is integrated into one application, although it does something definitely different. What if dynamic libraries change (they could even have different names, worst case)? I still think the term application is ill-defined, as this can mean anything from a user's point of view.

\item \declareterm{application}\emph{application} refers to a single executable (binary, script, etc.) member of a \emph{job}. Applications consist of one or more \emph{processes}, either operating independently or in parallel at any given time during their execution.
\item \declareterm{rank}\emph{rank} refers to the numerical location (starting from zero) of a process within the defined scope. Thus, {global rank} is the rank of a process within its \emph{job}, while \emph{application rank} is the rank of that process within its \emph{application}.
\item \declareterm{workflow}\declareterm{workflows}\emph{workflow} refers to an orchestrated execution plan frequently spanning multiple \emph{jobs} carried out under the control of a \emph{workflow manager} process. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.
\item \declareterm{rank}\emph{rank} refers to the ordinal number (starting from zero) assigned to a process within a set of peer processes. For example, the set of processes could be the members of a job, a compute node, or an application execution. \emph{Global rank} is the rank of a process within its \emph{job}, while \emph{application rank} is the rank of that process within its \emph{application}.

I find the term "Global rank" a bit unfortunate, as this is not really global. There can be multiple jobs per session and multiple sessions per PMIx instance, which can then no longer be expressed. Also, this seems to cement a two level view of a process hierarchy, which seems limiting for the future.

Member

Yeah, this is not correct. There is a PMIX_GLOBAL_RANK (and an associated PMIX_NPROC_OFFSET) defined as:

Process rank spanning across all jobs in this session.

So this should be changed to something like Job rank
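
To make the distinction concrete, here is a minimal Python sketch of the three ranks under discussion. It assumes that the session-wide rank (what PMIX_GLOBAL_RANK describes) equals the job's starting offset plus the rank within the job, and that the offset is simply the number of processes in the jobs listed before it. `ProcRanks` and `assign_ranks` are hypothetical names used only for illustration; they are not part of PMIx.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcRanks:
    job: str          # name of the job the process belongs to
    app: int          # index of the application within the job
    app_rank: int     # rank within the application
    job_rank: int     # rank within the job
    global_rank: int  # rank spanning all jobs in the session


def assign_ranks(session: Dict[str, List[int]]) -> List[ProcRanks]:
    """Map each job name to a list of per-application process counts."""
    ranks = []
    offset = 0  # assumed: starting session-wide rank of the current job
    for job, app_sizes in session.items():
        job_rank = 0
        for app_index, nprocs in enumerate(app_sizes):
            for app_rank in range(nprocs):
                ranks.append(
                    ProcRanks(job, app_index, app_rank, job_rank, offset + job_rank)
                )
                job_rank += 1
        offset += job_rank  # the next job starts after all processes of this one
    return ranks


# Two jobs in one session, each shaped like "mpiexec -n 1 app1 : -n 2 app2".
example = assign_ranks({"job0": [1, 2], "job1": [1, 2]})
```

In this sketch the last process of the second job has application rank 1 and job rank 2, but its session-wide rank is 5, which is why calling the per-job value "global" is misleading.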

Process -> I am not sure it is a good idea to tie this to processes, which I assume are OS processes. This is an implementation detail (sure, common at the moment, but with new architectures, accelerators, ... not a fixed thing). One way around that would be to define the term "PMIx processes" and then explicitly state that those are the managing granularity of PMIx, but not guaranteed to map 1:1 to OS processes (similar to the trick we had to do - unfortunately after the fact - in MPI).

Member

Per my earlier comment, I think this is a good idea, but it will likely be a much broader scope change than just Ch. 2.
I suggest we work on another ticket just for broadening the definition that will chase this PR. The final result is that both get into v5, but by separating them we can more easily contain the discussion.

That is probably the best way, but it should really be done at the same time so as not to start making assumptions. We had this in MPI and it caused trouble. I was just pointing this out here to avoid the same issues, especially as this seems more severe here, as PMIx also targets (or should target) heterogeneous systems.

Contributor

I suggest we work on another ticket just for broadening the definition that will chase this PR.

Created an issue to track this: #262

A process is not a member of a node - for one the terminology doesn't match IMHO, but also a node is a piece of HW, which can have multiple OS instances, but there can also be (and there have been systems), which run a global OS across nodes and allow processes to migrate.

Member

To the best of my knowledge, that type of migration is not well articulated in the standard, though it should be, as the dynamics working group is thinking along those lines. Maybe a following PR developed with that working group could suggest some language here that is sufficiently flexible?

That's not quite what I meant: a node is a not-well-defined term for a piece of hardware, not a logical instance you schedule something to. Assuming we mean OS processes, we schedule them to OS instances, and then the OS has control of them (to some degree). There can be one OS instance spanning several nodes, and there can be several OS instances on one node. IMHO we should not use node here, since different architectures have different meanings for it (and the meaning is changing with the increased use of super nodes, e.g., DGXs).

Collaborator

"the set of processes could be the members of a job, a compute node, or an application execution" Is this what you mean Martin? The way this is written it implies "the set of processes could be members of a compute node." I think that it would be better and more clear as:

"the set of processes could be the members of a job, executing on the same compute node, or part of an application execution."

\item \declareterm{rank}\emph{rank} refers to the numerical location (starting from zero) of a process within the defined scope. Thus, {global rank} is the rank of a process within its \emph{job}, while \emph{application rank} is the rank of that process within its \emph{application}.
\item \declareterm{workflow}\declareterm{workflows}\emph{workflow} refers to an orchestrated execution plan frequently spanning multiple \emph{jobs} carried out under the control of a \emph{workflow manager} process. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.
\item \declareterm{rank}\emph{rank} refers to the ordinal number (starting from zero) assigned to a process within a set of peer processes. For example, the set of processes could be the members of a job, a compute node, or an application execution. \emph{Global rank} is the rank of a process within its \emph{job}, while \emph{application rank} is the rank of that process within its \emph{application}.
\item \declareterm{workflow}\declareterm{workflows}\emph{workflow} refers to an orchestrated execution plan frequently spanning multiple \emph{jobs} or \emph{sessions} carried out under the control of a \emph{workflow manager}. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.

spanning multiple sessions -> I don't think it is frequent that workflows are run by multiple users, is it? (as sessions refer to all resources allocated by one user)

Member

If the WLM makes a session equal to an allocation then the workflow could span multiple allocations and sessions all tied to the same user.
Maybe that needs to be clarified here.

This would then contradict the definition from above, which implies that different sessions belong to different users (see discussion there). Are session and job identical? If not, what is the difference?

Contributor Author

Which definition implies that different sessions belong to different users? Session and Job are not identical. A session refers to a set of resources assigned to a user (not necessarily all the resources a user has at one time, but simply one set of resources assigned to a user). Those resources can be used to create one or more jobs. Each job is assigned a namespace to name that job. (In Slurm terminology, I think this would map as PMIx Session = Slurm allocation, PMIx Job = job step.)
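
As a rough illustration of the containment described here, the following Python sketch models a session as a set of resources assigned to one user, inside which one or more jobs are created, each named by a namespace and made up of applications. The class names, the `start_job` helper, and the sample values are hypothetical and chosen only for illustration; nothing here is taken from the PMIx API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Application:
    executable: str  # a single executable member of a job
    nprocs: int      # number of processes started for it


@dataclass
class Job:
    namespace: str                   # the name assigned to this job
    applications: List[Application]  # one or more applications per job


@dataclass
class Session:
    user: str             # the user the resources are assigned to
    resources: List[str]  # one set of assigned resources, e.g. node names
    jobs: List[Job] = field(default_factory=list)

    def start_job(self, namespace: str, applications: List[Application]) -> Job:
        """Create a job that runs inside this session's resources."""
        job = Job(namespace, applications)
        self.jobs.append(job)
        return job


# One user holding two separate sessions (allocations) at the same time.
first = Session(user="alice", resources=["node0", "node1"])
second = Session(user="alice", resources=["node2"])

# Two jobs inside the first session; the first job is an MPMD job.
mpmd = first.start_job("ns-1", [Application("app1", 1), Application("app2", 2)])
first.start_job("ns-2", [Application("viz", 4)])
```

Under this reading, a workflow that uses both `first` and `second` spans multiple sessions while still belonging to a single user.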

Chap_Terms.tex
Users should not use the \textbf{\code{PMIX}}, \textbf{\code{PMIx}}, or \textbf{\code{pmix}} prefixes in their applications or libraries so as to avoid symbol conflicts with current and later versions of the \ac{PMIx} standard and implementations such as the \ac{PRI}.

%%%%%%%%%%%
Users shall not use the \textbf{\code{PMIX}}, \textbf{\code{PMIx}}, or \textbf{\code{pmix}} prefixes in their applications or libraries so as to avoid symbol conflicts with \ac{PMIx} implementations.

I would suggest to add an "" after the reserved name prefixes, as this allows extensions like "PMIxX", similar to what is done in MPI

Member

Good point

Member

No, it isn't a good point. We chose the prefixing rule years ago, and people have built around it. We cannot change it now. The reservation applies to anything starting with "pmix".

Member

Per discussion in ASC meeting (7/22/2020):

  • Here we are talking about symbol names, so we should add the _ after the pmix designators. Allows for PMIXY
  • We might add a line here that the "pmix" string prefix is also reserved by PMIx for keys and attributes. Note that "pmixy" is not allowed, but "pmi-xy" would be ok.

Suggested:

Users shall not have symbols prefixed with ... PMIX_ (add underscores)
Users shall not have attributes prefixed with "pmix"
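
The two suggested rules can be sketched as a pair of checks. This is a minimal Python illustration of the wording proposed in the meeting notes, not text from the standard. The function names are hypothetical, and treating the attribute-key check as case-sensitive is an assumption.

```python
# Symbol prefixes with the trailing underscore, as suggested for symbol names.
SYMBOL_PREFIXES = ("PMIX_", "PMIx_", "pmix_")


def symbol_is_reserved(name: str) -> bool:
    """True if a symbol name falls under the underscore-suffixed reservation."""
    return name.startswith(SYMBOL_PREFIXES)


def attribute_key_is_reserved(key: str) -> bool:
    """True if an attribute key starts with the reserved "pmix" string.

    Assumption: the comparison is case-sensitive.
    """
    return key.startswith("pmix")


checks = {
    "PMIX_GLOBAL_RANK": symbol_is_reserved("PMIX_GLOBAL_RANK"),
    "PMIXY_thing": symbol_is_reserved("PMIXY_thing"),
    "pmixy": attribute_key_is_reserved("pmixy"),
    "pmi-xy": attribute_key_is_reserved("pmi-xy"),
}
```

With these rules a symbol such as `PMIXY_thing` is permitted, while the attribute key `pmixy` is not and `pmi-xy` is.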

Yes, sorry that didn't come across via the github editor. I was not suggesting any name change, but rather to include the underscore in the reservation part, as I assumed that all names have a following underscore anyway.

Member

Unfortunately, working through the document again for v4, this change simply isn't possible as it will break backward compatibility. We cannot include the underscore in the reservation. If people want to extend the definitions, they will need to either prefix "PMIX" or use some completely different identifier.

@jjhursey
Member

PMIx ASC 3Q 2020 Meeting:

  • The vote was delayed to the Q4 meeting to account for the changes recommended in the two comments below:
  • The community requests that the same procedure regarding reading/vote be used for the Q4 meeting. Meaning post a comment when the changes are finished and read only the changed material unless there is need to re-read the whole change during the meeting.
  • The community expressed frustration with the extensive late comments pushed to this ticket just before the vote, given that it has been under review since April.
    • The suggestion from the community was that since the vote is delayed for the two items mentioned above the working group should sort through the suggestions and triage those that should be discussed in a separate PR, and those that should be included in this PR (and voted on in the next meeting).
    • It is at the discretion of the working group to triage and resolve the comments.
  • The sense of the community was to not let perfection get in the way of progress. To make the best effort at clarifying the current understanding of the definitions, but not excessively delay progress.
    • It should be noted that the intent of this PR is to clarify the existing meaning of the definitions not to expand or change those definitions. Such expansion or modification needs to be discussed in a separate PR.

@schulzm left a comment

Additional comments to discussion on terms

\item \declareterm{session}\emph{session} refers to an allocated set of resources assigned to a particular user by the system \ac{WLM}. Historically, \ac{HPC} sessions have consisted of a static allocation of resources - i.e., a block of resources are assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to the current block of assigned resources and is a potentially dynamic entity.
\item \declareterm{slot}\declareterm{slots}\emph{slot} refers to an allocated entry for a process. \acp{WLM} frequently allocate entire nodes to a \emph{session}, but can also be configured to define the maximum number of processes that can simultaneously be executed on each node. This often corresponds to the number of hardware \acp{PU} (typically cores, but can also be defined as hardware threads) on the node. However, the correlation between hardware \acp{PU} and slot allocations strictly depends upon system configuration.
\item \declareterm{job}\emph{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session. For example, ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' is considered a single \ac{MPMD} job containing two applications.
\item \declareterm{session}\emph{session} refers to an allocated set of resources assigned to a particular user by the system \ac{WLM}. Historically, \ac{HPC} \emph{sessions} have consisted of a static allocation of resources - i.e., a block of resources are assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to the current block of assigned resources and is a potentially dynamic entity.

What is the difference from a job then? Should we just drop "Session"? What is the difference between a session, a job, and an application? What is the intention if a user submits two allocations - are those one session or two? I am not suggesting a change, I am just trying to understand the intention behind these terms.

\item \declareterm{job}\emph{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session. For example, ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' is considered a single \ac{MPMD} job containing two applications.
\item \declareterm{session}\emph{session} refers to an allocated set of resources assigned to a particular user by the system \ac{WLM}. Historically, \ac{HPC} \emph{sessions} have consisted of a static allocation of resources - i.e., a block of resources are assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to the current block of assigned resources and is a potentially dynamic entity.
\item \declareterm{slot}\declareterm{slots}\emph{slot} refers to an allocated set of resources assigned to a process. \acp{WLM} frequently allocate entire nodes to a \emph{session}, but can also be configured to define the maximum number of processes that can simultaneously be executed on each node. This often corresponds to the number of hardware \acp{PU} (typically cores, but can also be defined as hardware threads) on the node. However, the correlation between hardware \acp{PU} and slot allocations strictly depends upon system configuration.
\item \declareterm{application}\emph{application} refers to a single executable (binary, script, etc.) member of a \emph{job}. Applications consist of one or more \emph{processes}, either operating independently or cooperatively at any given time during their execution.

Why do we need to define this in the first place? Once an RM allocates the resources, it is up to the user to decide what to do: running a set of scripts that subdivide the allocation, or one application binary, looks the same to the RM, doesn't it? For the RM, what is the difference from a job or session?

Is it really an OS process, though? Really intended as a question, but as an application is a set of processes, and a job is a set of applications, what does this mean for heterogeneous systems? How do you represent parts of the applications run on accelerators (both within nodes, but also run on self-hosted boosters)? Also, what does this mean for MPC applications? Does the terminology change for me, as a user, when I want to run a 20 MPI process MPC code vs. a 20 MPI process MPICH code?

While a \ac{PMIx} library implementer, or an \ac{SMS} component server, may choose to support a particular \ac{PMIx} \ac{API}, they are not required to support every attribute that might apply to it. This would pose a significant barrier to entry for an implementer as there can be a broad range of applicable attributes to a given \ac{API}, at least some of which may rarely be used in a specific market area. The \ac{PMIx} community is attempting to help differentiate the attributes by indicating in the standard those that are generally used (and therefore, of higher importance to support) versus those that a ``complete implementation'' would support.

In addition, the document refers to the following entities and process stages when describing use-cases or operations involving \ac{PMIx}:
The \ac{PMIx} \ac{ASC} has further adopted a policy that modification of existing released \acp{API} will only be permitted under extreme circumstances. In its effort to avoid introduction of any such backward incompatibility, the community has avoided the definitions of large numbers of \acp{API} that each focus on a narrow scope of functionality, and instead relied on the definition of fewer generic \acp{API} that include arrays of key-value attributes for ``tuning'' the function's behavior. Thus, modifications to the PMIx standard increasingly consist of the definition of new attributes along with a description of the \acp{API} to which they relate and the expected behavior when used with those \acp{API}.

The first sentence here feels like it belongs more in the governance document than the specification. I recommend deleting the first sentence and replacing it with:
"In an effort to maintain long-term backward compatibility, the PMIx API does not include large numbers of \acp{API} that each focus on a narrow scope of functionality, but instead relies on the definition of fewer generic \acp{API} that include arrays of key-value attributes for ``tuning'' the function's behavior. Thus, modifications to the PMIx standard primarily consist of the definition of new attributes along with a description of the \acp{API} to which they relate and the expected behavior when used with those \acp{API}."

\item \declareterm{job}\emph{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a \emph{session}. For example, ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' is considered a single \ac{MPMD} job containing two applications.
\item \declareterm{namespace}\emph{namespace} refers to a character string value assigned by the \ac{RM} to a \textit{job}. All \textit{applications} executed as part of that \textit{job} share the same \emph{namespace}. The \emph{namespace} assigned to each \emph{job} must be unique within the scope of the governing \ac{RM}.

is considered -> creates (think about other options, but the goal is to make clear that the command line is not itself the job or application; rather, it creates instances of these things in its normal operation)
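
The distinction drawn above can be illustrated with a minimal sketch. It is not part of this PR or of any PMIx API: `Application`, `Job`, `create_job`, and the namespace string `job-0001` are hypothetical names. The parser handles only the simplified `-n N exe` form from the example, and numbering ranks contiguously across applications is an assumption. The point is that the command line is an input that creates one job containing two applications, all sharing a single namespace.

```python
from dataclasses import dataclass, field


@dataclass
class Application:
    """One executable member of a job (hypothetical illustration type)."""
    executable: str
    num_procs: int


@dataclass
class Job:
    """A set of applications launched by one invocation (hypothetical type)."""
    namespace: str  # string assigned by the RM; must be unique per job
    applications: list = field(default_factory=list)

    def ranks(self):
        """Yield (namespace, rank, executable) for every process in the job.

        Assumption: ranks are numbered contiguously across applications.
        """
        rank = 0
        for app in self.applications:
            for _ in range(app.num_procs):
                yield (self.namespace, rank, app.executable)
                rank += 1


def create_job(command_line: str, namespace: str) -> Job:
    """Create a Job from a simplified 'mpiexec -n N exe : -n M exe' line."""
    tokens = command_line.split()
    if not tokens or tokens[0] != "mpiexec":
        raise ValueError("expected a command line starting with 'mpiexec'")
    job = Job(namespace=namespace)
    # Each ':'-separated segment describes one application of the job.
    for segment in " ".join(tokens[1:]).split(":"):
        args = segment.split()
        if len(args) != 3 or args[0] != "-n":
            raise ValueError(f"unsupported segment: {segment!r}")
        job.applications.append(
            Application(executable=args[2], num_procs=int(args[1]))
        )
    return job


# The command line is not the job; it creates one when executed.
job = create_job("mpiexec -n 1 app1 : -n 2 app2", namespace="job-0001")
```

Here `job.applications` holds two entries (`app1` with one process, `app2` with two), and every tuple produced by `job.ranks()` carries the same namespace, mirroring the statement that all applications of a job share one namespace.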

@jjhursey jjhursey removed the Eligible Eligible for consideration by ASC label Oct 2, 2020

dsolt commented Oct 19, 2020

Closing this and re-opening a new PR because so much has changed since this began. Many of the individual commits were already partially pulled into master, and it became a mess.

@dsolt dsolt closed this Oct 19, 2020