Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

orte/prrte: review of some orte dir edit to carry to prrte #7216

Closed
naughtont3 opened this issue Dec 3, 2019 · 3 comments
Closed

orte/prrte: review of some orte dir edit to carry to prrte #7216

naughtont3 opened this issue Dec 3, 2019 · 3 comments
Labels
PR to PRRTE for PRs that we may want to push to PRRTE

Comments

@naughtont3
Copy link
Contributor

naughtont3 commented Dec 3, 2019

Background information

Reviewing commmits to master for orte subdir to see if there are bits we will want/need to carry over to PRRTE. This was prompted by noticing some changes in Nov related to IOF that likely might need to be carried over when we move to PRRTE.

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)


Details of the problem

I walked through recent history for 2019 OMPI commits in theorte subdir and tried to identify items that should be taken over (TAKE), should be skipped or already exist in PRRTE (SKIP), and items where I was unsure/need a review (REVIEW).

Steps:

  1. Generate logfile for orte dir
git log --pretty="format:%h; %ad; %s" orte/ > ORTE.log
  1. Prune down 2019 commits (starting after last clear PRRTE sync)
  2. Manually review changeset (git show SHA), comparing against PRRTE master

My initial review of things looks like this...

TAKE/EDIT:

  • TAKE: f03e2f5; Mon Dec 2 17:29:58 2019 -0500; orte/util/nidmap.c: fix uninitialized variable
    • RHC: Done
  • TAKE: 4f1de51; Fri Nov 15 12:36:36 2019 -0500; Add detection for JSM direct launch
    • RHC: I don't think we can take this over to PRRTE as it won't do anything there. PRRTE is just daemons - there are no PRRTE-based apps and no application would ever call prrte_init. Thus, it isn't clear to me what this component would do.@jjhursey - any thoughts here?
  • TAKE: 9b73f6a; Sun Nov 10 14:58:55 2019 -0500; Fix misleading error message with missing #! interpreter
    • RHC: Done
  • TAKE: b0a487a; Wed Nov 13 11:36:43 2019 -0800; Update IOF redirection options
    • RHC: Done
  • TAKE: c1b8599; Mon Oct 14 12:46:24 2019 -0400; plm/rsh: Add chdir option to change directory before orted exec
    • RHC: Done
  • TAKE: 0e8a97c; Tue Oct 1 17:32:39 2019 -0400; Fix the sigkill timeout sleep to prevent SIGCHLD from preventing completion.
    • RHC: already over there
  • TAKE: 7444b32; Sun Sep 29 18:52:03 2019 -0700; Remove stale references to orte_oob_base.ev_base
    • RHC: already over there
  • TAKE: 5e9d07d; Wed Sep 25 14:23:43 2019 -0700; Merge pull request Cleanup stale code in ORTE/OOB #7010 from rhc54/topic/oob
    • RHC: already over there
  • TAKE: 41eb41c; Wed Sep 25 13:27:41 2019 -0700; Cleanup stale code in ORTE/OOB
    • RHC: already over there
  • EDIT: 027f74b; Mon Apr 22 16:51:04 2019 +0900; orterun.1in: Fix typo and other minor updates
    • RHC: Done
  • TAKE: bf3980d; Fri Apr 12 14:51:52 2019 -0400; fix hang in -np 3 --rank-by core
    • RHC: I fixed this a different way in PRRTE

MAYBE/REVIEW:

  • REVIEW: 37d70a6; Wed Nov 27 06:53:08 2019 -0500; Merge pull request fix misleading error message with missing #! interpreter #7153 from mcoil1/pr/fix-misleading-error-message
    • RHC: Done
  • REVIEW (NEED?): 197beb3; Mon Aug 19 15:36:59 2019 -0400; orterun: remove duplicate code
    • RHC: Not relevant as orted_submit.c was removed in PRRTE - thus, the referenced code only exists in prun
  • REVIEW (NEED?): 9c8671c; Mon Aug 12 16:15:42 2019 -0700; Run-as-root env vars in orterun.c
    • RHC: Done
  • REVIEW: ea0dfc3; Tue Aug 6 07:48:58 2019 -0700; Allow individual jobs to set their map/rank/bind policies
    • RHC: Already over there
  • REVIEW: 4ebb37a; Mon Jul 29 18:23:59 2019 +0000; opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
    • RHC: Already over there
  • REVIEW: 24f2961; Fri Jul 12 17:00:46 2019 +0900; ess/base: fix a misc memory leak in orte_ess_base_proc_binding()
    • RHC: Not relevant as that function is only used by app procs - file removed in PRRTE

NOT-NEEDED/ALREADY-DONE:

  • SKIP: 7714468; Wed Sep 18 17:44:40 2019 -0400; Add 'orte_' prefix to noop_mpir_breakpoint_ptr.
  • SKIP: 067adfa; Tue Sep 17 14:45:51 2019 -0400; Conform MPIR_Breakpoint to MPIR standard.
  • SKIP: 06d188e; Fri Sep 6 08:27:05 2019 -0700; Be a little less restrictive on interface requirements
  • SKIP: 373e816; Wed Sep 4 07:51:20 2019 -0700; Ensure buffer_unload leaves the buffer in a clean state
  • SKIP: bd5a176; Wed Aug 7 05:47:12 2019 -0700; Fix typos
  • SKIP: 00106f5; Thu Jul 18 18:45:50 2019 -0400; Try to prevent the compiler from optimizing out MPIR_Breakpoint().
  • SKIP: b738fa2; Mon Jul 8 18:13:52 2019 -0400; Merge pull request Fix oob_tcp tcp_component_close segfault with active listeners #6796 from orivej/fix-tcp_component_close-segfault
  • SKIP: 78b7e34; Thu Jul 4 20:24:50 2019 +0000; Fix oob_tcp tcp_component_close segfault with active listeners
  • SKIP: de52254; Fri Jun 28 00:44:09 2019 +0000; Fix ORTE_FORCED_TERMINATE message
  • SKIP: 7dad740; Mon May 13 13:19:29 2019 -0700; plm_slurm_module: adjust for new SLURM CLI options
@naughtont3 naughtont3 added the PR to PRRTE for PRs that we may want to push to PRRTE label Dec 3, 2019
@rhc54
Copy link
Contributor

rhc54 commented Dec 27, 2019

@jjhursey Please see comment regarding JSM schizo component in main description

@rhc54
Copy link
Contributor

rhc54 commented Dec 30, 2019

All cited references have been resolved - thx!

@rhc54 rhc54 closed this as completed Dec 30, 2019
@jjhursey
Copy link
Member

jjhursey commented Feb 4, 2020

I guess I didn't follow up here about the JSM schizo component. That component was intended to help OMPI detect being direct launched with JSM. When we move to PRRTE then OMPI will use standard PMIx mechanisms for this detection, so it shouldn't be needed anymore. I agree it wouldn't make sense to have it in PRRTE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR to PRRTE for PRs that we may want to push to PRRTE
Projects
None yet
Development

No branches or pull requests

3 participants