FORCE-TERMINATE AT base/plm_base_launch_support.c #5048
note you should … (that is unlikely to cause the issue though)
did you configure SLURM with the …
Nah, this has nothing to do with Slurm - the problem here is the external PMIx v1.2.3. OMPI v3.1.0 is based on PMIx v2.1, and I suspect the OPAL "glue" to the older PMIx library is having a problem. Changing to PMIx v1.2.5 might help - going to PMIx v2.1.1 would be a better option, though that won't work with direct launch against the Slurm PMIx plugin (you could, however, still use the Slurm PMI2 or PMI1 support).
yep, the issue is in the glue, and PMIx v1.2.5 does not help here. The v1.2.5 dstore rejects the request in:

    if (NULL == key) {
        PMIX_OUTPUT_VERBOSE((7, pmix_globals.debug_output,
                             "dstore: Does not support passed parameters"));
        rc = PMIX_ERR_BAD_PARAM;
        PMIX_ERROR_LOG(rc);
        return rc;
    }

At first glance, I could not find a way to fetch all data for a specific rank in PMIx v1.2.5 (kind of …). I tried rebuilding PMIx 1.2.5 with …
Yeah, that's what I kind of expected. The problem is that the rank probably needs to be "wildcard" for v1.2.5 - but it would be hard to know when to use that value vs any other one. I suspect you should just kill off the ext1x component as there are bound to be more problems. I'm told that Slurm will work with the PMIx v2.x series (only a few APIs were implemented in the Slurm plugin, and they didn't change), so the best solution is to advise Slurm admins to build against PMIx v2.1.1 and things should just work.
@rhc54 @ggouaillardet Last time I checked (in August), SLURM did not compile with pmix2. But it seems that the newest version does, so it might be an option to use PMIx v2 from now on. I'll talk to our expert here and see if I can get it tested. One problem with this, of course, is that we need to rebuild our whole OpenMPI stack so that it can use the new PMIx lib. Better now than later, I guess - do you mean that PMIx v1 will not be supported any more in OpenMPI 3?
It's up to them - it could be done, but it might take a bit of work. Someone would have to look at the OMPI v2.x series to see how certain calls were made and then alter the glue code in OMPI v3 to make the required conversions. Not impossible, and there are probably not that many places requiring it - but I don't know if folks will want to invest their time that way. Note that OMPI v2 code (built against PMIx v1.2.5) should run just fine against Slurm with PMIx v2.1.1, so you don't have to rebuild the older OMPI releases (assuming they are linked against PMIx v1.2.5).
Well, at the time of deployment we installed the then-available pmix 1.2.3, so I guess we do need to rebuild? Rebuilding itself is not a big problem - just wondering about any potential compatibility issues for users. The earliest we have is OpenMPI 2.0.2, so if that will work fine with pmix 2.1.1, then I guess it sounds safe? Thanks for your help!
@bwbarrett any opinion? Is it ok to drop support for PMIx v1.2 mid-series? Or should we try to fix that issue?
This problem exists in the combination of OMPI 3.1.0rc3 + PMIx 2.0.3, too.
back to … That being said, it fails at runtime … I noted some stuff was not backported into … I will resume the v1 related stuff when/if we decide we do not simply want to drop this from the …

From 6202a1211157c41d393b6a243f4a749051c5af90 Mon Sep 17 00:00:00 2001
From: Gilles Gouaillardet <gilles@rist.or.jp>
Date: Thu, 12 Apr 2018 13:16:43 +0900
Subject: [PATCH] pmix: backport the legacy_get callback
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(back-ported from commit open-mpi/ompi@9fb80bd239860a3d5e571f425cff3d5ebc09dd62)
(back-ported from commit open-mpi/ompi@187352eb3daba9357e53b2e02581824fdeef0539)
(back-ported from commit open-mpi/ompi@e9cd7fd7e6cb90ff1ce1f62fb9f057d14e6fc8c2)
---
examples/hello_c.c | 4 +
opal/mca/pmix/ext1x/pmix1x.c | 7 +-
opal/mca/pmix/ext1x/pmix1x_client.c | 6 +-
opal/mca/pmix/pmix.h | 3 +
opal/mca/pmix/pmix2x/pmix2x.c | 10 +-
opal/mca/pmix/pmix2x/pmix2x.h | 1 +
opal/mca/pmix/pmix2x/pmix2x_client.c | 19 ++-
opal/mca/pmix/pmix2x/pmix2x_component.c | 4 +-
orte/mca/grpcomm/direct/grpcomm_direct.c | 58 +++++--
orte/mca/odls/base/odls_base_default_fns.c | 5 +-
orte/mca/plm/base/plm_base_launch_support.c | 72 +---------
orte/mca/state/dvm/state_dvm.c | 221 ++++++++++++++++-----------
orte/orted/orted_main.c | 94 ++++++++----
13 files changed, 287 insertions(+), 217 deletions(-)
diff --git a/examples/hello_c.c b/examples/hello_c.c
index e44f684..e038065 100644
--- a/examples/hello_c.c
+++ b/examples/hello_c.c
@@ -8,13 +8,17 @@
*/
#include <stdio.h>
+#include <poll.h>
+
#include "mpi.h"
int main(int argc, char* argv[])
{
int rank, size, len;
+ volatile int _dbg = 1;
char version[MPI_MAX_LIBRARY_VERSION_STRING];
+ while (_dbg) poll(NULL, 0, 1);
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
MPI_Comm_size(MPI_COMM_WORLD, &size);
diff --git a/opal/mca/pmix/ext1x/pmix1x.c b/opal/mca/pmix/ext1x/pmix1x.c
index fbc6025..410c7c7 100644
--- a/opal/mca/pmix/ext1x/pmix1x.c
+++ b/opal/mca/pmix/ext1x/pmix1x.c
@@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
- * Copyright (c) 2014-2017 Intel, Inc. All rights reserved.
+ * Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2015 Mellanox Technologies, Inc.
@@ -48,8 +48,13 @@
static const char *pmix1_get_nspace(opal_jobid_t jobid);
static void pmix1_register_jobid(opal_jobid_t jobid, const char *nspace);
+static bool legacy_get(void)
+{
+ return true;
+}
const opal_pmix_base_module_t opal_pmix_ext1x_module = {
+ .legacy_get = legacy_get,
/* client APIs */
.init = pmix1_client_init,
.finalize = pmix1_client_finalize,
diff --git a/opal/mca/pmix/ext1x/pmix1x_client.c b/opal/mca/pmix/ext1x/pmix1x_client.c
index 3d45d35..741cae3 100644
--- a/opal/mca/pmix/ext1x/pmix1x_client.c
+++ b/opal/mca/pmix/ext1x/pmix1x_client.c
@@ -232,8 +232,10 @@ int pmix1_store_local(const opal_process_name_t *proc, opal_value_t *val)
}
}
if (NULL == job) {
- OPAL_ERROR_LOG(OPAL_ERR_NOT_FOUND);
- return OPAL_ERR_NOT_FOUND;
+ job = OBJ_NEW(opal_pmix1_jobid_trkr_t);
+ (void)opal_snprintf_jobid(job->nspace, PMIX_MAX_NSLEN, proc->jobid);
+ job->jobid = proc->jobid;
+ opal_list_append(&mca_pmix_ext1x_component.jobids, &job->super);
}
(void)strncpy(p.nspace, job->nspace, PMIX_MAX_NSLEN);
p.rank = proc->vpid;
diff --git a/opal/mca/pmix/pmix.h b/opal/mca/pmix/pmix.h
index 53e0457..a4936af 100644
--- a/opal/mca/pmix/pmix.h
+++ b/opal/mca/pmix/pmix.h
@@ -867,10 +867,13 @@ typedef int (*opal_pmix_base_process_monitor_fn_t)(opal_list_t *monitor,
opal_list_t *directives,
opal_pmix_info_cbfunc_t cbfunc, void *cbdata);
+typedef bool (*opal_pmix_base_legacy_get_fn_t)(void);
+
/*
* the standard public API data structure
*/
typedef struct {
+ opal_pmix_base_legacy_get_fn_t legacy_get;
/* client APIs */
opal_pmix_base_module_init_fn_t init;
opal_pmix_base_module_fini_fn_t finalize;
diff --git a/opal/mca/pmix/pmix2x/pmix2x.c b/opal/mca/pmix/pmix2x/pmix2x.c
index 34bc3d7..3f38835 100644
--- a/opal/mca/pmix/pmix2x/pmix2x.c
+++ b/opal/mca/pmix/pmix2x/pmix2x.c
@@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
- * Copyright (c) 2014-2017 Intel, Inc. All rights reserved.
+ * Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2015 Mellanox Technologies, Inc.
@@ -50,7 +50,7 @@
/* These are functions used by both client and server to
* access common functions in the embedded PMIx library */
-
+static bool legacy_get(void);
static const char *pmix2x_get_nspace(opal_jobid_t jobid);
static void pmix2x_register_jobid(opal_jobid_t jobid, const char *nspace);
static void register_handler(opal_list_t *event_codes,
@@ -72,6 +72,7 @@ static void pmix2x_log(opal_list_t *info,
opal_pmix_op_cbfunc_t cbfunc, void *cbdata);
const opal_pmix_base_module_t opal_pmix_pmix2x_module = {
+ .legacy_get = legacy_get,
/* client APIs */
.init = pmix2x_client_init,
.finalize = pmix2x_client_finalize,
@@ -126,6 +127,11 @@ const opal_pmix_base_module_t opal_pmix_pmix2x_module = {
.register_jobid = pmix2x_register_jobid
};
+static bool legacy_get(void)
+{
+ return mca_pmix_pmix2x_component.legacy_get;
+}
+
static void opcbfunc(pmix_status_t status, void *cbdata)
{
pmix2x_opcaddy_t *op = (pmix2x_opcaddy_t*)cbdata;
diff --git a/opal/mca/pmix/pmix2x/pmix2x.h b/opal/mca/pmix/pmix2x/pmix2x.h
index 19683d0..86eb009 100644
--- a/opal/mca/pmix/pmix2x/pmix2x.h
+++ b/opal/mca/pmix/pmix2x/pmix2x.h
@@ -48,6 +48,7 @@ BEGIN_C_DECLS
typedef struct {
opal_pmix_base_component_t super;
+ bool legacy_get;
opal_list_t jobids;
bool native_launch;
size_t evindex;
diff --git a/opal/mca/pmix/pmix2x/pmix2x_client.c b/opal/mca/pmix/pmix2x/pmix2x_client.c
index 7b8c897..e32f9ef 100644
--- a/opal/mca/pmix/pmix2x/pmix2x_client.c
+++ b/opal/mca/pmix/pmix2x/pmix2x_client.c
@@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
- * Copyright (c) 2014-2017 Intel, Inc. All rights reserved.
+ * Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2017 Mellanox Technologies, Inc.
@@ -400,7 +400,6 @@ int pmix2x_store_local(const opal_process_name_t *proc, opal_value_t *val)
PMIX_VALUE_CONSTRUCT(&kv);
pmix2x_value_load(&kv, val);
-
/* call the library - this is a blocking call */
rc = PMIx_Store_internal(&p, val->key, &kv);
PMIX_VALUE_DESTRUCT(&kv);
@@ -596,10 +595,11 @@ int pmix2x_get(const opal_process_name_t *proc, const char *key,
return OPAL_ERR_NOT_INITIALIZED;
}
- if (NULL == proc) {
+ if (NULL == proc && NULL != key) {
/* if they are asking for our jobid, then return it */
if (0 == strcmp(key, OPAL_PMIX_JOBID)) {
(*val) = OBJ_NEW(opal_value_t);
+ (*val)->key = strdup(key);
(*val)->type = OPAL_UINT32;
(*val)->data.uint32 = OPAL_PROC_MY_NAME.jobid;
OPAL_PMIX_RELEASE_THREAD(&opal_pmix_base.lock);
@@ -608,6 +608,7 @@ int pmix2x_get(const opal_process_name_t *proc, const char *key,
/* if they are asking for our rank, return it */
if (0 == strcmp(key, OPAL_PMIX_RANK)) {
(*val) = OBJ_NEW(opal_value_t);
+ (*val)->key = strdup(key);
(*val)->type = OPAL_INT;
(*val)->data.integer = pmix2x_convert_rank(my_proc.rank);
OPAL_PMIX_RELEASE_THREAD(&opal_pmix_base.lock);
@@ -642,6 +643,9 @@ int pmix2x_get(const opal_process_name_t *proc, const char *key,
rc = PMIx_Get(&p, key, pinfo, sz, &pval);
if (PMIX_SUCCESS == rc) {
ival = OBJ_NEW(opal_value_t);
+ if (NULL != key) {
+ ival->key = strdup(key);
+ }
if (OPAL_SUCCESS != (ret = pmix2x_value_unload(ival, pval))) {
rc = pmix2x_convert_opalrc(ret);
} else {
@@ -663,6 +667,9 @@ static void val_cbfunc(pmix_status_t status,
OPAL_ACQUIRE_OBJECT(op);
OBJ_CONSTRUCT(&val, opal_value_t);
+ if (NULL != op->nspace) {
+ val.key = strdup(op->nspace);
+ }
rc = pmix2x_convert_opalrc(status);
if (PMIX_SUCCESS == status && NULL != kv) {
rc = pmix2x_value_unload(&val, kv);
@@ -702,6 +709,7 @@ int pmix2x_getnb(const opal_process_name_t *proc, const char *key,
if (0 == strcmp(key, OPAL_PMIX_JOBID)) {
if (NULL != cbfunc) {
val = OBJ_NEW(opal_value_t);
+ val->key = strdup(key);
val->type = OPAL_UINT32;
val->data.uint32 = OPAL_PROC_MY_NAME.jobid;
cbfunc(OPAL_SUCCESS, val, cbdata);
@@ -713,6 +721,7 @@ int pmix2x_getnb(const opal_process_name_t *proc, const char *key,
if (0 == strcmp(key, OPAL_PMIX_RANK)) {
if (NULL != cbfunc) {
val = OBJ_NEW(opal_value_t);
+ val->key = strdup(key);
val->type = OPAL_INT;
val->data.integer = pmix2x_convert_rank(my_proc.rank);
cbfunc(OPAL_SUCCESS, val, cbdata);
@@ -726,7 +735,9 @@ int pmix2x_getnb(const opal_process_name_t *proc, const char *key,
op = OBJ_NEW(pmix2x_opcaddy_t);
op->valcbfunc = cbfunc;
op->cbdata = cbdata;
-
+ if (NULL != key) {
+ op->nspace = strdup(key);
+ }
if (NULL == proc) {
(void)strncpy(op->p.nspace, my_proc.nspace, PMIX_MAX_NSLEN);
op->p.rank = pmix2x_convert_rank(PMIX_RANK_WILDCARD);
diff --git a/opal/mca/pmix/pmix2x/pmix2x_component.c b/opal/mca/pmix/pmix2x/pmix2x_component.c
index 03246c1..cdcdb7d 100644
--- a/opal/mca/pmix/pmix2x/pmix2x_component.c
+++ b/opal/mca/pmix/pmix2x/pmix2x_component.c
@@ -1,5 +1,5 @@
/*
- * Copyright (c) 2014-2017 Intel, Inc. All rights reserved.
+ * Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
* Copyright (c) 2014-2015 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2016 Cisco Systems, Inc. All rights reserved.
@@ -21,6 +21,7 @@
#include "opal/constants.h"
#include "opal/class/opal_list.h"
#include "opal/util/proc.h"
+#include "opal/util/show_help.h"
#include "opal/mca/pmix/pmix.h"
#include "pmix2x.h"
@@ -74,6 +75,7 @@ mca_pmix_pmix2x_component_t mca_pmix_pmix2x_component = {
MCA_BASE_METADATA_PARAM_CHECKPOINT
}
},
+ .legacy_get = true,
.native_launch = false
};
diff --git a/orte/mca/grpcomm/direct/grpcomm_direct.c b/orte/mca/grpcomm/direct/grpcomm_direct.c
index 8711d2c..530e2ce 100644
--- a/orte/mca/grpcomm/direct/grpcomm_direct.c
+++ b/orte/mca/grpcomm/direct/grpcomm_direct.c
@@ -275,7 +275,7 @@ static void xcast_recv(int status, orte_process_name_t* sender,
size_t inlen, cmplen;
uint8_t *packed_data, *cmpdata;
int32_t nvals, i;
- opal_value_t *kv;
+ opal_value_t kv, *kval;
orte_process_name_t dmn;
OPAL_OUTPUT_VERBOSE((1, orte_grpcomm_base_framework.framework_output,
@@ -461,33 +461,57 @@ static void xcast_recv(int status, orte_process_name_t* sender,
OBJ_CONSTRUCT(&wireup, opal_buffer_t);
opal_dss.load(&wireup, bo->bytes, bo->size);
/* decode it, pushing the info into our database */
- cnt=1;
- while (OPAL_SUCCESS == (ret = opal_dss.unpack(&wireup, &dmn, &cnt, ORTE_NAME))) {
- cnt = 1;
- if (ORTE_SUCCESS != (ret = opal_dss.unpack(&wireup, &nvals, &cnt, OPAL_INT32))) {
+ if (opal_pmix.legacy_get()) {
+ OBJ_CONSTRUCT(&kv, opal_value_t);
+ kv.key = OPAL_PMIX_PROC_URI;
+ kv.type = OPAL_STRING;
+ cnt=1;
+ while (OPAL_SUCCESS == (ret = opal_dss.unpack(&wireup, &dmn, &cnt, ORTE_NAME))) {
+ cnt = 1;
+ if (ORTE_SUCCESS != (ret = opal_dss.unpack(&wireup, &kv.data.string, &cnt, OPAL_STRING))) {
+ ORTE_ERROR_LOG(ret);
+ break;
+ }
+ if (OPAL_SUCCESS != (ret = opal_pmix.store_local(&dmn, &kv))) {
+ ORTE_ERROR_LOG(ret);
+ free(kv.data.string);
+ break;
+ }
+ free(kv.data.string);
+ kv.data.string = NULL;
+ }
+ if (ORTE_ERR_UNPACK_READ_PAST_END_OF_BUFFER != ret) {
ORTE_ERROR_LOG(ret);
- break;
}
- for (i=0; i < nvals; i++) {
+ } else {
+ cnt=1;
+ while (OPAL_SUCCESS == (ret = opal_dss.unpack(&wireup, &dmn, &cnt, ORTE_NAME))) {
+ cnt = 1;
+ if (ORTE_SUCCESS != (ret = opal_dss.unpack(&wireup, &nvals, &cnt, OPAL_INT32))) {
+ ORTE_ERROR_LOG(ret);
+ break;
+ }
+ for (i=0; i < nvals; i++) {
cnt = 1;
- if (ORTE_SUCCESS != (ret = opal_dss.unpack(&wireup, &kv, &cnt, OPAL_VALUE))) {
+ if (ORTE_SUCCESS != (ret = opal_dss.unpack(&wireup, &kval, &cnt, OPAL_VALUE))) {
ORTE_ERROR_LOG(ret);
break;
}
OPAL_OUTPUT_VERBOSE((5, orte_grpcomm_base_framework.framework_output,
- "%s STORING MODEX DATA FOR PROC %s KEY %s",
- ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
- ORTE_NAME_PRINT(&dmn), kv->key));
- if (OPAL_SUCCESS != (ret = opal_pmix.store_local(&dmn, kv))) {
+ "%s STORING MODEX DATA FOR PROC %s KEY %s",
+ ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
+ ORTE_NAME_PRINT(&dmn), kval->key));
+ if (OPAL_SUCCESS != (ret = opal_pmix.store_local(&dmn, kval))) {
ORTE_ERROR_LOG(ret);
- OBJ_RELEASE(kv);
+ OBJ_RELEASE(kval);
break;
}
- OBJ_RELEASE(kv);
+ OBJ_RELEASE(kval);
+ }
+ }
+ if (ORTE_ERR_UNPACK_READ_PAST_END_OF_BUFFER != ret) {
+ ORTE_ERROR_LOG(ret);
}
- }
- if (ORTE_ERR_UNPACK_READ_PAST_END_OF_BUFFER != ret) {
- ORTE_ERROR_LOG(ret);
}
/* done with the wireup buffer - dump it */
OBJ_DESTRUCT(&wireup);
diff --git a/orte/mca/odls/base/odls_base_default_fns.c b/orte/mca/odls/base/odls_base_default_fns.c
index c178c4a..7b30f20 100644
--- a/orte/mca/odls/base/odls_base_default_fns.c
+++ b/orte/mca/odls/base/odls_base_default_fns.c
@@ -152,8 +152,9 @@ int orte_odls_base_default_get_add_procs_data(opal_buffer_t *buffer,
/* if we haven't already done so, provide the info on the
* capabilities of each node */
- if (!orte_node_info_communicated ||
- orte_get_attribute(&jdata->attributes, ORTE_JOB_LAUNCHED_DAEMONS, NULL, OPAL_BOOL)) {
+ if (1 < orte_process_info.num_procs &&
+ (!orte_node_info_communicated ||
+ orte_get_attribute(&jdata->attributes, ORTE_JOB_LAUNCHED_DAEMONS, NULL, OPAL_BOOL))) {
flag = 1;
opal_dss.pack(buffer, &flag, 1, OPAL_INT8);
if (ORTE_SUCCESS != (rc = orte_regx.encode_nodemap(buffer))) {
diff --git a/orte/mca/plm/base/plm_base_launch_support.c b/orte/mca/plm/base/plm_base_launch_support.c
index b39b348..f9c0af4 100644
--- a/orte/mca/plm/base/plm_base_launch_support.c
+++ b/orte/mca/plm/base/plm_base_launch_support.c
@@ -38,6 +38,7 @@
#include "opal/hash_string.h"
#include "opal/util/argv.h"
+#include "opal/util/opal_environ.h"
#include "opal/class/opal_pointer_array.h"
#include "opal/dss/dss.h"
#include "opal/mca/hwloc/hwloc-internal.h"
@@ -681,18 +682,7 @@ void orte_plm_base_post_launch(int fd, short args, void *cbdata)
ORTE_JOBID_PRINT(jdata->jobid)));
goto cleanup;
}
- /* if it was a dynamic spawn, and it isn't an MPI job, then
- * it won't register and we need to send the response now.
- * Otherwise, it is an MPI job and we should wait for it
- * to register */
- if (!orte_get_attribute(&jdata->attributes, ORTE_JOB_NON_ORTE_JOB, NULL, OPAL_BOOL) &&
- !orte_get_attribute(&jdata->attributes, ORTE_JOB_DVM_JOB, NULL, OPAL_BOOL)) {
- OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
- "%s plm:base:launch job %s is MPI",
- ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
- ORTE_JOBID_PRINT(jdata->jobid)));
- goto cleanup;
- }
+
/* prep the response */
rc = ORTE_SUCCESS;
answer = OBJ_NEW(opal_buffer_t);
@@ -743,10 +733,7 @@ void orte_plm_base_post_launch(int fd, short args, void *cbdata)
void orte_plm_base_registered(int fd, short args, void *cbdata)
{
- int ret, room, *rmptr;
- int32_t rc;
orte_job_t *jdata;
- opal_buffer_t *answer;
orte_state_caddy_t *caddy = (orte_state_caddy_t*)cbdata;
ORTE_ACQUIRE_OBJECT(caddy);
@@ -770,61 +757,8 @@ void orte_plm_base_registered(int fd, short args, void *cbdata)
return;
}
/* update job state */
- caddy->jdata->state = caddy->job_state;
-
- /* if this isn't a dynamic spawn, just cleanup */
- if (ORTE_JOBID_INVALID == jdata->originator.jobid ||
- orte_get_attribute(&jdata->attributes, ORTE_JOB_NON_ORTE_JOB, NULL, OPAL_BOOL) ||
- orte_get_attribute(&jdata->attributes, ORTE_JOB_DVM_JOB, NULL, OPAL_BOOL)) {
- OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
- "%s plm:base:launch job %s is not a dynamic spawn",
- ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
- ORTE_JOBID_PRINT(jdata->jobid)));
- goto cleanup;
- }
-
- /* if it was a dynamic spawn, send the response */
- rc = ORTE_SUCCESS;
- answer = OBJ_NEW(opal_buffer_t);
- if (ORTE_SUCCESS != (ret = opal_dss.pack(answer, &rc, 1, OPAL_INT32))) {
- ORTE_ERROR_LOG(ret);
- ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
- OBJ_RELEASE(caddy);
- return;
- }
- if (ORTE_SUCCESS != (ret = opal_dss.pack(answer, &jdata->jobid, 1, ORTE_JOBID))) {
- ORTE_ERROR_LOG(ret);
- ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
- OBJ_RELEASE(caddy);
- return;
- }
- /* pack the room number */
- rmptr = &room;
- if (orte_get_attribute(&jdata->attributes, ORTE_JOB_ROOM_NUM, (void**)&rmptr, OPAL_INT)) {
- if (ORTE_SUCCESS != (ret = opal_dss.pack(answer, &room, 1, OPAL_INT))) {
- ORTE_ERROR_LOG(ret);
- ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
- OBJ_RELEASE(caddy);
- return;
- }
- }
- OPAL_OUTPUT_VERBOSE((5, orte_plm_base_framework.framework_output,
- "%s plm:base:launch sending dyn release of job %s to %s",
- ORTE_NAME_PRINT(ORTE_PROC_MY_NAME),
- ORTE_JOBID_PRINT(jdata->jobid),
- ORTE_NAME_PRINT(&jdata->originator)));
- if (0 > (ret = orte_rml.send_buffer_nb(orte_mgmt_conduit,
- &jdata->originator, answer,
- ORTE_RML_TAG_LAUNCH_RESP,
- orte_rml_send_callback, NULL))) {
- ORTE_ERROR_LOG(ret);
- OBJ_RELEASE(answer);
- ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
- OBJ_RELEASE(caddy);
- return;
- }
+ jdata->state = caddy->job_state;
- cleanup:
/* if this wasn't a debugger job, then need to init_after_spawn for debuggers */
if (!ORTE_FLAG_TEST(jdata, ORTE_JOB_FLAG_DEBUGGER_DAEMON)) {
ORTE_ACTIVATE_JOB_STATE(jdata, ORTE_JOB_STATE_READY_FOR_DEBUGGERS);
diff --git a/orte/mca/state/dvm/state_dvm.c b/orte/mca/state/dvm/state_dvm.c
index 98ef551..6ae8e16 100644
--- a/orte/mca/state/dvm/state_dvm.c
+++ b/orte/mca/state/dvm/state_dvm.c
@@ -257,119 +257,156 @@ static void vm_ready(int fd, short args, void *cbdata)
/* if this is my job, then we are done */
if (ORTE_PROC_MY_NAME->jobid == caddy->jdata->jobid) {
- /* send the daemon map to every daemon in this DVM - we
- * do this here so we don't have to do it for every
- * job we are going to launch */
- buf = OBJ_NEW(opal_buffer_t);
- opal_dss.pack(buf, &command, 1, ORTE_DAEMON_CMD);
- /* if we couldn't provide the allocation regex on the orted
- * cmd line, then we need to provide all the info here */
- if (!orte_nidmap_communicated) {
- if (ORTE_SUCCESS != (rc = orte_regx.nidmap_create(orte_node_pool, &nidmap))) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(buf);
- return;
+ /* if there is only one daemon in the job, then there
+ * is just a little bit to do */
+ if (1 == orte_process_info.num_procs) {
+ if (!orte_nidmap_communicated) {
+ if (ORTE_SUCCESS != (rc = orte_regx.nidmap_create(orte_node_pool, &orte_node_regex))) {
+ ORTE_ERROR_LOG(rc);
+ return;
+ }
+ orte_nidmap_communicated = true;
}
- orte_nidmap_communicated = true;
} else {
- nidmap = NULL;
- }
- opal_dss.pack(buf, &nidmap, 1, OPAL_STRING);
- if (NULL != nidmap) {
- free(nidmap);
- }
- /* provide the info on the capabilities of each node */
- if (!orte_node_info_communicated) {
- flag = 1;
- opal_dss.pack(buf, &flag, 1, OPAL_INT8);
- if (ORTE_SUCCESS != (rc = orte_regx.encode_nodemap(buf))) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(buf);
- return;
- }
- orte_node_info_communicated = true;
- /* get wireup info for daemons */
- jptr = orte_get_job_data_object(ORTE_PROC_MY_NAME->jobid);
- wireup = OBJ_NEW(opal_buffer_t);
- for (v=0; v < jptr->procs->size; v++) {
- if (NULL == (dmn = (orte_proc_t*)opal_pointer_array_get_item(jptr->procs, v))) {
- continue;
+ /* send the daemon map to every daemon in this DVM - we
+ * do this here so we don't have to do it for every
+ * job we are going to launch */
+ buf = OBJ_NEW(opal_buffer_t);
+ opal_dss.pack(buf, &command, 1, ORTE_DAEMON_CMD);
+ /* if we couldn't provide the allocation regex on the orted
+ * cmd line, then we need to provide all the info here */
+ if (!orte_nidmap_communicated) {
+ if (ORTE_SUCCESS != (rc = orte_regx.nidmap_create(orte_node_pool, &nidmap))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ return;
}
- val = NULL;
- if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, NULL, NULL, &val)) || NULL == val) {
+ orte_nidmap_communicated = true;
+ } else {
+ nidmap = NULL;
+ }
+ opal_dss.pack(buf, &nidmap, 1, OPAL_STRING);
+ if (NULL != nidmap) {
+ free(nidmap);
+ }
+ /* provide the info on the capabilities of each node */
+ if (!orte_node_info_communicated) {
+ flag = 1;
+ opal_dss.pack(buf, &flag, 1, OPAL_INT8);
+ if (ORTE_SUCCESS != (rc = orte_regx.encode_nodemap(buf))) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buf);
- OBJ_RELEASE(wireup);
return;
- } else {
- /* pack the name of the daemon */
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(buf);
- OBJ_RELEASE(wireup);
- return;
- }
- /* the data is returned as a list of key-value pairs in the opal_value_t */
- if (OPAL_PTR != val->type) {
- ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
- OBJ_RELEASE(buf);
- OBJ_RELEASE(wireup);
- return;
- }
- modex = (opal_list_t*)val->data.ptr;
- numbytes = (int32_t)opal_list_get_size(modex);
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(buf);
- OBJ_RELEASE(wireup);
- return;
+ }
+ orte_node_info_communicated = true;
+ /* get wireup info for daemons */
+ jptr = orte_get_job_data_object(ORTE_PROC_MY_NAME->jobid);
+ wireup = OBJ_NEW(opal_buffer_t);
+ for (v=0; v < jptr->procs->size; v++) {
+ if (NULL == (dmn = (orte_proc_t*)opal_pointer_array_get_item(jptr->procs, v))) {
+ continue;
}
- OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ val = NULL;
+ if (opal_pmix.legacy_get()) {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, OPAL_PMIX_PROC_URI, NULL, &val)) || NULL == val) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ } else {
+ /* pack the name of the daemon */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ /* pack the URI */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &val->data.string, 1, OPAL_STRING))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ OBJ_RELEASE(val);
+ }
+ } else {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, NULL, NULL, &val)) || NULL == val) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buf);
OBJ_RELEASE(wireup);
return;
+ } else {
+ /* pack the name of the daemon */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ /* the data is returned as a list of key-value pairs in the opal_value_t */
+ if (OPAL_PTR != val->type) {
+ ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ modex = (opal_list_t*)val->data.ptr;
+ numbytes = (int32_t)opal_list_get_size(modex);
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buf);
+ OBJ_RELEASE(wireup);
+ return;
+ }
+ }
+ OPAL_LIST_RELEASE(modex);
+ OBJ_RELEASE(val);
}
}
- OPAL_LIST_RELEASE(modex);
- OBJ_RELEASE(val);
}
+ /* put it in a byte object for xmission */
+ opal_dss.unload(wireup, (void**)&bo.bytes, &numbytes);
+ /* pack the byte object - zero-byte objects are fine */
+ bo.size = numbytes;
+ boptr = &bo;
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &boptr, 1, OPAL_BYTE_OBJECT))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(wireup);
+ OBJ_RELEASE(buf);
+ return;
+ }
+ /* release the data since it has now been copied into our buffer */
+ if (NULL != bo.bytes) {
+ free(bo.bytes);
+ }
+ OBJ_RELEASE(wireup);
+ } else {
+ flag = 0;
+ opal_dss.pack(buf, &flag, 1, OPAL_INT8);
}
- /* put it in a byte object for xmission */
- opal_dss.unload(wireup, (void**)&bo.bytes, &numbytes);
- /* pack the byte object - zero-byte objects are fine */
- bo.size = numbytes;
- boptr = &bo;
- if (ORTE_SUCCESS != (rc = opal_dss.pack(buf, &boptr, 1, OPAL_BYTE_OBJECT))) {
+
+ /* goes to all daemons */
+ sig = OBJ_NEW(orte_grpcomm_signature_t);
+ sig->signature = (orte_process_name_t*)malloc(sizeof(orte_process_name_t));
+ sig->signature[0].jobid = ORTE_PROC_MY_NAME->jobid;
+ sig->signature[0].vpid = ORTE_VPID_WILDCARD;
+ if (ORTE_SUCCESS != (rc = orte_grpcomm.xcast(sig, ORTE_RML_TAG_DAEMON, buf))) {
ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(wireup);
OBJ_RELEASE(buf);
+ OBJ_RELEASE(sig);
+ ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
return;
}
- /* release the data since it has now been copied into our buffer */
- if (NULL != bo.bytes) {
- free(bo.bytes);
- }
- OBJ_RELEASE(wireup);
- } else {
- flag = 0;
- opal_dss.pack(buf, &flag, 1, OPAL_INT8);
- }
-
- /* goes to all daemons */
- sig = OBJ_NEW(orte_grpcomm_signature_t);
- sig->signature = (orte_process_name_t*)malloc(sizeof(orte_process_name_t));
- sig->signature[0].jobid = ORTE_PROC_MY_NAME->jobid;
- sig->signature[0].vpid = ORTE_VPID_WILDCARD;
- if (ORTE_SUCCESS != (rc = orte_grpcomm.xcast(sig, ORTE_RML_TAG_DAEMON, buf))) {
- ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buf);
- OBJ_RELEASE(sig);
- ORTE_FORCED_TERMINATE(ORTE_ERROR_DEFAULT_EXIT_CODE);
- return;
}
- OBJ_RELEASE(buf);
/* notify that the vm is ready */
fprintf(stdout, "DVM ready\n");
OBJ_RELEASE(caddy);
diff --git a/orte/orted/orted_main.c b/orte/orted/orted_main.c
index fec5082..75906ab 100644
--- a/orte/orted/orted_main.c
+++ b/orte/orted/orted_main.c
@@ -230,6 +230,7 @@ int orte_daemon(int argc, char *argv[])
#if OPAL_ENABLE_FT_CR == 1
char *tmp_env_var = NULL;
#endif
+ opal_value_t val;
/* initialize the globals */
memset(&orted_globals, 0, sizeof(orted_globals));
@@ -460,6 +461,20 @@ int orte_daemon(int argc, char *argv[])
}
ORTE_PROC_MY_DAEMON->jobid = ORTE_PROC_MY_NAME->jobid;
ORTE_PROC_MY_DAEMON->vpid = ORTE_PROC_MY_NAME->vpid;
+ OBJ_CONSTRUCT(&val, opal_value_t);
+ val.key = OPAL_PMIX_PROC_URI;
+ val.type = OPAL_STRING;
+ val.data.string = orte_process_info.my_daemon_uri;
+ if (OPAL_SUCCESS != (ret = opal_pmix.store_local(ORTE_PROC_MY_NAME, &val))) {
+ ORTE_ERROR_LOG(ret);
+ val.key = NULL;
+ val.data.string = NULL;
+ OBJ_DESTRUCT(&val);
+ goto DONE;
+ }
+ val.key = NULL;
+ val.data.string = NULL;
+ OBJ_DESTRUCT(&val);
/* if I am also the hnp, then update that contact info field too */
if (ORTE_PROC_IS_HNP) {
@@ -668,7 +683,6 @@ int orte_daemon(int argc, char *argv[])
&orte_parent_uri);
if (NULL != orte_parent_uri) {
orte_process_name_t parent;
- opal_value_t val;
/* set the contact info into our local database */
ret = orte_rml_base_parse_uris(orte_parent_uri, &parent, NULL);
@@ -684,6 +698,8 @@ int orte_daemon(int argc, char *argv[])
val.data.string = orte_parent_uri;
if (OPAL_SUCCESS != (ret = opal_pmix.store_local(&parent, &val))) {
ORTE_ERROR_LOG(ret);
+ val.key = NULL;
+ val.data.string = NULL;
OBJ_DESTRUCT(&val);
goto DONE;
}
@@ -758,52 +774,76 @@ int orte_daemon(int argc, char *argv[])
/* get any connection info we may have pushed */
{
- opal_value_t *val = NULL, *kv;
+ opal_value_t *vptr = NULL, *kv;
opal_list_t *modex;
int32_t flag;
- if (OPAL_SUCCESS != (ret = opal_pmix.get(ORTE_PROC_MY_NAME, NULL, NULL, &val)) || NULL == val) {
- /* just pack a marker indicating we don't have any to share */
- flag = 0;
- if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
- ORTE_ERROR_LOG(ret);
- OBJ_RELEASE(buffer);
- goto DONE;
- }
- } else {
- /* the data is returned as a list of key-value pairs in the opal_value_t */
- if (OPAL_PTR == val->type) {
- modex = (opal_list_t*)val->data.ptr;
- flag = (int32_t)opal_list_get_size(modex);
+ if (opal_pmix.legacy_get()) {
+ if (OPAL_SUCCESS != (ret = opal_pmix.get(ORTE_PROC_MY_NAME, OPAL_PMIX_PROC_URI, NULL, &vptr)) || NULL == vptr) {
+ /* just pack a marker indicating we don't have any to share */
+ flag = 0;
if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
ORTE_ERROR_LOG(ret);
OBJ_RELEASE(buffer);
goto DONE;
}
- OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
- if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &kv, 1, OPAL_VALUE))) {
- ORTE_ERROR_LOG(ret);
- OBJ_RELEASE(buffer);
- goto DONE;
- }
- }
- OPAL_LIST_RELEASE(modex);
} else {
- opal_output(0, "VAL KEY: %s", (NULL == val->key) ? "NULL" : val->key);
- /* single value */
flag = 1;
if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
ORTE_ERROR_LOG(ret);
OBJ_RELEASE(buffer);
goto DONE;
}
- if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &val, 1, OPAL_VALUE))) {
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &vptr, 1, OPAL_VALUE))) {
ORTE_ERROR_LOG(ret);
OBJ_RELEASE(buffer);
goto DONE;
}
+ OBJ_RELEASE(vptr);
+ }
+ } else {
+ if (OPAL_SUCCESS != (ret = opal_pmix.get(ORTE_PROC_MY_NAME, NULL, NULL, &vptr)) || NULL == vptr) {
+ /* just pack a marker indicating we don't have any to share */
+ flag = 0;
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
+ ORTE_ERROR_LOG(ret);
+ OBJ_RELEASE(buffer);
+ goto DONE;
+ }
+ } else {
+ /* the data is returned as a list of key-value pairs in the opal_value_t */
+ if (OPAL_PTR == vptr->type) {
+ modex = (opal_list_t*)vptr->data.ptr;
+ flag = (int32_t)opal_list_get_size(modex);
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
+ ORTE_ERROR_LOG(ret);
+ OBJ_RELEASE(buffer);
+ goto DONE;
+ }
+ OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &kv, 1, OPAL_VALUE))) {
+ ORTE_ERROR_LOG(ret);
+ OBJ_RELEASE(buffer);
+ goto DONE;
+ }
+ }
+ OPAL_LIST_RELEASE(modex);
+ } else {
+ /* single value */
+ flag = 1;
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &flag, 1, OPAL_INT32))) {
+ ORTE_ERROR_LOG(ret);
+ OBJ_RELEASE(buffer);
+ goto DONE;
+ }
+ if (ORTE_SUCCESS != (ret = opal_dss.pack(buffer, &vptr, 1, OPAL_VALUE))) {
+ ORTE_ERROR_LOG(ret);
+ OBJ_RELEASE(buffer);
+ goto DONE;
+ }
+ OBJ_RELEASE(vptr);
+ }
}
- OBJ_RELEASE(val);
}
}
--
1.7.1
|
@kawashima-fj the previous patch plus the one below (which should land in master first) seems to fix the issue for me with PMIx v2.0.3. Can you please give it a try? I am now testing PMIx v1.2.5. diff --git a/orte/mca/odls/base/odls_base_default_fns.c b/orte/mca/odls/base/odls_base_default_fns.c
index 7b30f20..8d178a4 100644
--- a/orte/mca/odls/base/odls_base_default_fns.c
+++ b/orte/mca/odls/base/odls_base_default_fns.c
@@ -15,8 +15,8 @@
* All rights reserved.
* Copyright (c) 2011-2017 Cisco Systems, Inc. All rights reserved
* Copyright (c) 2013-2018 Intel, Inc. All rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
- * and Technology (RIST). All rights reserved.
+ * Copyright (c) 2014-2018 Research Organization for Information Science
+ * and Technology (RIST). All rights reserved.
* Copyright (c) 2017 Mellanox Technologies Ltd. All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* $COPYRIGHT$
@@ -169,38 +169,60 @@ int orte_odls_base_default_get_add_procs_data(opal_buffer_t *buffer,
wireup = OBJ_NEW(opal_buffer_t);
/* always include data for mpirun as the daemons can't have it yet */
val = NULL;
- if (OPAL_SUCCESS != (rc = opal_pmix.get(ORTE_PROC_MY_NAME, NULL, NULL, &val)) || NULL == val) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(wireup);
- return rc;
- } else {
- /* the data is returned as a list of key-value pairs in the opal_value_t */
- if (OPAL_PTR != val->type) {
- ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
- OBJ_RELEASE(wireup);
- return ORTE_ERR_NOT_FOUND;
- }
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, ORTE_PROC_MY_NAME, 1, ORTE_NAME))) {
+ if (opal_pmix.legacy_get()) {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(ORTE_PROC_MY_NAME, OPAL_PMIX_PROC_URI, NULL, &val)) || NULL == val) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(wireup);
return rc;
+ } else {
+ /* pack the name of the daemon */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, ORTE_PROC_MY_NAME, 1, ORTE_NAME))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ /* pack the URI */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &val->data.string, 1, OPAL_STRING))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ OBJ_RELEASE(val);
}
- modex = (opal_list_t*)val->data.ptr;
- numbytes = (int32_t)opal_list_get_size(modex);
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
+ } else {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(ORTE_PROC_MY_NAME, NULL, NULL, &val)) || NULL == val) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(wireup);
return rc;
- }
- OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ } else {
+ /* the data is returned as a list of key-value pairs in the opal_value_t */
+ if (OPAL_PTR != val->type) {
+ ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
+ OBJ_RELEASE(wireup);
+ return ORTE_ERR_NOT_FOUND;
+ }
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, ORTE_PROC_MY_NAME, 1, ORTE_NAME))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ modex = (opal_list_t*)val->data.ptr;
+ numbytes = (int32_t)opal_list_get_size(modex);
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(wireup);
return rc;
}
+ OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ }
+ OPAL_LIST_RELEASE(modex);
+ OBJ_RELEASE(val);
}
- OPAL_LIST_RELEASE(modex);
- OBJ_RELEASE(val);
}
/* if we didn't rollup the connection info, then we have
* to provide a complete map of connection info */
@@ -210,41 +232,66 @@ int orte_odls_base_default_get_add_procs_data(opal_buffer_t *buffer,
continue;
}
val = NULL;
- if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, NULL, NULL, &val)) || NULL == val) {
- ORTE_ERROR_LOG(rc);
- OBJ_RELEASE(buffer);
- return rc;
- } else {
- /* the data is returned as a list of key-value pairs in the opal_value_t */
- if (OPAL_PTR != val->type) {
- ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
- OBJ_RELEASE(buffer);
- return ORTE_ERR_NOT_FOUND;
- }
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
+ if (opal_pmix.legacy_get()) {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, OPAL_PMIX_PROC_URI, NULL, &val)) || NULL == val) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buffer);
OBJ_RELEASE(wireup);
return rc;
+ } else {
+ /* pack the name of the daemon */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buffer);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ /* pack the URI */
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &val->data.string, 1, OPAL_STRING))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buffer);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ OBJ_RELEASE(val);
}
- modex = (opal_list_t*)val->data.ptr;
- numbytes = (int32_t)opal_list_get_size(modex);
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
+ } else {
+ if (OPAL_SUCCESS != (rc = opal_pmix.get(&dmn->name, NULL, NULL, &val)) || NULL == val) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buffer);
- OBJ_RELEASE(wireup);
return rc;
- }
- OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
- if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ } else {
+ /* the data is returned as a list of key-value pairs in the opal_value_t */
+ if (OPAL_PTR != val->type) {
+ ORTE_ERROR_LOG(ORTE_ERR_NOT_FOUND);
+ OBJ_RELEASE(buffer);
+ return ORTE_ERR_NOT_FOUND;
+ }
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &dmn->name, 1, ORTE_NAME))) {
ORTE_ERROR_LOG(rc);
OBJ_RELEASE(buffer);
OBJ_RELEASE(wireup);
return rc;
}
+ modex = (opal_list_t*)val->data.ptr;
+ numbytes = (int32_t)opal_list_get_size(modex);
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &numbytes, 1, OPAL_INT32))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buffer);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ OPAL_LIST_FOREACH(kv, modex, opal_value_t) {
+ if (ORTE_SUCCESS != (rc = opal_dss.pack(wireup, &kv, 1, OPAL_VALUE))) {
+ ORTE_ERROR_LOG(rc);
+ OBJ_RELEASE(buffer);
+ OBJ_RELEASE(wireup);
+ return rc;
+ }
+ }
+ OPAL_LIST_RELEASE(modex);
+ OBJ_RELEASE(val);
}
- OPAL_LIST_RELEASE(modex);
- OBJ_RELEASE(val);
}
}
} |
Since Open MPI v3.1.0 has not been released yet, should we simply drop the |
@ggouaillardet OK. I'll try. (maybe next week) |
@ggouaillardet I had time to try your patch (with OMPI 3.1.0rc3 + PMIx 2.0.3) today. The original issue was an abend of
|
How did you test that? I will try again too. |
Manually built using gcc 4.4.7 on Fujitsu PRIMEHPC FX100 (sparc64):
Run on Fujitsu PRIMEHPC FX100 (sparc64):
I didn't try on x86. |
@kawashima-fj FWIW, I cannot reproduce this on x86_64 nor on FX10 (I could only test up to 24 nodes and 8 tasks per node, though) |
@ggouaillardet Something was wrong in my previous try. Today I refreshed my installation and built Open MPI cleanly. It worked fine. |
@kawashima-fj thanks, I will push the fix to master and then PR vs the |
Do we know if v3.0.x is impacted by this? |
I do not think so. You can grep |
As I wrote earlier, this patch is necessary but not sufficient for PMIx v1.2. The first issue I faced is in: OBJ_CONSTRUCT(&val, opal_value_t);
val.key = OPAL_PMIX_PROC_URI;
val.type = OPAL_STRING;
val.data.string = orte_process_info.my_daemon_uri;
if (OPAL_SUCCESS != (ret = opal_pmix.store_local(ORTE_PROC_MY_DAEMON, &val))) {
This fails since there is no opal_pmix1_jobid_trkr_t for the daemon's jobid yet. There is no such issue with the older code path, which instead calls:
orte_rml.set_contact_info(orte_process_info.my_daemon_uri);
FWIW, I tried the following hack to force the creation of the missing objects, but I ended up with a deadlock when using 2 MPI tasks or more: diff --git a/opal/mca/pmix/ext1x/pmix1x_client.c b/opal/mca/pmix/ext1x/pmix1x_client.c
index 480b206..1e663dd 100644
--- a/opal/mca/pmix/ext1x/pmix1x_client.c
+++ b/opal/mca/pmix/ext1x/pmix1x_client.c
@@ -232,8 +232,16 @@ int pmix1_store_local(const opal_process_name_t *proc, opal_value_t *val)
}
}
if (NULL == job) {
- OPAL_ERROR_LOG(OPAL_ERR_NOT_FOUND);
- return OPAL_ERR_NOT_FOUND;
+ opal_list_t info;
+ opal_value_t * val;
+ job = OBJ_NEW(opal_pmix1_jobid_trkr_t);
+ (void)opal_snprintf_jobid(job->nspace, PMIX_MAX_NSLEN, proc->jobid);
+ job->jobid = proc->jobid;
+ opal_list_append(&mca_pmix_ext1x_component.jobids, &job->super);
+ /* force PMIx to create the internal namespace */
+ OBJ_CONSTRUCT(&info, opal_list_t);
+ pmix1_get(proc, NULL, &info, &val);
+ OBJ_DESTRUCT(&info);
}
(void)strncpy(p.nspace, job->nspace, PMIX_MAX_NSLEN);
p.rank = proc->vpid;
At this stage, I can only see two options.
So this is both a technical issue (@rhc54 could you please share some insights?) and an RM issue (@bwbarrett any opinion on this?) |
@ggouaillardet, pretend I have like 10 minutes a week to spend on this issue. Can you summarize in an actionable way? Earlier, you asked about removing support for something mid-version. That's clearly not ok, but it's unclear what broke and when (did it ever work in the series, or were we claiming support for something broken)? |
@bwbarrett so here is the status (short version)
So I can see two ways of looking at this:
Long(er) story
My current understanding is that these subroutines are still needed if we want to support
Bottom line, |
3.1.0 has not been released. However, as a minor version bump, it shouldn’t remove features from 3.0.x. So what does 3.0.x look like? |
|
I missed the discussion on this ticket from the call. Were there action items to move forward? |
@ggouaillardet Did these fixes get into the release branches? If so, can we close this? |
PMIx v1.2 is now outside the support window. |
Background information
What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)
Open MPI repo revision: v3.1.0rc3
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Please describe the system on which you are running
Operating system/version: CentOS Linux release 7.4.1708 (Core)
Computer hardware: Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz
Network type: Infiniband controller: Mellanox Technologies MT27700 Family [ConnectX-4]
Details of the problem
mpirun seems to have problems starting workers. I run mpirun ls from within a SLURM allocation:
The above works for v3.0.1. Note that I'm compiling against a locally installed PMIx 1.2.3. Is this the problem?