Can't run horovod with latest nightly wheel #17292
Comments
Saw this on email list and got curious... Looks like problem is probably this commit: Basically
But as of 3 days ago MXAPIHandleException is not an inline function anymore, meaning that consumers need to link against c_api_error.o to get the symbol. I don't know enough about the build system that produces these nightly builds (does it use the CMake one or the Makefile one?), but my hunch is that either c_api_error.o is not getting built into libmxnet.so, or it is, but it is presented to the linker before MXAPIHandleException is used, so the symbol isn't included in libmxnet.so.
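One way to test this hunch is to list the symbols that the shared library actually defines and look for the one in question. The following is a minimal sketch, assuming GNU `nm` is on the PATH; the helper names `defined_symbols_matching` and `library_exports` are hypothetical and not part of MXNet or Horovod. It matches on a substring because a C++ symbol may appear under a mangled name.

```python
import subprocess


def defined_symbols_matching(nm_output, needle):
    """Return names of defined symbols in `nm` output that contain `needle`.

    Each line is expected to end with "<type> <name>"; undefined
    references (type "U") are skipped because they are not exported.
    """
    matches = []
    for line in nm_output.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # blank or malformed line
        sym_type, name = parts[-2], parts[-1]
        if sym_type == "U":
            continue  # referenced by the library, but not defined in it
        if needle in name:
            matches.append(name)
    return matches


def library_exports(lib_path, needle):
    """Run `nm -D --defined-only` on lib_path and filter for `needle`.

    Assumes GNU nm; raises CalledProcessError if nm fails.
    """
    result = subprocess.run(
        ["nm", "-D", "--defined-only", lib_path],
        capture_output=True,
        text=True,
        check=True,
    )
    return defined_symbols_matching(result.stdout, needle)
```

Calling `library_exports("/path/to/libmxnet.so", "MXAPIHandleException")` (the path is a placeholder) and getting an empty list back would be consistent with the symbol missing from the library, though it would not say which of the two causes above is responsible.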
@stephenrawls currently a Makefile build is used. You can find it at https://github.com/apache/incubator-mxnet/tree/master/tools/staticbuild. We are working on migrating to the CMake build in the future though.
Thanks @stephenrawls for the analysis.
So to summarize, the problem is not that Horovod requires
@szha I will create a PR to fix this.
Description
Cannot run Horovod with the latest nightly wheel. This could mean the 1.6 release has the same problem. The last working nightly wheel was from 12/30/2019.
Error Message
To Reproduce
Steps to reproduce
What have you tried to solve it?
Environment