Perhaps call multiprocessing.set-executable at startup #1786

Closed
Kodiologist opened this issue Jun 18, 2019 · 7 comments · Fixed by #2337

@Kodiologist
Member

As mentioned elsewhere, rightly or wrongly, running hy sets sys.executable to the Hy startup script rather than the underlying Python executable. One surprising consequence of this is that importing sklearn, for example, yields the message

Traceback (most recent call last):
  File "/usr/local/bin/hy", line 10, in <module>
    sys.exit(hy_main())
  File "<string>", line 1, in <module>
NameError: name 'hyx_from' is not defined

from a child process, and then leaves that child as a zombie. The error is from a scikit-learn dependency, joblib, which tries to call /usr/local/bin/hy -c 'from multiprocessing.semaphore_tracker import main;main(3)' when it's imported.

One can avoid this problem by calling multiprocessing.set_executable to select a real Python executable, so perhaps Hy itself should do this.
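As a rough sketch of that workaround, written in Python for clarity: the helper name `point_multiprocessing_at_python` and the fallback to `sys.executable` are assumptions for illustration, not anything Hy provides. It only changes which program multiprocessing launches for child processes.

```python
import multiprocessing
import multiprocessing.spawn
import os
import shutil
import sys


def point_multiprocessing_at_python(candidate="python3"):
    """Tell multiprocessing to launch children with a real Python interpreter.

    Assumption: `candidate` is on PATH; otherwise fall back to sys.executable.
    """
    path = shutil.which(candidate) or sys.executable
    multiprocessing.set_executable(path)
    return path


chosen = point_multiprocessing_at_python()
# get_executable() may return bytes on some Python versions, so decode it
# before printing or comparing.
print(os.fsdecode(multiprocessing.spawn.get_executable()))
```

Under Hy the same call would be made before importing any library that starts helper processes at import time.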

@wongjoel

wongjoel commented Aug 28, 2019

This issue also affects dask distributed when trying to start a LocalCluster (tested on hy 0.17.0+78.g645d2e0)

(import [dask.distributed [Client LocalCluster wait]])

(defmain [&rest args]
  (print "Start")
  (setv cluster (LocalCluster))
  (print "End"))
; NameError: name 'hyx_from' is not defined

However, if you select a real Python executable with multiprocessing.set_executable, you run into an invalid-syntax error, since it appears dask passes the Hy source file directly to Python.

(import shutil multiprocessing)
(import [dask.distributed [Client LocalCluster wait]])

(defmain [&rest args]
  (print "Start")
  (setv python_path (.which shutil "python3"))
  (.set_executable multiprocessing python_path)
  (setv cluster (LocalCluster))
  (print "End"))
; SyntaxError: invalid syntax
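To illustrate where a SyntaxError like this would come from, here is a minimal Python sketch. The helper `parses_as_python` is hypothetical, and the Hy fragment only resembles the script above. It shows that the plain Python compiler rejects Hy source, which is what happens if a child process is handed the .hy file as though it were Python.

```python
def parses_as_python(source, filename="<example>"):
    """Return True if CPython can compile `source` as Python code."""
    try:
        # compile() only parses; it does not run the code.
        compile(source, filename, "exec")
    except SyntaxError:
        return False
    return True


# A fragment of Hy source, similar to the script above.
hy_source = '(import shutil multiprocessing)\n(print "Start")\n'

print(parses_as_python(hy_source))  # False
print(parses_as_python("import shutil, multiprocessing\nprint('Start')\n"))  # True
```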

@allison-casey
Contributor

I can't recreate the issue with sklearn, but I can with dask. However, setting the multiprocessing executable to python3 seems to actually work now. @Kodiologist is the sklearn thing still an issue for you? We might just want to call multiprocessing.set_executable with python on startup instead.

@Kodiologist
Member Author

Nope, I no longer see that error from importing sklearn or joblib. It's probably from a change to joblib.

@allison-casey
Contributor

Since dask still crashes, and I'm sure other Python packages that use multiprocessing extensively do too, maybe we should set the multiprocessing executable to sys.executable before we set sys.executable to hy. That would make it work more sensibly for third-party packages, and it's something we can note as a behaviour that Hy users using multiprocessing should opt in to?
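The ordering being proposed could look roughly like this in Python. This is a sketch only: `hypothetical_startup` and the launcher path are made up for illustration, and this is not Hy's actual startup code.

```python
import multiprocessing
import multiprocessing.spawn
import os
import sys


def hypothetical_startup(launcher_path):
    """Sketch of the proposed ordering; not Hy's actual startup code."""
    real_python = sys.executable
    # Step 1: child processes should be launched with the real interpreter.
    multiprocessing.set_executable(real_python)
    # Step 2: only now overwrite sys.executable with the launcher script.
    sys.executable = launcher_path
    return real_python


saved = sys.executable
real_python = hypothetical_startup("/usr/local/bin/hy")
# multiprocessing still points at the interpreter, not the launcher.
# get_executable() may return bytes on some Python versions, hence fsdecode.
assert os.fsdecode(multiprocessing.spawn.get_executable()) == real_python
sys.executable = saved  # undo the demo's change
```

This only controls which program multiprocessing launches; it does not change what source file the child is asked to run.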

@mjreed-wbd

mjreed-wbd commented Aug 25, 2021

It looks like Hy saves the original value of sys.executable in hy.sys_executable, so you can use that to set it correctly in multiprocessing or wherever. But then it blows up when the new process tries to parse the original file containing the Hy code. I don't see any way to use multiprocessing from Hy.

@Kodiologist
Member Author

> But then it blows up

Please provide a reproducible example. Regarding "the new process tries to parse the original file containing the Hy code", that sounds like Windows Python's emulation of fork; are you using Windows?

@Kodiologist
Member Author

An updated example looks like this:

(import dask.distributed [LocalCluster])

(defn main []
  (LocalCluster))

(when (= __name__ "__main__")
  (main))

If I run this with Hy master, I get NameError: name 'from' is not defined, as before. Prepending (import multiprocessing) (multiprocessing.set-executable hy.sys-executable) doesn't work; it leads to Python trying to interpret this file as Python instead of Hy. However, using a Python wrapper script like

import hy

if __name__ == '__main__':
    import dask_ex
    dask_ex.main()

works fine, without multiprocessing.set-executable. This isn't the only issue you can work around by using a Python wrapper (or hy2pying all your Hy), so probably that's what we should tell people to do in this and similar situations.
