Error invoking worker from module #18166

Closed
malmaud opened this issue Aug 21, 2016 · 10 comments · Fixed by #22589
Labels
modules parallelism Parallel or distributed computation

Comments

@malmaud
Contributor

malmaud commented Aug 21, 2016

I'm pretty confused about this. Am I doing something wrong here?

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> module M
       function f()
       remotecall_fetch(()->1, 2)
       end
       end
M

julia> M.f()
ERROR: On worker 2:
UndefVarError: M not defined
 in deserialize at ./serialize.jl:602
 in handle_deserialize at ./serialize.jl:581
 in deserialize at ./serialize.jl:541
 in deserialize_datatype at ./serialize.jl:822
 in handle_deserialize at ./serialize.jl:571
 in deserialize_msg at ./multi.jl:120
 in message_handler_loop at ./multi.jl:1317
 in process_tcp_streams at ./multi.jl:1276
 in #538 at ./event.jl:68
 in #remotecall_fetch#526(::Array{Any,1}, ::Function, ::Function, ::Base.Worker) at ./multi.jl:1070
 in remotecall_fetch(::Function, ::Base.Worker) at ./multi.jl:1062
 in #remotecall_fetch#529(::Array{Any,1}, ::Function, ::Function, ::Int64) at ./multi.jl:1080
 in f() at ./REPL[2]:3
@simonster
Member

It seems you need @everywhere before the module definition so that it is loaded on the worker:

julia> addprocs(1)
1-element Array{Int64,1}:
 2

julia> @everywhere module M
       function f()
       remotecall_fetch(()->1, 2)
       end
       end

julia> M.f()
1

@yuyichao
Contributor

Ref #17435 (comment) for cases where @everywhere module ... end doesn't work.

@kshyatt kshyatt added parallelism Parallel or distributed computation modules labels Aug 22, 2016
@amitmurthy
Contributor

One workaround, if you absolutely must do this, is:

module M
  function f()
    eval(Main, :(remotecall_fetch(()->1, 2)))
  end
end

There was one case where addprocs was being called from an external module/package and ran into this same issue. A workaround suggested there was to execute the module loading on the worker explicitly under Main. Assume M is in a file M.jl:

module M
  function f()
    w = addprocs(1)[1]
    eval(Main, :(remotecall_wait(include, $w, "M.jl")))
    remotecall_fetch(()->1, w)
  end
end
export M

The underlying cause seems to be that the closure is defined under M, so the worker cannot deserialize it unless M is defined there too. @JeffBezanson - Do anonymous functions necessarily have to be associated with the enclosing module? Or do we need to try to pull in undefined modules from the master process automatically?
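To make the point about the enclosing module concrete, here is a minimal sketch that needs no workers. It assumes Julia 1.x, where `parentmodule` accepts a type; the names `make` and `closure` are only illustrative and do not come from the report.

```julia
module M
# The anonymous function is created inside M, so the type Julia
# generates for it is owned by M rather than by Main.
make() = () -> 1
end

closure = M.make()

# Ask which module owns the closure's type.
owner = parentmodule(typeof(closure))
println(owner)
```

Since the owning module is M, a worker that receives the serialized closure has to look up M first, which would explain the `UndefVarError: M not defined` in the original trace.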

@amitmurthy
Contributor

Another option is to introduce keyword args addprocs(....; load=[list_of_files], packages=[list_of_package_names]) which will include files or load packages respectively on the newly launched workers.

@jla497

jla497 commented Nov 13, 2016

@amitmurthy
Hi, I tried using addprocs(...;load=[myfile.jl]) and it returned LoadError: ArgumentError: Invalid keyword argument load.

@amitmurthy
Contributor

@jla497 that was a suggestion, not yet available.

@amitmurthy
Contributor

@vtjnash how is #22589 related to this issue?

The original reported problem still exists.

We require that modules be loaded on all workers, and we can mark this issue as "won't fix", but how is that related to #22589?

@vtjnash
Member

vtjnash commented Jul 24, 2017

#22589 implements and documents a corrected version of the above workaround as a replacement for remotecall_fetch. Unlike remotecall_fetch, it does not require the module to be defined on all workers.

@amitmurthy
Contributor

remotecall_eval needs to be exported and a doc reference added to the manual.

Also, I don't think the workaround should be suggested - the recommendation should be to load the module on all workers when anonymous functions are shipped and executed remotely.

@vtjnash
Member

vtjnash commented Jul 24, 2017

It's just a helper function. As it turned out, @everywhere already implemented the functionality and just needed an extra argument to expose it. PRs to improve docs are always welcome though.
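For readers landing here later, a rough sketch of both forms mentioned above. It assumes Julia 1.x, where this code lives in the `Distributed` standard library and `remotecall_eval` is reachable as `Distributed.remotecall_eval` (it is not exported). The names `f`, `helper`, `result` and `answer` are illustrative only.

```julia
using Distributed

w = addprocs(1)[1]   # id of the newly started worker

module M
using Distributed
# Ship an expression instead of a closure. An Expr is plain data, so the
# worker can evaluate it in Main without M being defined there.
f(pid) = Distributed.remotecall_eval(Main, pid, :(1 + 1))
end

result = M.f(w)

# @everywhere also accepts a list of process ids as its first argument,
# so a definition can be sent to selected workers only.
@everywhere [w] helper() = 42
answer = Distributed.remotecall_eval(Main, w, :(helper()))

rmprocs(w)
```

Note that M is still defined only on the master in this sketch; nothing that belongs to M is ever serialized.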

7 participants