Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set default exeflags to start procs using active project #42089

Closed
wants to merge 2 commits into from

Conversation

chriselrod
Copy link
Contributor

@chriselrod chriselrod commented Sep 2, 2021

Maybe a breaking change, also probably not the best way to implement this. Someone setting exeflag may find it surprising that doing so suddenly no longer sets the project to be the active one.

However, I think most people will find this to be the expected behavior, so I'd like at least some discussion on this behavior. It can be surprising to addprocs and suddenly have using error or load different versions than from your current session.

Perhaps there already was some discussion on the desired behavior here?

EDIT:
I tested this locally. I'm willing to bet that this won't work when the procs don't share a filesystem, unless they all happen to have a project at the same path.

So maybe I should edit addprocs(np::Integer, kwargs...), having it forward to something like addprocs(manager; exeflags = get(kwargs, :exeflags, "--project=$(Base.active_project())"), kwargs...) instead, to set the exeflags this way).

@fredrikekre
Copy link
Member

See #28781

@jpsamaroo
Copy link
Member

+1 to this. The fact that we don't already set this by default is a large inconvenience to the average user, who's typically either running purely local processes, or running remote processes on a shared filesystem. The current behavior often creates situations where the desired packages are loaded on worker 1, but all other workers load a different set of packages (with potentially different structures and functions), causing serialization errors or odd bugs, especially when more complicated distributed computing packages (like Dagger.jl) are used.

@ericphanson
Copy link
Contributor

who's typically either running purely local processes, or running remote processes on a shared filesystem

Just to say this fix also works for a setup where the manager and workers all have copies of the same filesystem (but are not actually shared). That might sound unusual but it comes up when you have remote workers + manager all using the same docker image (e.g. with K8sClusterManagers.jl).

@ViralBShah ViralBShah added the triage This should be discussed on a triage call label Oct 19, 2021
@ViralBShah
Copy link
Member

I suppose we either accept this or close.

@JeffBezanson
Copy link
Member

👍 Can't see a reason not to do this.

@JeffBezanson JeffBezanson removed the triage This should be discussed on a triage call label Oct 21, 2021
@ViralBShah ViralBShah added the parallelism Parallel or distributed computation label Oct 22, 2021
@chriselrod
Copy link
Contributor Author

Good to merge?

@fredrikekre
Copy link
Member

I think it should replicate the full load path instead.

@vchuravy
Copy link
Member

@alanedelman just ran into this, and the worker processes and the main process had a completely separate environment that caused weird miss-behavior.

@chriselrod can you rebase this and address @fredrikekre comment?

@tkf
Copy link
Member

tkf commented Nov 25, 2021

I guess "full load path" means evaluating load_path_setup_code() on the remote workers on startup? It typically contains something like

julia> print(Base.load_path_setup_code())
append!(empty!(Base.DEPOT_PATH), ["/home/arakaki/.julia", "/home/arakaki/opt/julia/julia-1.6.0/local/share/julia", "/home/ar
akaki/opt/julia/julia-1.6.0/share/julia"])
append!(empty!(Base.DL_LOAD_PATH), String[])
append!(empty!(Base.LOAD_PATH), ["/home/arakaki/.julia/environments/v1.6/Project.toml", "/home/arakaki/opt/julia/julia-1.6.0
/share/julia/stdlib/v1.6"])
ENV["JULIA_LOAD_PATH"] = "/home/arakaki/.julia/environments/v1.6/Project.toml:/home/arakaki/opt/julia/julia-1.6.0/share/juli
a/stdlib/v1.6"
Base.ACTIVE_PROJECT[] = nothing

i.e., also replicate paths like DEPOT_PATH too.

fredrikekre added a commit that referenced this pull request Feb 24, 2022
Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes #28781, and closes #42089.
fredrikekre added a commit that referenced this pull request Feb 25, 2022
Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes #28781, and closes #42089.
@chriselrod chriselrod deleted the addprocsactiveproject branch February 25, 2022 13:03
staticfloat pushed a commit to JuliaCI/julia-buildkite-testing that referenced this pull request Mar 2, 2022
…ng#43270)

Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes JuliaLang#28781, and closes JuliaLang#42089.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
…ng#43270)

Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes JuliaLang#28781, and closes JuliaLang#42089.
Keno pushed a commit that referenced this pull request Jun 5, 2024
Local workers now inherit the package environment of the main process,
i.e. the active project, LOAD_PATH, and DEPOT_PATH. This behavior
can be overridden by passing the new `env` keyword argument, or by
passing `--project` in the `exeflags` keyword argument.

Fixes #28781, and closes #42089.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
parallelism Parallel or distributed computation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants