Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set JULIA_PROJECT="@." for kernel #820

Merged
merged 3 commits into from
Mar 14, 2019

Conversation

davidanthoff
Copy link
Contributor

This is one way to handle #750.

What this would do is that whenever you open a notebook, if there is a Project.toml in the folder of the notebook, or in one of its parent folders, that environment will be the active environment for the kernel of that notebook.

My understanding right now is that we are not going to somehow embed env info in the notebook itself, so this might be the next best thing to get around these annoying using Pkg; pkg"activate ." that one needs now at the top of each notebook if one is using environments.

This issue came up in the context of binder and jupyterhub/repo2docker#612.

Pinging a couple of folks that I think should opine on this: @StefanKarpinski, @KristofferC and @stevengj.

@StefanKarpinski
Copy link
Member

This seems sensible since starting a notebook in a directory already implicitly works in that directory. The reason that julia itself doesn't do this by default is that merely starting julia in a directory is not a sufficiently strong signal from the user that they trust the content of that directory and want to let it control what code they run, but it seems like starting a notebook server in a given directory is probably a strong enough indication from the user that they want to do just that.

@KristofferC
Copy link
Member

Yeah, I've also wanted this many times and I think it makes a lot of sense. 👍

@stevengj
Copy link
Member

stevengj commented Mar 9, 2019

Should this be done with a command line argument rather than an environment variable?

@davidanthoff
Copy link
Contributor Author

Should this be done with a command line argument rather than an environment variable?

I changed it to a command line argument. I think both versions would work identically, but I agree it is a little bit cleaner.

@stevengj
Copy link
Member

stevengj commented Mar 9, 2019

Environment variables leak into subprocesses.

@stevengj
Copy link
Member

stevengj commented Mar 9, 2019

What happens if there is a file named @.?

@davidanthoff
Copy link
Contributor Author

Environment variables leak into subprocesses.

Would that be a problem? Maybe we would even want that?

What happens if there is a file named @.?

As far as I can tell from the code in base it will never look for a directory (it would have to be a directory, not a file) with that name, because that string automatically triggers the case where it will look for a Project.toml in the current and parent dirs. @StefanKarpinski or @KristofferC probably know better, though.

@davidanthoff
Copy link
Contributor Author

@StefanKarpinski and @KristofferC, is my characterization of what happens when there is a file named @. correct?

@stevengj Beyond that question, anything else we need to do before we can merge this? I'd love to get the whole Jupyter Binder story sorted out, but that is currently waiting for a decision here.

@StefanKarpinski
Copy link
Member

StefanKarpinski commented Mar 11, 2019

If there's a file or directory named @. it will be ignored since @. is expanded first. If you really want to refer to a file named @. you can use ./@. which means the same thing. It seems likely to me that one does want subprocesses to inherit the @. meaning, but I'm not 100% sure. Another option here is to expand the pwd and set JULIA_PROJECT to an explicit absolute path; that is less likely to lead to unintended behavior in subprocesses, e.g. if the parent had called cd before spawning the child.

@davidanthoff
Copy link
Contributor Author

Thanks!

Another option here is to expand the pwd and set JULIA_PROJECT to an explicit absolute path

I think that would require a quite different approach from the two that I tried here: this PR here is changing the global kernel configuration, so we can't really write any absolute project path into that.

I also actually really like that @. also looks for a Project.toml in a parent directory. In my mind the typical use case is that one has a repo with a Project.toml in the root, and then a src folder where there are a number of IJulia notebooks, and it seems really great that they pick up the environment from their parent folder. That whole model also works really well with the Jupyter Binder design.

I think in light of all of this I would suggest that we just stick with the current approach in the PR here with a command line flag. If it turns out that this is causing trouble for sub-processes we can still revisit whether it would be better to use JULIA_PROJECT in the future. It also just doesn't seem super clear to me what the ideal approach here is, and using a command line flag approach seems more conservative for now.

@StefanKarpinski
Copy link
Member

That makes sense to me!

@NHDaly
Copy link
Member

NHDaly commented Mar 13, 2019

@davidanthoff should you remove RFC from the title? I think this is good to review and merge!

@davidanthoff davidanthoff changed the title RFC Set JULIA_PROJECT="@." for kernel Set JULIA_PROJECT="@." for kernel Mar 13, 2019
@davidanthoff
Copy link
Contributor Author

Good point! @stevengj would be great if you could merge this, I think this is ready!

Copy link
Member

@fredrikekre fredrikekre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM but would be good to document this in the README, and link to the section about installing additional kernels for users that want to override this.

@davidanthoff
Copy link
Contributor Author

Done, I updated the README.

@fredrikekre fredrikekre merged commit bd9b39e into JuliaLang:master Mar 14, 2019
@davidanthoff davidanthoff deleted the env-handling branch March 14, 2019 14:11
@davidanthoff
Copy link
Contributor Author

Thanks! Could we also tag a new release?

@jlperla
Copy link

jlperla commented Mar 27, 2019

The one thing that was unclear here was whether it was possible to also put a manifest.toml in there, instead of just a project.toml?

@davidanthoff
Copy link
Contributor Author

If there is a Manifest.toml aound as well, it should be picked up automatically with this.

@NHDaly
Copy link
Member

NHDaly commented Apr 13, 2019

So, one complication that we've seen because of this. We have IJulia in our project's Project.toml file, so when users do Pkg.instantiate(), they pick it up. This then installs a Julia kernel into jupyter, with --project=@. set. But that means that if they try to open a julia kernel from jupyter started in any other directory, it crashes because it can't find IJulia! D:

Our solution, for now, is that in our project intialization script we're going to also just make sure and install IJulia into the --shared environment without asking, so that they can always run a julia kernel..

Before, we didn't have this problem because we were setting JULIA_PROJECT=/path/to/project before starting jupyter, but i guess the --project=@. flag takes precedence, it seems.

I'm not sure any of this is a problem, just wanted to report our observations!

@StefanKarpinski
Copy link
Member

Maybe stick IJulia in the load path somewhere?

@NHDaly
Copy link
Member

NHDaly commented Apr 13, 2019

Maybe stick IJulia in the load path somewhere?

That's an interesting idea. Is that something we could do in the kernel installation? Like add LOAD_PATH=$dirname(pathof(IJulia))) or whatever in the kernel file?

@KristofferC
Copy link
Member

You can set environment variables in the kernel file. If LOAD_PATH[1] is a path to IJulia, it will always be loaded from there. If it is instead is the last entry of LOAD_PATH it will be loaded from there if it wasn't found some other way.

@stevengj
Copy link
Member

stevengj commented Apr 15, 2019

Could the kernel.jl file (remember, Julia is launched with julia .... /path/to/IJulia/src/kernel.jl) simply pushfirst!(LOAD_PATH, "@__DIR__/..") before running import IJulia?

@NHDaly
Copy link
Member

NHDaly commented Apr 15, 2019

Could the kernel.jl file (remember, Julia is launched with julia .... /path/to/IJulia/src/kernel.jl) simply pushfirst!(LOAD_PATH, "@__DIR__/..") before running import IJulia?

I think this is a great approach. It's also much more proactive than the WARNING that's currently in there:

let
ijulia_kernel_file = joinpath(dirname(pathof(IJulia)), "kernel.jl")
this_kernel_file = @__FILE__
if ijulia_kernel_file != this_kernel_file
@warn "this kernel.jl is different from the one provided by IJulia" this_kernel_file ijulia_kernel_file
end
end

The one issue I can see is that if you Pkg.update("IJulia"), it would then keep using the old version unless you manually re-triggered installation, via either Pkg.build("IJulia") or IJulia.installkernel(). Whereas now, if you update IJulia it will use the newer version but print that warning.

From what I understand, though, by putting it as the last entry of LOAD_PATH, as @KristofferC explains, we can get the best of both worlds by respecting any updates in the loaded Manifest files, but still working correctly if no IJulia is present in the current environment.

I like it. I'll open a PR with this plan! Thanks for the help everyone! :)

@NHDaly
Copy link
Member

NHDaly commented Apr 15, 2019

Hmm, so I opened #834 which does indeed resolve loading IJulia. However, now it just errors when loading IJulia's dependencies, which in retrospect I should have seen coming...:

[ Info: Recompiling stale cache file /Users/nathan.daly/.julia/compiled/v1.1/IJulia/nfu7T.ji for IJulia [7073ff75-c697-5162-941a-fcdaad2a7d2a]
ERROR: LoadError: ArgumentError: Package ZMQ [c2297ded-f4af-51ae-bb23-16f91089e4e1] is required but does not seem to be installed:
 - Run `Pkg.instantiate()` to install all recorded dependencies.
...
in expression starting at /Users/nathan.daly/.julia/dev/IJulia/src/IJulia.jl:36
ERROR: LoadError: Failed to precompile IJulia [7073ff75-c697-5162-941a-fcdaad2a7d2a] to /Users/nathan.daly/.julia/compiled/v1.1/IJulia/nfu7T.ji.
...

Maybe the right answer here is just better documentation / error messages so that it's clear from the error that the user just needs to also install IJulia in their top-level environment?

@marius311
Copy link
Contributor

marius311 commented Apr 15, 2019

I appreciate the motivation and the amount of discussion surrounding this, but just thought I'd offer my opinion on why I think this is a bad idea:

  1. If I've got Jupyterlab going and am opening a bunch of different notebooks, even though they're all in the same "window" they may now all silently be in differently environments depending on whether certain files are present way up in the directory tree, possibly quite far away from the actual notebook file. (I say "silently" because unless I Pkg.status() I'll never see I'm not in the root environment). If I'm a less experienced user I may run a lot into "I thought I just added that package, why isn't it found?" due to not knowing I added it but in a different environment.

  2. More importantly, it's just not the default with Julia itself, so its extra confusing that IJulia behaves one way and the REPL another.

So although it does seem useful as an option, due to the above reasons I have this turned off, and IMO this shouldn't be a default. When needed, I can just use JULIA_PROJECT, which in general seems more flexible since its not hard-coded into the kernel.

@marius311
Copy link
Contributor

Perhaps even more to the point, I'd be closer to agreeing with @StefanKarpinski that

The reason that julia itself doesn't do this by default is that merely starting julia in a directory is not a sufficiently strong signal from the user that they trust the content of that directory and want to let it control what code they run, but it seems like starting a notebook server in a given directory is probably a strong enough indication from the user that they want to do just that.

but that isn't what this PR does. Instead, it activates the nearest parent Project.toml of a given notebook, regardless of where you've launched the Jupyter server.

My use pattern, which I'd guess is pretty typical, is I launch a single Jupyterlab server from my home directory which I use throughout my session (sometimes remote). In doing so I definitely did not mean to give any indication that I want the package activation behavior to be different than base Julia.

@NHDaly
Copy link
Member

NHDaly commented Apr 25, 2019

The reason that julia itself doesn't do this by default is that merely starting julia in a directory is not a sufficiently strong signal from the user that they trust the content of that directory and want to let it control what code they run, but it seems like starting a notebook server in a given directory is probably a strong enough indication from the user that they want to do just that.

but that isn't what this PR does. Instead, it activates the nearest parent Project.toml of a given notebook, regardless of where you've launched the Jupyter server.

I think this is a really good point that @marius311 makes.

@davidanthoff, I think we made a mistake in the implementation of this PR. The reason we opened this PR was so that standard IJulia and MyBinder would have the same behavior, and in jupyterhub/repo2docker#612 we changed MyBinder to automatically start all notebooks with the Project.toml from the top-level of the repo. That is definitely not the behavior we implemented in this PR, as @marius311 points out: instead each notebook you open runs in its own local Project, irrespective of any "top-level" project.

Another problem I've run into since we made this change is that the commandline flag --project= takes precedence over the environment variable JULIA_PROJECT=, so we cannot even override this behavior when we want to without rewriting the kernel file.


I've been thinking about this some, and here are a couple solutions I see:

  1. Revert this PR. This will cause running IJulia with a local project environment #750 to start happening again, but we can just do a better job informing users to set JULIA_PROJECT=. before starting jupyter if they want to access their local Project.
  2. Actually implement the behavior expected by @marius311: have all kernels always load the project where Jupyter was launched from (or a parent dir). Here is the best way I can think of to do that:
    • From what I can tell, the PWD environment variable is set to the path where you launched jupyter from (at least on my mac). We can have the kernel.jl file always call Pkg.activate(Base.current_project(ENV["PWD")) to set the equivalent of --project=@. for the directory where jupyter was launched from.
    • And ideally we would check and only do that if JULIA_PROJECT isn't in ENV.
  3. Do a better version of Fix IJulia not found when installed, but not in user's Environment #834, so that at least you will be always be able to load IJulia, even if nothing else is fixed.:
    1. Add the IJulia directory containing kernel.jl to LOAD_PATH, but also add all of IJulia's dependencies as well (via dirname(pathof(...)))
    2. or maybe we can just add the path to the current project.toml to the LOAD_PATH when installing the kernel.jl. But then if that project is changed this would break...
    3. or maybe we should just install a Project.toml file into the kernelspec directory with just the current version of IJulia installed? I like this approach.

I actually think I would vote for a combination of proposals 2 and some version of 3 (I like 3.iii).

I like 2, because it implements what I think is closer to the expected behavior: launching jupyter from a Project directory causes all kernels to run in that project directory (vs the behavior before this PR, where all kernels ran in v1.1). I like 3, because it fixes #750: even if you install IJulia from a Project and not in the top-level v1.1 environment, it will still run outside that project; and even if you launch jupyter from within a project that doesn't have IJulia installed, it would still run.

@NHDaly
Copy link
Member

NHDaly commented Apr 25, 2019

I think my proposal for number 2 would look like removing the --project=@. and adding something like this to the top of kernel.jl:

if !haskey(ENV, "JULIA_PROJECT")
    # Activate the equivalent of jupyter notebook's `$(pwd)/@.`
    Pkg.activate(Base.current_project(ENV["PWD"))
end

@marius311
Copy link
Contributor

Any further thoughts on this? I think proposals 1 or 2 would be fine.

@marius311
Copy link
Contributor

marius311 commented Jul 2, 2019

@NHDaly @davidanthoff could you or anyone else please revisit this and revert or fix it? Just spent an hour trying to figure out what was borked with a new Julia 1.2 IJulia install only to realize my notebook wasn't in the environment I thought it was because while I had fixed my 1.1 kernel, the new 1.2 install used JULIA_PROJECT=@. as a default as per this PR, and it had been long enough that I forgot about this entirely. To summarize,

  • While it might make sense for IJulia to activate an environment in the folder where Jupyter was started, this PR does not do that. Instead it activates whatever the nearest environment happens to be (possibly many folders up) on a per-notebook basis. So you end up with a single Jupyterlab session with N tabs open which might be in N different environments, which is really difficult to keep track of. I don't see how this is a good default, especially for an average user who may not want to think about environments at all.

  • And to boot, Jupyter does not show you the active environment anywhere, so the above might happen and you'll never know it.

I'm happy to submit a revert PR instead of just complaining (although a proper "fix" is beyond me), but I figured this was the group that needed to approve it anyway, hence the post here.

@davidanthoff
Copy link
Contributor Author

To be honest, I still think the behavior in this PR is the best solution we can get right now for supporting env in IJulia. Also, at least for me, it has been working great over the last couple of months.

I'm not following the issues here on the repo, but I think one major determinant for any potential revert should be how many complaints we got. I count maybe 3-4 folks in this issue that would like to revert. Did we get any other complaints about the new behavior in the forms of issues or on discourse? My sense had been that this PR has been smooth sailing mostly and that there hasn't been much negative feedback, but I might have that wrong because I didn't specifically follow the issues here.

Maybe a good process would be to open an issue here that proposes to revert things, then advertise that on discourse and slack and try to get thumbs up and down on the proposal and then decide based on that?

I think a broader issue is still that it would be great if one could configure an env per notebook, and have a UI that shows the active env. That would all obviously require coordination with the Jupyter crowd, but I think that is essentially really the UI that would make sense.

I don't think any approach based on the folder where Jupyter was started can work: while Jupyterlab for example shows a "root" folder, other clients like nteract don't even have any such concept.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants