Per notebook MANIFESTS #673
maybe with the contents API |
I believe that what's needed is an "environment protocol": i.e. instead of needing to actually have a project file and/or manifest file present, or a package directory, or load path array, one just needs to implement the environment protocol. Then the IJulia package can implement the protocol for notebooks that have environment information stored in them and voila, each notebook has its own environment. However, I think that work is a 1.x kind of thing: we now generally understand what the protocol needs to look like; the next step is to factor out the protocol part in such a way that the three kinds of environments that we already support are implementations of this protocol; after that we allow a notebook to implement the environment protocol as well. The main thing to consider at this point is how to allow for extension in the future. Where is the hook? Do we have a […]? The contents API seems like it may be a good way to stash the manifest information, but we don't really need something that emulates a file system—using a JSON store would actually be easier. |
A long-run solution that automatically embeds the manifest etc. would be great. But is it possible to have a short-term patch requiring a manual call to load something in the notebook itself? That is, something along the lines of an explicit activate call?
Or maybe this is already possible with some of the existing `Pkg` machinery? |
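A minimal sketch of what such a manual call might look like, assuming the Project.toml/Manifest.toml ship alongside the notebook (this is not an existing IJulia feature, just the plain `Pkg` API):

```julia
# Hypothetical short-term workaround: activate the environment that sits
# next to the notebook and install the exact recorded versions.
using Pkg
Pkg.activate(@__DIR__)  # assumes Project.toml / Manifest.toml live beside the notebook
Pkg.instantiate()       # resolves and installs the versions pinned in the Manifest
using MyLib             # now loads at the manifest-pinned version
```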
🤷♂️ maybe? |
Can't you just |
To make sure I understand this, you think I may just be able to put

```julia
Pkg.activate(".")
using MyLib
```

If that is correct, I can try to have someone test it when IJulia is sufficiently stable with 0.7 |
You need a Project file as well. But yes, if you do `Pkg.activate(".")` and then go wild with adding packages, those will be recorded in the `Manifest.toml`, and the recipient can run `Pkg.instantiate()` to install all the packages at the versions you used them at. |
If opening a notebook can instantiate an arbitrary manifest, is that a security concern? How about adding a simple function that uploads the Project/Manifest and puts `Pkg.activate("https://gist.github.com/.../...")` in a notebook cell? Of course, […]. Alternatively, I guess you can use cell attachments to bundle the files with the notebook. |
I don't know jupyter all that well, but isn't the security controlled by how it is contained? You can load local files, run shell stuff, etc if it lets you? Certainly being able to instantiate a local manifest is not the long-run solution, and will not work for all scenarios, but I don't think it is a security hole. |
I now have this in my notebooks
|
Storing the Manifest + Project inside the notebook and having a button that does that would go a long way. There shouldn't be any security problems with that; it is just a convenience layer? |
Their security model is: […]
So I don't think you can register any UI elements like a button to instantiate a project from the notebooks. Though I guess that's possible via a front-end extension. I just thought using […] |
This is a pretty fundamental feature, so integrating it nicely into the frontend for every julia notebook seems like the right way to do it. |
If you are willing to write a front-end extension I think that's great! I have no intention of stopping it. |
The notebook does include a certain amount of notebook-wide metadata, detailing the language and kernel. e.g.
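For reference, the notebook-level metadata in question is the standard nbformat kernelspec block; a representative example (exact values vary by installation):

```json
{
  "metadata": {
    "kernelspec": {
      "display_name": "Julia 1.0.0",
      "language": "julia",
      "name": "julia-1.0"
    },
    "language_info": {
      "file_extension": ".jl",
      "mimetype": "application/julia",
      "name": "julia",
      "version": "1.0.0"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}
```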
It may be possible to insert and read the manifest information from there. As far as a security model goes, one solution could be a confirmation dialog before installing any new package versions via |
Well, I asked on the Jupyter gitter: it seems like this is not possible via the current protocol, so if we wanted something along those lines we would need to do it via a jupyter extension. |
It does not work when the IJulia kernel and Jupyter server run on different machines. |
At JupyterCon I spoke with a few Jupyter folks and their take was that trying to put this kind of metadata into notebooks was not the right direction to go—they've tried this with images and other things in the past and have come to feel that the "unit of distribution" should be a git repo, not a single notebook file. So it seems like the way to go here might be to have IJulia automatically activate the project in the git repo that it's in. After all, you are running the code in the notebook, so presumably you trust it. (As compared to just starting a Julia process in a directory, which may or may not mean that you trust the content of the directory enough to execute it.) |
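A sketch of how IJulia might locate the enclosing project, assuming the kernel's working directory is inside the repo (the helper is hypothetical, not current IJulia behavior):

```julia
# Walk up from the kernel's working directory looking for a project file,
# mirroring how `julia --project=@.` searches parent directories.
using Pkg

function find_enclosing_project(dir = pwd())
    while true
        for name in ("JuliaProject.toml", "Project.toml")
            path = joinpath(dir, name)
            isfile(path) && return path
        end
        parent = dirname(dir)
        parent == dir && return nothing  # reached the filesystem root
        dir = parent
    end
end

proj = find_enclosing_project()
proj === nothing || Pkg.activate(dirname(proj))
```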
IJulia doesn't know what notebook file (if any) it is executing — that information is not provided to the kernel. |
If we go this way, I'd still like a way to package everything into a single file that you can email to somebody or share on JuliaBox (also have separate environments for every notebook on JuliaBox). If I just want to share some code with somebody, I don't think we can expect the workflow to be "Go clone this git repo". |
I agree. Jupyter notebooks need to be usable self-contained in some sense. Even the Jupyter interface is often centered around the "Upload" notebook workflow. What about the ability to activate from a URL? You could give it the project file and/or manifest, and it would enable copying notebooks around. And if someone wanted to run the notebook in whatever global project they had in their current Jupyter, they wouldn't need to use those cells? |
I'm just reporting what the Jupyter people (@Carreau if I recall correctly) told me, which is that they are moving away from trying to make notebooks self-contained because it has not worked out as hoped. The simplest solution would seem to be serving a zip or tar file containing a set of notebooks, resources used by the notebooks, and in our case, project and manifest files. |
Yes, we tend to think of (1 unit == 1 repository). The notebook as the unit, especially since you can now connect many notebooks to the same kernel, does not make much sense. We haven't really figured out how to make all of this work completely, but generally trying to shove more into a notebook does not work. As said before, a repository does not always work either, but I don't think we can get a "one size fits all". There is always this tension between being able to manipulate things on the filesystem and having everything be opaque and managed by Jupyter. You could of course have an extension for Jupyter that shows "bundles" as an actual tree of files, but then you can't cd into it. Maybe something along the lines of a FUSE driver that exposes a single file at one path, and the repo structure at another? |
@fperez would be interested in this discussion BTW, and I think we had pictures of a whiteboard with all the different axes of what people want from notebook files. |
My experience is that embedding data in notebooks is a lost cause. e.g. the attachments feature is basically useless, since:
|
Is there a reason not to enable URL-based Project and Manifest files? In a GitHub-based implementation, you could point it at the raw file, or a local URL. Notebooks copied around would then still work. Does that break the Jupyter security model? |
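A sketch of this URL-based idea (the helper is hypothetical; `Pkg.activate` itself does not accept URLs):

```julia
# Download Project.toml / Manifest.toml from raw URLs into a scratch
# directory, then activate and instantiate that environment.
using Pkg

function activate_from_url(project_url, manifest_url; dir = mktempdir())
    download(project_url, joinpath(dir, "Project.toml"))
    download(manifest_url, joinpath(dir, "Manifest.toml"))
    Pkg.activate(dir)
    Pkg.instantiate()  # installs the versions recorded in the Manifest
    return dir
end
```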
FYI, we've tagged a release of |
To be clear, this first implementation is for a light repo with package and manifest, which provides a solution for tightly controlled lecture notes, etc. The gist approach, which would be better for less formal setups, could be added as well if anyone is interested. |
I think #673 (comment) is missing a |
Instantiate builds the packages that got downloaded, so I don't think that is required. |
I seem to recall cases when that didn't happen, but maybe that was just because the build had failed during an earlier |
@tkoolen FYI, the way we avoid rebuilding every time is to either (a) precompile the resources, for git refs that point to moving targets like |
Notebook fixes:

* Reenable all notebook tests now that required packages all support Julia 1.0.
* Put each notebook in a separate directory, with its own Project.toml and Manifest.toml, to make running the notebooks more straightforward (see JuliaLang/IJulia.jl#673).
* Separate Project.toml and Manifest.toml files for optional visualization parts of the notebooks (not tested by CI, since this would introduce a cyclic test dependency).
* Fix #501, Symbolic Double Pendulum not working (work around JuliaPy/SymPy.jl#245 and JuliaPy/SymPy.jl#244).
* Update doc links, readme in notebooks directory.
* Just rely on notebook-specific manifests for test dependencies other than ForwardDiff and NBInclude.
* Add RigidBodySim pointer.
* Better way to handle URDF links with the name 'world': makes it so that root_frame(mechanism) is no longer named "".
* Fix copyto! performance for SegmentedVector.
* Fix momentum_matrix! performance in the ForwardDiff notebook.
Perhaps there's a very simple solution to this problem: treat the desired embedded environment metadata as code in the first executable cell. The question then becomes how to make it unobtrusive in the standard Jupyter UI. It appears the UI doesn't do line wrapping, so there might be a simple answer to that as well: base64-encode the toml files into a single line each. The nice thing about this is that it's a solution for scripts which need to "come with their environment" just as much as Jupyter notebooks. Then we'd just need a package […] Would this work or have I missed something? |
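A sketch of this base64 proposal (all names are hypothetical, and the placeholder strings stand in for real encoded Project/Manifest TOML):

```julia
using Base64, Pkg

# Single-line base64 payloads, as would be generated from the real files,
# e.g. base64encode(read("Project.toml", String)).
const PROJECT_B64  = "..."   # placeholder
const MANIFEST_B64 = "..."   # placeholder

function activate_embedded(project_b64, manifest_b64)
    dir = mktempdir()
    write(joinpath(dir, "Project.toml"), base64decode(project_b64))
    write(joinpath(dir, "Manifest.toml"), base64decode(manifest_b64))
    Pkg.activate(dir)
    Pkg.instantiate()  # install exactly the recorded versions
end
```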
I tried implementing this; there are a few gotchas, but it looks like it will work. Gotchas include: |
Generally there seems to be some impedance mismatch with |
Another datapoint: I'd like to be able to send people links to colab notebooks with in-built environments, but the unit they use is a file :) |
I think my proposed solution/workaround would be ok for that. Would you be interested in it becoming a registered package? I'd need to think a bit more about the workflow and API, and probably involve Pkg people to know whether it's going to work out, or is fundamentally broken in some way. But I'm not sure whether to do that extra work yet. |
@Keno @c42f If you have been using a solution like this, there is another use case to consider: getting notebook users able to update the project and manifest when necessary. This has proven to be very important for our set of lecture notes; otherwise people effectively start copying around notebooks and using copies of them to edit for assignments. After doing this for the last 8 months, my gut says that metadata in a notebook could become hellish to maintain and lead to all sorts of user issues wondering why they have the wrong versions of packages. I used to be of the opinion that hidden metadata was the right way to go, but have reversed my stand completely. On the other hand, I will never reverse my stand that notebooks have to execute self-contained from a single file and that copying around toml files is a terrible idea. For what it is worth, the approach we implemented from people's (i.e. @vchuravy's) suggestion in https://github.com/QuantEcon/InstantiateFromURL.jl/ has been very successful. Basically,
Take a look at https://github.com/QuantEcon/quantecon-notebooks-jl/blob/master/kalman.ipynb as an example, but basically all that is needed is

```julia
using InstantiateFromURL
activate_github("QuantEcon/QuantEconLecturePackages", tag = "v0.9.6");
```

at the top of the page. The project and manifest are versioned in https://github.com/QuantEcon/QuantEconLecturePackages

Now, for those who don't need to have a mini repository, @vchuravy had the initial idea that this sort of package could have a simple utility to set up a […]. We didn't need it ourselves and couldn't put in the development time, but I think it is exactly the sort of thing that is needed for more lightweight package management.

... all of that is to say: before starting on any new solution, please see if the workflow in this package is solid, and feel free to submit PRs for new features. If enough people vet this solution, a variation on it might make sense in Pkg.jl or at least a more formally maintained package. |
@jlperla That sounds like a great workflow for your use case. My reservation is that it's not self-contained and requires supporting infrastructure which can't easily be updated by the end users. This is probably a good feature in your case, where you're running a class with homogeneous package requirements. On the other hand, I'm helping a group of somewhat nontechnical PhD students with heterogeneous data management and analysis tasks. My thought is that I should be able to give them Jupyter notebooks (and normal scripts!) which have embedded self-contained environments. I'd also like them to be able to update package requirements as their needs change, but at the same time have them well defined and embedded within the notebooks, so that package requirements are somewhat resistant to user error (e.g. emailing a script and forgetting to add the Project and Manifest files). |
Emailing project and manifest files around simply does not work. I am completely with you. And they may put the wrong ones with the wrong files.
I understand the goal of a "self-contained environment", but I would decouple that from a self-contained file. Here are some usage scenarios:
These are the tip of the iceberg.... As I said, I used to think that this stuff belonged in the notebook but changed my tune completely after seeing usage scenarios.
Having these things centrally managed is extremely helpful. But I understand that having a full repo for the set of project/manifest toml files is a little heavy for most uses. This is exactly why @vchuravy had originally suggested using a gist with some tools (which I will try to summarize below). Basically, I think he had in mind QuantEcon/InstantiateFromURL.jl#18 as a formalization of #673 (comment)
.... or something along those lines. |
I've been using notebooks + toml in gists for a while, and while it works, there are some hassles
|
Not sure if this helps, but the InstantiateFromURL package grabs repo tarballs (which don’t require an API key), and we store them (names salted with SHA hash) in a hidden directory from where the script is run. Could be different on the gist side, though. |
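The caching scheme described might look roughly like this (the function name is hypothetical, and InstantiateFromURL's actual implementation may differ):

```julia
using Pkg, SHA

# Cache a repo tarball under a name derived from the SHA-256 of its URL,
# then unpack, activate, and instantiate the contained environment.
function activate_cached_tarball(url; cachedir = joinpath(pwd(), ".projects"))
    mkpath(cachedir)
    dir = joinpath(cachedir, bytes2hex(sha256(url)))
    if !isdir(dir)
        tarball = download(url)  # repo tarballs don't require an API key
        mkpath(dir)
        run(`tar -xzf $tarball -C $dir --strip-components=1`)
    end
    Pkg.activate(dir)
    Pkg.instantiate()
end
```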
I agree, and those sorts of scripts built into a package seem to be what Valentin was getting at. I think it is a perfect case for a light package (which could ultimately become a feature of Pkg3 itself). I am hesitant to say that we should have it in "jupyter" or IJulia since this is a more general problem than just jupyter notebooks. If anyone wants to work on gist features, @arnavs and I would be happy to merge them into the |
These are all good points but come with strong assumptions that:
Consider instead that you are helping a group of nontechnical colleagues (students and lab staff) with their individual projects, each of which has different package requirements. This situation is a very different use case and I don't see how |
Hence the suggestion from some people to have gist-based workflows with simple publishing tools. We didn't build it because of lack of time and not knowing the requirements, since we didn't need it. My points are primarily about the difficulty of having relatively non-technical people manage projects and manifests within the Jupyter files themselves, and all the things that can go wrong. The other thing to consider is that the students can use a base set of packages and then install additional ones with commands at the top of their own notebooks. But I could be wrong... Maybe there is some sort of technology that could make managing embedded package information within a notebook seamless and manageable. But it is hard to imagine without deep integration of both IJulia and Pkg3 (which there seems to be little appetite for). |
@c42f FYI
I thought about how to address it. Here is an idea: put the following code, with a hypothetical function `use_packages`, in a notebook cell:

```julia
using IJuliaPkg
use_packages(
    [
        "Plots",
        "DifferentialEquations",
    ],
)
```

which adds the packages in a plain environment, encodes the resulting project, and rewrites the cell to

```julia
using IJuliaPkg
use_packages(
    [
        "Plots",
        "DifferentialEquations",
    ],
    project = $ENCODED_PROJECT,
)
```

Later, if you want another package, you edit the cell to

```julia
using IJuliaPkg
use_packages(
    [
        "Plots",
        "DifferentialEquations",
        "PyPlot",
    ],
    project = $ENCODED_PROJECT,
)
```

and then hit shift+enter, which updates `$ENCODED_PROJECT`. |
@tkf thanks, that's an excellent point. I had just assumed overwriting a code cell from the kernel was impossible! With this in mind I think it's possible to have a self contained solution. |
Closed by #820. |
Now that 0.7 is getting closer, it may make sense to start thinking about how notebooks interact with the new package manager. I had discussed with @StefanKarpinski and @KristofferC that it would be great if notebooks could embed a MANIFEST, so that if you send somebody a notebook they could automatically load everything with the correct versions. Doing something like this would require figuring out where to store the information, how to hook it up to Pkg3, and probably some UI work as well.