Proposal for kernel provisioning and gateway investigations #3
Conversation
/cc @blink1073, @sccolbert since you both indicated some interest on the original hackpad
Cheers for the ping!
Thanks for the ping @parente, do you envision any specific changes that need to be made to …
@blink1073 Probably not. It seems like it has good separation for kernels to work independently of the rest of the notebook services.
Kernel specs have a different meaning when you're potentially launching kernels within containers with other stuff installed in them (e.g., it's not just "python3", but "jupyter/python3-kernel" with matplotlib, scipy, ...). If a client is going to use the kernelspecs API to discover what it can launch, the response format for that endpoint might need to change. (Or we shouldn't repurpose that endpoint for kernel-containers.) I can also imagine wanting to pass additional information to the provisioner like "allocate these CPU, disk, RAM, ... resources to the kernel you launch". These are the two that came to mind. Part of the investigation here is to find out whether there are more, or if these are even valid concerns.
It seems like all of that could be handled through another endpoint, which creates a set of kernelspecs conforming to the current API.
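To make the discussion concrete, here is one hypothetical shape such a kernelspec could take. The extra field names (`container_image`, `resources`) are invented for illustration and are not part of any existing Jupyter API; only `display_name`, `language`, and `argv` mirror the current kernelspec format.

```python
# A kernelspec conforming to the current format, with provisioner-specific
# hints tucked into the free-form metadata section. Hypothetical fields,
# sketched to illustrate the discussion above.
extended_spec = {
    "name": "python3-scipy",
    "spec": {
        "display_name": "Python 3 (SciPy stack)",
        "language": "python",
        "argv": ["python", "-m", "ipykernel", "-f", "{connection_file}"],
        "metadata": {
            # Which container image to launch and what resources to
            # request from the cluster manager (invented keys).
            "container_image": "jupyter/python3-kernel",
            "resources": {"cpus": 2, "memory_mb": 4096},
        },
    },
}

# A client using the existing kernelspecs API could ignore the extra
# metadata; a provisioner-aware client could act on it.
print(extended_spec["spec"]["metadata"]["container_image"])  # → jupyter/python3-kernel
```

A plain notebook server would treat this like any other kernelspec, which is the appeal of piggybacking on the existing format.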
Another application of this sort of thing for even single-computer users is to create kernels executing in various conda or virtual environments. For example, such a provisioning service might enable the user to pick from kernels inside of available conda environments, and automatically update as new environments are created, etc. This is sort of like your "kernel command + environment/dependencies" example above.
If we're talking notebooks as clients, portability of kernel spec references becomes a bit of a concern too. If my notebook runs against the "jupyter/python3-with-full-scipy-stack" container on some provider, but it's only captured as "python3" in the metadata, that's not enough for reproducibility. Flipping it around, if the kernel name is captured as "jupyter/python3-with-full-scipy-stack", then it may be difficult for anyone else to re-run my notebook (i.e., how do I get that env?). But these are problems with notebooks today too. The kernelspec captures the language, but nothing official captures all the other dependencies that make the notebook work.
That's a bingo!
We're quickly evolving to a full hashdist-like dependency list!
hashdist + computational resource in this case /cc @ahmadia
I don't think we want to write an entire packaging tool here. But perhaps a kernel could be: "hashdist/49ab4bdeff3c" + computational resources. Let something like hashdist or conda do its job to reproduce an environment (possibly with arbitrary metadata in the kernel spec, where they could store distribution-specific metadata).
In fact, I think I saw somewhere a command that will inject a list of conda dependencies into notebook metadata.
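Whatever that command was, the mechanics are simple either way, since notebook metadata is just JSON. A rough sketch, using an invented `environment` metadata key rather than whatever schema those tools actually use:

```python
import json

def inject_dependencies(nb, packages):
    """Record a package list in a notebook's metadata.
    Illustrative schema: the "environment" key and its layout are
    made up here; real conda/hashdist tools define their own."""
    nb.setdefault("metadata", {})["environment"] = {
        "provider": "conda",
        "dependencies": sorted(packages),
    }
    return nb

# A minimal nbformat-4 notebook skeleton.
nb = {"nbformat": 4, "nbformat_minor": 0, "metadata": {}, "cells": []}
nb = inject_dependencies(nb, ["numpy=1.9", "matplotlib=1.4"])
print(json.dumps(nb["metadata"]["environment"], indent=2))
```

A provisioner could read this block back out and hand it to conda or hashdist to rebuild the environment before launching the kernel.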
Yes, there are unofficial tools for both hashdist and conda to inject dependencies into the metadata of the notebook. Right now they are loosely connected with the kernels. One thing that isn't clear to me is how we can expose the available conda/hashdist/etc. environments as kernels to IPython as part of our installation process.
And I've written a script that builds a hashdist profile with a kernel and registers it as a kernelspec. I also have a tool for registering kernels from conda/virtualenvs. Both of these are IPython-specific and not generalized to other kernels. I do think that this sort of thing belongs at a level below the kernelspec. That is, as far as Jupyter is concerned it's just a kernelspec like any other, and it's another tool that's responsible for taking some spec and building a kernel for it. Ideally, to me, this is all using existing specs - Dockerfiles, hashdist, conda envs, requirements.txt, etc. - and we don't make ourselves responsible for defining yet another environment spec. @rgbkrk @freeman-lab how much of this overlaps with Binder? Are there things we can re-use?
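As a sketch of the conda/virtualenv half of this idea: scan an environment prefix and write a `kernel.json` where the kernelspec machinery can find it. The helper names and the `conda-` directory prefix are my own; only the `kernel.json` layout follows the actual kernelspec format.

```python
import json
import os
import tempfile

def env_to_kernelspec(env_name, env_prefix):
    """Build a kernel.json dict pointing at a conda env's Python.
    IPython-specific, like the helpers described above."""
    return {
        "display_name": "Python (%s)" % env_name,
        "language": "python",
        "argv": [
            os.path.join(env_prefix, "bin", "python"),
            "-m", "ipykernel", "-f", "{connection_file}",
        ],
    }

def register(env_name, env_prefix, kernels_dir):
    """Write the spec into a kernels directory, one subdir per kernel,
    the layout jupyter_client's kernelspec machinery expects."""
    spec_dir = os.path.join(kernels_dir, "conda-%s" % env_name)
    os.makedirs(spec_dir)
    with open(os.path.join(spec_dir, "kernel.json"), "w") as f:
        json.dump(env_to_kernelspec(env_name, env_prefix), f)
    return spec_dir

# Demo against a temp dir rather than a real Jupyter data path.
kernels_dir = tempfile.mkdtemp()
spec_dir = register("scipy", "/opt/conda/envs/scipy", kernels_dir)
```

A provisioning service could rerun the scan periodically so new environments show up as kernels automatically.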
1. Using jupyter_client, jupyter_core, and pieces of jupyter/notebook (e.g., MappingKernelManager) to construct a headless kernel gateway that can talk to a cluster manager (e.g., Mesos).
2. Implementing a websocket to 0mq bridge that can be placed in any Docker container that already runs a kernel, to allow web-friendly access to that kernel.
3. Adding a new jupyter_client.WebsocketKernelManager that can be plugged into Jupyter Notebook or consumed by other tools to talk to kernels fronted by a websocket to 0mq bridge. (See use case #3 below.)
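For item 2, the core of such a bridge is re-framing messages between the two transports. A minimal sketch of just that framing step, following the Jupyter wire protocol's multipart layout and HMAC signing; routing identities and the actual websocket/0mq socket handling are omitted:

```python
import hashlib
import hmac
import json

DELIM = b"<IDS|MSG>"
FIELDS = ("header", "parent_header", "metadata", "content")

def sign(parts, key):
    """HMAC-SHA256 signature over the serialized message fields."""
    return hmac.new(key, b"".join(parts), hashlib.sha256).hexdigest().encode("ascii")

def ws_to_zmq(ws_msg, key):
    """Turn a websocket-style JSON kernel message into the multipart
    frames of the Jupyter 0mq wire protocol."""
    msg = json.loads(ws_msg)
    parts = [json.dumps(msg.get(f, {})).encode("utf-8") for f in FIELDS]
    return [DELIM, sign(parts, key)] + parts

def zmq_to_ws(frames, key):
    """Inverse direction: validate the signature, reassemble the JSON."""
    delim, sig, parts = frames[0], frames[1], frames[2:]
    assert delim == DELIM and hmac.compare_digest(sig, sign(parts, key))
    return json.dumps(dict(zip(FIELDS, (json.loads(p) for p in parts))))
```

In a real bridge these two functions would sit between a websocket handler and the kernel's zmq sockets, pumping frames in both directions.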
I think I disagree that the server talking upstream via websockets is the right approach. I think it should be either:
- the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or
- the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.
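On the second option: the connection info a kernel writes out already supports this, since it is plain JSON and zmq dials `tcp://` addresses the same way locally or remotely. A sketch with a placeholder IP and invented port numbers; the field names match the standard kernel connection file:

```python
import json

def connection_info(ip, key, shell, iopub, stdin, control, hb):
    """Connection info in the shape of a kernel-<id>.json file. For a
    remote kernel only "ip" changes from 127.0.0.1 to a routable
    address; zmq then connects to tcp://<ip>:<port> as usual."""
    return {
        "transport": "tcp",
        "ip": ip,
        "key": key,
        "signature_scheme": "hmac-sha256",
        "shell_port": shell,
        "iopub_port": iopub,
        "stdin_port": stdin,
        "control_port": control,
        "hb_port": hb,
    }

# Placeholder host and ports, for illustration only.
info = connection_info("10.0.1.42", "a0b1c2d3", 50001, 50002, 50003, 50004, 50005)
# e.g. feed this to jupyter_client's kernel client machinery in place of
# a locally written connection file.
print(json.dumps(info, indent=2))
```

The open question raised below is less whether zmq can cross hosts and more about proxying, multiplexing, and securing those connections.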
> the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or
If by kernel service you mean a kernel in a container with a websocket-to-zmq bridge (w2z?), then, yes, that's one of the planned experiments. This approach separates kernel provisioning from kernel communication after provisioning, which is attractive. However, it punts the problem of managing communication with any number of running kernels out of scope unless there's a third component like the configurable-http-proxy for tmpnb, one that the provisioner informs about running kernels. That, or the admin of the kernel service must bring his/her own proxying scheme.
All of the above is fine, but I think having an all-in-one gateway service that does the provisioning and the w2z bridging in one component might provide an easier walk-up-and-try-it prototype in the short term. Granted, it is more monolithic and certainly has its own scaling problems, but I see it as valuable for proving the concept and enabling folks to start thinking about "how could I use this?"
> the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.
We've certainly talked to within-cluster remote kernels using zmq before. But, from experience, when we've started toying with clients being very remote from kernels (e.g., client on my laptop, kernel in an IaaS), kernels being offered as services by cloud providers, and applications that use kernels written by new audiences (e.g., web developers), we see Websockets having a number of advantages:
- Proxying websockets is simpler than zmq with robust tools like nginx, haproxy, etc.
- Multiplexing/demultiplexing websockets over a single proxy port is easier than zmq (e.g., to reduce port footprint for security)
- End-to-end encryption for Websockets across potentially multiple proxy hops from client to kernel has solutions familiar to DevOps folks
- Websockets are more familiar than zmq to the web developer audience we're looking to enable to build new kinds of applications that use kernels
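On the multiplexing point, one hypothetical scheme is to carry the kernel id in the websocket URL path behind a single proxied port, mirroring the notebook server's own `/api/kernels/<id>/channels` route. The routing function here is an illustration, not an agreed design:

```python
import re

# Match websocket upgrade paths of the form /api/kernels/<id>/channels,
# where <id> is a UUID-like token. One proxied port can then serve any
# number of kernels, with the path deciding where frames go.
KERNEL_PATH = re.compile(r"^/api/kernels/(?P<kernel_id>[\w-]+)/channels$")

def route(path):
    """Return the kernel id a websocket request targets, or None."""
    m = KERNEL_PATH.match(path)
    return m.group("kernel_id") if m else None

print(route("/api/kernels/4f02c8a1-7e2b/channels"))  # → 4f02c8a1-7e2b
print(route("/static/main.js"))                      # → None
```

Doing the same with raw zmq would mean either one port per kernel per channel or a custom zmq-aware demultiplexer, which is the footprint concern above.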
I would just add that creating clients that communicate directly with kernels (in languages other than Python) through 0MQ is not a trivial exercise. The Jupyter codebase already does a good job of abstracting this behind WebSockets. Why not leverage that?
And I agree... a WebSockets interface is going to be more friendly to app developers.
I mainly think the first option should be better - client talks directly to the external kernel service via websockets. What I don't think we should do is make the existing server a websocket client of other web services.
I think we might be talking past each other with our definitions of client and server, and about which scenario we each mean.
> I mainly think the first option should be better - client talks directly to the external kernel service via websockets.
If by client you mean, for example, a JS app using jupyter-js-services, then yes. Or did you have another specific client in mind?
> What I don't think we should do is make the existing server a websocket client of other web services.
Do you mean the Jupyter Notebook Python server here, specifically? If so, how would it take advantage of the remote service without talking to it via websockets?
But this requires the kernel provider to have knowledge of notebook flags, correct? (assuming the provider becomes the websocket endpoint for the outside world)
> But this requires the kernel provider to have knowledge of notebook flags, correct?
It's what I currently deal with in tmpnb. I'm assuming we'll have a simple flag on the kernel provisioner, or other options - not the notebook's flags.
Got it. Thanks!
What's the game plan for securing the cross-origin websocket connection?
I think answering this question and others is part of the exploration TBD in the incubator. I certainly don't have a solid game plan yet. I think initial reference implementations can deal with the open access case and from there we can start to work on things like security.
That said, I can imagine having the provider support options for authenticating and authorizing requests for kernel provisioning (Does Pete get to request another kernel on my system, and has he used up his allotment?) as well as kernel connectivity (Is this Pete connecting to his kernel via a websocket?) through common mechanisms (auth headers, API key, ...). But I can also imagine punting this responsibility to other components, like a front proxy that controls access to the APIs and specific kernel websocket routes based on login. Both seem viable at face value.
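A sketch of the first option in its simplest form: an API-key gate on incoming provisioning or channel requests. The `Authorization: token …` header scheme here is an assumption borrowed from common practice, not a settled design:

```python
import hmac

def authorized(headers, expected_key):
    """Return True if the request headers carry the expected API key.
    Uses a constant-time comparison to avoid timing leaks. The header
    name and "token" prefix are illustrative assumptions."""
    presented = headers.get("Authorization", "")
    if not presented.startswith("token "):
        return False
    return hmac.compare_digest(presented[len("token "):], expected_key)

print(authorized({"Authorization": "token s3cret"}, "s3cret"))  # → True
print(authorized({}, "s3cret"))                                 # → False
```

A front proxy doing the same check before forwarding, as described above, would be functionally equivalent from the kernel provider's point of view.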
Oh right, for the authed version we need to do auth like we do in the notebook, though likely API key based.
@parente this is an awesome effort! And very compatible with what we're up to. Broadly, the work so far on Binder provides at least one way to go from an environment specification to a container that can be deployed. Completely agree with @minrk that we don't want to invent a new spec, and we've been trying to support as many existing ones as possible. The API we're trying to iron out with @rgbkrk is meant to standardize both:
Currently the only "application" is the notebook, but we'd love to get to a point where Binders can target others, like light-weight web apps. And that should integrate really nicely with what's described here, in particular, the websocket to 0mq bridge. Instead of building all our images off of a monolithic base containing many kernels, we'd love to break them out into one image per kernel + dependencies, and then communicate with many of them through the gateway. In this model, Binder could be used to specify and deploy a pool of kernel containers, but the notebook (in our current setup) would be replaced with the gateway, which could then be the endpoint for a wider variety of client applications. In other words, if this gateway existed, Binder could start using it. Developing this wasn't on our immediate roadmap, but we could totally help prototype / code! Hope that helped clarify! w/ @andrewosh
1. A _client_ (e.g., Thebe) that includes JavaScript code from Jupyter Notebook to request and communicate with kernels
2. A _spawner_ (e.g., tmpnb) that provisions gateway servers to handle client kernel requests
Wouldn't JupyterHub also fall into this category of a spawner?
I'm not as familiar with it, but since you said it, yes, it probably does. :)
While there are still lots of fun technical things to discuss as we move along, I'll formally express my +1 on the proposal.

## Audience

* Jupyter Notebook users who want to run their notebook server remotely from their kernel compute cluster
This is a really important use case, awesome!
👍
Overall, I think this is a super important proposal that I heartily support. Given that the existing notebook server already has:
It would be helpful to describe 1) what the proposal will add to that (multitenancy, auth, etc.) and 2) how the existing stuff will be reused. In particular, if the core of the REST API and websocket stuff is the same, it would be great to have a single code base to maintain for that stuff. Maybe the right solution is to even transition the single-user notebook server over to using this API directly. It sounds like you are thinking in these terms given the idea to reuse the JS client side of this stuff (jupyter-js-services) - just good to clarify how this will interplay with the existing stuff at the level of code and APIs. Another question is how this will interact with JupyterHub. JupyterHub is getting a ton of usage, and it would be very helpful in the long run to separate the serving of notebooks from the kernels for JupyterHub as well.
A lot of nice discussion here... but let's get to the proposal acceptance: more than 👍
Can we +1 on principle and refine the exact technical details once it is accepted?
While there may be technical details to be worked out, that's the point of a project evolving in incubation... On the principle of this project, I am actually very excited, so count me enthusiastically in. Thanks!!!
I don't have an answer on how the interplay will work quite yet, but I agree with you that these unknowns are worth calling out in the proposal. They're the reason for doing these investigations in an incubator. Likewise, I agree that listing what additional features we're looking to add above and beyond whatever currently exists is worth noting, even if the exact implementation is not yet known (e.g., API auth, multihost kernel scaling). I'll update the proposal with these edits soon.
@freeman-lab thanks for the clarification up above. Good to hear that some pieces of this proposal sound useful for Binder. I figure we can iron out the details of what holds value for Binder to launch and/or if there's a way to agree on a set of common APIs for launching things once this incubator exists.
I know it is a technical detail, but I think that it might hurt the … These factors should not in any way prevent the incubation proposal from … Just wanted to mention how these implementation details might affect … But again - these are details. +1 overall.
Brian E. Granger
Actually, that would be me now. To be fair to @parente and team, I suggested Go for one big reason. Adding a static binary to a pre-configured Docker image and launching it is trivial compared to trying to install a Python dependency into an unknown system. Another approach is to package up a special virtualenv or conda environment into an image with our own pathing. Going forward, if wrapping the existing Handlers is really all we need, it's not hard to package it up. It's the configuration, security and scaling issues on the outside that are going to be more difficult. As if it weren't obvious, I'm 👍 to iterating on this within a repository 😄.
We would like to declare consensus and accept this proposal. Congrats! We will create a repo here shortly and add everyone to it. |