This repository has been archived by the owner on Sep 3, 2024. It is now read-only.

Proposal for kernel provisioning and gateway investigations #3

Merged
merged 1 commit into from
Sep 21, 2015


parente
Member

@parente parente commented Sep 15, 2015

Kernel gateway incubation proposal.

@parente
Member Author

parente commented Sep 15, 2015

/cc @blink1073, @sccolbert since you both indicated some interest on the original hackpad

@sccolbert

Cheers for the ping!

@blink1073

Thanks for the ping @parente, do you envision any specific changes that need to be made to jupyter-js-services?

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

@blink1073 Probably not. It seems like it has good separation for kernels to work independent of the rest of the notebook services.

@parente
Member Author

parente commented Sep 15, 2015

Kernel specs have a different meaning when you're potentially launching kernels within containers with other stuff installed in them (e.g., it's not just "python3", but "jupyter/python3-kernel" with matplotlib, scipy, ...). If a client is going to use the kernelspecs API to discover what it can launch, the response format for that endpoint might need to change. (Or we shouldn't repurpose that endpoint for kernel-containers.)

I can also imagine wanting to pass additional information to the provisioner like "allocate these CPU, disk, RAM, ... resources to the kernel you launch".

These are the two that came to mind. Part of the investigation here is to find if there's more or if these are even valid concerns.
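
To make the two concerns concrete, here is a rough sketch of what a kernelspec-like entry for a containerized kernel could carry. Everything under `metadata` (the `container_image` and `resource_limits` keys) and the `container_kernelspec` helper are hypothetical names made up for illustration, not part of any existing kernelspecs response.

```python
import json


def container_kernelspec(name, image, display_name, language,
                         cpus=None, memory_mb=None):
    """Build a kernelspec-like entry for a containerized kernel.

    The "metadata" section is a hypothetical extension: it records which
    container image backs the kernel and which resources were requested.
    """
    spec = {
        "display_name": display_name,
        "language": language,
        "metadata": {"container_image": image},
    }
    limits = {}
    if cpus is not None:
        limits["cpus"] = cpus
    if memory_mb is not None:
        limits["memory_mb"] = memory_mb
    if limits:
        # Only advertise limits when the caller actually asked for some.
        spec["metadata"]["resource_limits"] = limits
    return {"name": name, "spec": spec}


entry = container_kernelspec(
    "python3-scipy", "jupyter/python3-kernel",
    "Python 3 (SciPy stack)", "python", cpus=2, memory_mb=4096,
)
print(json.dumps(entry, indent=2))
```

Whether this information belongs in the existing endpoint or in a separate one is exactly the open question above.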

@blink1073

It seems like all of that could be handled through another end point, which creates a set of kernelspecs conforming to the current API.

@jasongrout

Another application of this sort of thing for even single-computer users is to create kernels executing in various conda or virtual environments. For example, such a provisioning service might enable the user to pick from kernels inside of available conda environments, and automatically update as new environments are created, etc. This is sort of like your "kernel command+environment/dependencies" example above.

@parente
Member Author

parente commented Sep 15, 2015

If we're talking notebooks as clients, portability of kernel spec references becomes a bit of a concern too. If my notebook runs against the "jupyter/python3-with-full-scipy-stack" container on some provider, but it's only captured as "python3" in the metadata, it's not enough for reproducibility. Flipping it, if the kernel name is captured as "jupyter/python3-with-full-scipy-stack" then it may be difficult for anyone else to re-run my notebook (i.e., how do I get that env?)

But these are problems with notebooks today too. The kernelspec captures the language, but nothing official captures all the other dependencies that make the notebook work.

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

> The kernelspec captures the language, but nothing official captures all the other dependencies that make the notebook work.

That's a bingo!

@jasongrout

We're quickly evolving to a full hashdist-like dependency list!

@rgbkrk
Member

rgbkrk commented Sep 15, 2015

hashdist + computational resource in this case

/cc @ahmadia

@jasongrout

I don't think we want to write an entire packaging tool here. But perhaps a kernel could be: "hashdist/49ab4bdeff3c"+computational resources. Let something like hashdist or conda do its job to reproduce an environment (possibly with arbitrary metadata in the kernel spec, where they could store distribution-specific metadata).

@jasongrout

In fact, I think I saw somewhere a command that will inject a list of conda dependencies into notebook metadata.

@ahmadia

ahmadia commented Sep 15, 2015

Yes, there are unofficial tools for both hashdist and conda to inject dependencies into the metadata of the Notebook. Right now they are loosely connected with the kernels. One thing that isn't clear to me is how we can expose the available conda/hashdist/etc. environments as kernels to IPython as part of our installation process.
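
As a rough illustration of the dependency-injection idea, the sketch below copies a notebook document and records a package list next to the existing `kernelspec` metadata. The `environment` key, the `inject_environment` helper, and the package pins are hypothetical and only illustrative; they are not the format used by any of the unofficial tools mentioned here.

```python
import copy


def inject_environment(notebook, tool, packages):
    """Return a copy of a notebook dict with an environment record added.

    The "environment" metadata key is a made-up name for illustration.
    """
    annotated = copy.deepcopy(notebook)
    metadata = annotated.setdefault("metadata", {})
    metadata["environment"] = {"tool": tool, "packages": sorted(packages)}
    return annotated


nb = {
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "cells": [],
    "nbformat": 4,
    "nbformat_minor": 0,
}
annotated = inject_environment(nb, "conda", ["scipy=0.16", "matplotlib=1.4"])
```

The kernelspec name stays "python3", so the notebook remains portable, while the extra record gives another tool enough information to rebuild the environment.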

@minrk

minrk commented Sep 16, 2015

And I've written a script that builds a hashdist profile with a kernel and registers it as a kernelspec. I also have a tool for registering kernels from conda/virtualenvs. Both of these are IPython-specific, and not generalized to other kernels.

I do think that this sort of thing belongs at a level below the kernelspec. That is, as far as Jupyter is concerned it's just a kernelspec like any other, and it's another tool that's responsible for taking some spec and building a kernel for it. Ideally, to me, this is all using existing specs - Dockerfiles, hashdist, conda envs, requirements.txt, etc. and we don't make ourselves responsible for defining yet another environment spec.
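
A minimal sketch of the "it's just a kernelspec like any other" idea: a small tool that writes a `kernel.json` pointing at the Python interpreter of one environment. The `write_env_kernelspec` helper, the environment path, and the POSIX `bin/python` layout are assumptions for illustration, and the `ipykernel_launcher` module name is the one used by recent ipykernel releases. This is not the script referred to above.

```python
import json
import os
import tempfile


def write_env_kernelspec(kernels_dir, env_name, env_prefix):
    """Write a kernel.json that launches the IPython kernel from one environment."""
    # Assumes a POSIX-style environment layout (<prefix>/bin/python).
    python = os.path.join(env_prefix, "bin", "python")
    spec = {
        "argv": [python, "-m", "ipykernel_launcher", "-f", "{connection_file}"],
        "display_name": "Python ({})".format(env_name),
        "language": "python",
    }
    spec_dir = os.path.join(kernels_dir, env_name)
    os.makedirs(spec_dir, exist_ok=True)
    path = os.path.join(spec_dir, "kernel.json")
    with open(path, "w") as f:
        json.dump(spec, f, indent=2)
    return path


# Write into a temporary directory instead of a real kernels directory.
with tempfile.TemporaryDirectory() as kernels_dir:
    path = write_env_kernelspec(kernels_dir, "science", "/opt/conda/envs/science")
    with open(path) as f:
        written = json.load(f)
```

Jupyter only ever sees the resulting kernelspec; how the environment was built (conda, hashdist, a Dockerfile) stays the business of the tool that produced it.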

@rgbkrk @freeman-lab how much of this is overlapping with binder? Are there things we can re-use?


1. Using jupyter_client, jupyter_core, and pieces of jupyter/notebook (e.g., MappingKernelManager, etc.) to construct a headless kernel gateway that can talk to a cluster manager (e.g., Mesos).
2. Implementing a websocket to 0mq bridge that can be placed in any Docker container that already runs a kernel, to allow web-friendly access to that kernel.
3. Adding a new jupyter_client.WebsocketKernelManager that can be plugged into Jupyter Notebook or consumed by other tools to talk to kernels fronted by a websocket to 0mq bridge. (See use case #3 below).

I think I disagree that the server talking upstream via websockets is the right approach. I think it should be either:

  1. the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or
  2. the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.

Member Author


> the client talks directly to the kernel service via websocket, which may not be on the same host as the notebook server, or

If by kernel service you mean a kernel in a container with a websocket-to-zmq bridge (w2z?), then, yes, that's one of the planned experiments. This approach separates out kernel provisioning from kernel communication after provision, which is attractive. However, it punts the problem of managing comm with a number of running kernels out of scope unless there's a third component like the configurable-http-proxy for tmpnb, one that the provisioner informs about running kernels. That or the admin of the kernel service must bring his/her own proxying scheme.

All of the above is fine, but I think having an all-in-one gateway service that does the provisioning and the w2z bridging in one component might provide an easier walk-up-and-try-it prototype in the short term. Granted, it is more monolithic and certainly has its own scaling problems, but I see it as valuable for proving the concept and enabling folks to start thinking about "how could I use this?"

> the server talks to the kernel provider via zmq, even though it's remote. zmq isn't a localhost-only protocol.

We've certainly talked to within-cluster remote kernels using zmq before. But, from experience, when we've started toying with clients being very remote from kernels (e.g., client on my laptop, kernel in an IaaS), kernels being offered as services by cloud providers, and applications that use kernels written by new audiences (e.g., web developers), we see Websockets having a number of advantages:

  • Proxying websockets is simpler than zmq with robust tools like nginx, haproxy, etc.
  • Multiplexing/demultiplexing websockets over a single proxy port is easier than zmq (e.g., to reduce port footprint for security)
  • End-to-end encryption for Websockets across potentially multiple proxy hops from client to kernel has solutions familiar to DevOps folks
  • Websockets are more familiar than zmq to the web developer audience we're looking to enable to build new kinds of applications that use kernels
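
To illustrate the multiplexing point, here is a minimal sketch of the client-to-kernel half of a websocket-to-zmq bridge. It relies on each websocket JSON message carrying a `channel` field, as in the notebook server's combined channels websocket. The `route_ws_message` helper and the plain callables standing in for zmq sockets are assumptions for illustration; a real bridge would serialize and sign the message before writing it to the matching zmq socket.

```python
import json


def route_ws_message(raw, senders):
    """Decode one websocket text frame and pass it to the sender for its channel.

    `senders` maps a channel name to a callable. In a real bridge each
    callable would write to the zmq socket for that channel.
    """
    msg = json.loads(raw)
    channel = msg.pop("channel", None)
    if channel not in senders:
        raise ValueError("unknown or read-only channel: {!r}".format(channel))
    senders[channel](msg)
    return channel


# Lists stand in for the zmq sockets a client is allowed to write to.
received = {"shell": [], "stdin": [], "control": []}
senders = {name: received[name].append for name in received}

frame = json.dumps({
    "channel": "shell",
    "header": {"msg_type": "kernel_info_request"},
    "parent_header": {},
    "metadata": {},
    "content": {},
})
routed = route_ws_message(frame, senders)
```

All channels travel over one websocket, so a proxy in front of the bridge only has to forward a single port per kernel.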


I would just add that creating clients that communicate directly to kernels (in languages other than Python) through 0MQ is not a trivial exercise. The Jupyter codebase already does a good job of abstracting through WebSockets. Why not leverage that?

And I agree... a WebSockets interface is going to be more friendly to app developers.


I mainly think the first option should be better - client talks directly to the external kernel service via websockets. What I don't think we should do is make the existing server a websocket client of other web services.

Member Author


I think we might be talking past each other with our definitions of client and server, and for which scenario.

> I mainly think the first option should be better - client talks directly to the external kernel service via websockets.

If by client you mean, for example, a JS app using jupyter-js-services, then yes. Or did you have another specific client in mind?

> What I don't think we should do is make the existing server a websocket client of other web services.

Do you mean the Jupyter Notebook Python server here, specifically? If so, how would it take advantage of the remote service without talking to it via websockets?


But this requires the kernel provider to have knowledge of notebook flags, correct? (assuming the provider becomes the websocket endpoint for the outside world)

Member


> But this requires the kernel provider to have knowledge of notebook flags, correct?

It's what I currently deal with in tmpnb. I'm assuming we'll have a simple flag on the kernel provisioner or other options, not the notebook flag.


Got it. Thanks!

Member Author


> What's the game plan for securing the cross-origin websocket connection?

I think answering this question and others is part of the exploration TBD in the incubator. I certainly don't have a solid game plan yet. I think initial reference implementations can deal with the open access case and from there we can start to work on things like security.

That said, I can imagine having the provider support options for authenticating and authorizing requests for kernel provisioning (Does Pete get to request another kernel on my system and has he used up his allotment?) as well as kernel connectivity (Is this Pete connecting to his kernel via a websocket?) through common mechanisms (auth headers, API key, ...) But I can also imagine punting this responsibility to other components, like a front proxy that controls access to the APIs and specific kernel websocket routes based on login. Both seem viable at face-value.
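
As a sketch of the first option (the provider itself checking requests), the function below combines an API-key check with a per-user kernel allotment. The `Authorization: token <key>` header scheme, the `authorize_kernel_request` helper, and the default allotment of two kernels are assumptions for illustration, not a decided design.

```python
import hmac


def authorize_kernel_request(headers, api_keys, running_counts, max_kernels=2):
    """Decide whether a kernel provisioning request may proceed.

    Returns (True, user) on success or (False, reason) on refusal.
    `api_keys` maps user -> key; `running_counts` maps user -> kernels in use.
    """
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme != "token" or not token:
        return False, "missing credentials"

    user = None
    for name, key in api_keys.items():
        # Constant-time comparison avoids leaking key contents via timing.
        if hmac.compare_digest(key.encode("utf-8"), token.encode("utf-8")):
            user = name
    if user is None:
        return False, "invalid key"

    if running_counts.get(user, 0) >= max_kernels:
        return False, "allotment used up"
    return True, user


api_keys = {"pete": "s3cret"}
print(authorize_kernel_request({"Authorization": "token s3cret"}, api_keys, {"pete": 1}))
```

The same check could instead live in a front proxy, in which case the provider would only ever see requests that were already authorized.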

Member


Oh right, for the authed version we need to do auth like we do in the notebook, though likely API key based.

@freeman-lab

@parente this is an awesome effort! And very compatible with what we’re up to.

Broadly, the work so far on Binder provides at least one way to go from an environment specification to a container that can be deployed. Completely agree with @minrk that we don’t want to invent a new spec, and we’ve been trying to support as many existing ones as possible.

The API we’re trying to iron out with @rgbkrk is meant to standardize both:

  • how to turn a set of configuration specs (e.g. requirements.txt, conda environment, external services, etc) into a container as per above
  • how to deploy and inspect deployments of containers, including pre-allocated pools of containers for “hot” launches
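
The "hot" launch idea in the second bullet can be sketched as a small pool that keeps a fixed number of kernels started in advance. The `KernelPool` class and its `launch` callable are hypothetical names for illustration, and the refill is done synchronously here only to keep the example short; a real deployment would presumably refill in the background.

```python
import itertools
from collections import deque


class KernelPool:
    """Keep `size` pre-started kernels ready so requests avoid a cold start."""

    def __init__(self, launch, size):
        self._launch = launch  # callable that starts one kernel/container
        self._size = size
        self._idle = deque(launch() for _ in range(size))

    @property
    def idle_count(self):
        return len(self._idle)

    def acquire(self):
        """Hand out a pre-started kernel, or launch one if the pool is empty."""
        kernel = self._idle.popleft() if self._idle else self._launch()
        self._refill()
        return kernel

    def _refill(self):
        # Top the pool back up to its configured size.
        while len(self._idle) < self._size:
            self._idle.append(self._launch())


# A counter stands in for actually starting containers.
counter = itertools.count(1)
pool = KernelPool(lambda: "kernel-{}".format(next(counter)), size=2)
first = pool.acquire()
second = pool.acquire()
```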

Currently the only “application” is the notebook, but we’d love to get to a point where Binders can target others, like light-weight web apps. And that should integrate really nicely with what’s described here, in particular, the websocket to 0mq bridge.

Instead of building all our images off of a monolithic base containing many kernels, we’d love to break them out into one image per kernel + dependencies, and then communicate with many of them through the gateway. In this model, Binder could be used to specify and deploy a pool of kernel containers, but the notebook (in our current setup) would be replaced with the gateway, which could then be the endpoint for a wider variety of client applications. In other words, if this gateway existed, Binder could start using it. Developing this wasn’t on our immediate roadmap, but we could totally help prototype / code!

Hope that helped clarify!

w/ @andrewosh


1. A _client_ (e.g., Thebe) that includes JavaScript code from Jupyter Notebook to request and communicate with kernels

2. A _spawner_ (e.g., tmpnb) that provisions gateway servers to handle client kernel requests
Contributor


Would not jupyterhub also fall into this category of a spawner?

Member Author


I'm not as familiar with it, but since you said it, yes, it probably does. :)

@minrk

minrk commented Sep 18, 2015

While there are still lots of fun technical things to discuss as we move along, I'll formally express my +1 on the proposal.


## Audience

* Jupyter Notebook users who want to run their notebook server remote from their kernel compute cluster
Contributor


This is a really important usage case, awesome!

Member


👍

@ellisonbg
Contributor

Overall, I think this is a super important proposal that I heartily support.

Given that the existing notebook server already has:

  1. A 0mq->websocket adapter layer for the kernel's message spec
  2. A REST API for starting kernels and sessions

It would be helpful to describe 1) what the proposal will add to that (multitenancy, auth, etc) and 2) how the existing stuff will be reused. In particular, if the core of the REST API and websocket stuff is the same, it would be great to have a single code base to maintain for that stuff.
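
For reference, the two existing pieces can be addressed roughly as follows: a `POST` to `/api/kernels` starts a kernel, and the kernel's messages flow over a websocket at `/api/kernels/<id>/channels`. The sketch only builds the request and URL without contacting a server; the helper names, host, and kernel id are placeholders for illustration.

```python
import json
from urllib.parse import quote, urlsplit, urlunsplit
from urllib.request import Request


def start_kernel_request(base_url, kernel_name):
    """Build (but do not send) the REST request that starts a kernel."""
    body = json.dumps({"name": kernel_name}).encode("utf-8")
    return Request(
        base_url.rstrip("/") + "/api/kernels",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def channels_url(base_url, kernel_id):
    """Derive the websocket URL for a running kernel's channels."""
    parts = urlsplit(base_url)
    scheme = "wss" if parts.scheme == "https" else "ws"
    path = parts.path.rstrip("/") + "/api/kernels/" + quote(kernel_id) + "/channels"
    return urlunsplit((scheme, parts.netloc, path, "", ""))


# Placeholder host and kernel id.
req = start_kernel_request("http://localhost:8888", "python3")
ws = channels_url("http://localhost:8888", "abc-123")
```

If a gateway kept these same routes, a client written against the notebook server would only need a different base URL to talk to it.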

Maybe the right solution is to even transition the single user notebook server over to using this API directly.

It sounds like you are thinking in these terms given the idea to reuse the JS client side of this stuff (jupyter-js-services) - just good to clarify how this stuff will interplay with the existing stuff at the level of code and APIs.

Another question is how this stuff will interact with jupyterhub. Jupyterhub is getting a ton of usage and it would be very helpful in the long run to separate the serving of notebook from the kernels for jupyterhub as well.

@damianavila
Member

A lot of nice discussions here... but let's go to the proposal acceptance: more than 👍

@Carreau
Member

Carreau commented Sep 20, 2015

Can we +1 on principle and refine the exact technical details once it is accepted?

@fperez

fperez commented Sep 20, 2015

While there may be technical details to be worked out, that's the point of a project evolving in incubation...

On the principle of this project, I am actually very excited, so count me enthusiastically in. Thanks!!!

@parente
Member Author

parente commented Sep 20, 2015

> It sounds like you are thinking in these terms given the idea to reuse the JS client side of this stuff (jupyter-js-services) - just good to clarify how this stuff will interplay with the existing stuff at the level of code and APIs.

I don't have an answer on how the interplay will work quite yet, but I agree with you that these unknowns are worth calling out in the proposal. They're the reason for doing these investigations in an incubator. Likewise, I agree that listing what additional features we're looking to add above and beyond whatever currently exists is worth noting, even if the exact implementation is not yet known (e.g., API auth, multihost kernel scaling).

I'll update the proposal with these edits soon.

@parente
Member Author

parente commented Sep 20, 2015

@freeman-lab thanks for the clarification up above. Good to hear that some pieces of this proposal sound useful for Binder. I figure we can iron out the details of what holds value for Binder to launch and/or if there's a way to agree on a set of common APIs for launching things once this incubator exists.

@ellisonbg
Contributor

I know it is a technical detail, but I think that it might hurt the
incubator's chance of being incorporated into the project if it uses Go:

  1. Much of the logic already exists in our current python code base and
    could easily be reused to reduce the burden of having to maintain two
    versions. Our entire deployment architecture isn't going to move away from
    python anytime soon.
  2. Choosing Go limits who can work on it as no one in our existing core
    team knows Go very well (that I know of).

These factors should not in any way prevent the incubation proposal from
moving forward or there being experimentation with Go. I do also understand
that it would be advantageous to not have to install Python in all of the
kernel containers, though, so it also might help incorporation if it
dramatically eases the adoption and deployment of this stuff. This benefit
would have to be balanced with the costs.

Just wanted to mention how these implementation details might affect
eventual incorporation.

But again, these are details. +1 overall.


Brian E. Granger
Associate Professor of Physics and Data Science
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger@calpoly.edu and ellisonbg@gmail.com

@rgbkrk
Member

rgbkrk commented Sep 20, 2015

>   1. Choosing Go limits who can work on it as no one in our existing core
>     team knows Go very well (that I know of).

Actually, that would be me now. To be fair to @parente and team, I suggested Go for one big reason. Adding a static binary to a pre-configured Docker image and launching it is trivial compared to trying to install a Python dependency into an unknown system. Another approach is to package up a special virtualenv or conda environment into an image with our own pathing.

Going forward, if wrapping the existing Handlers is really all we need, it's not hard to package it up. It's the configuration, security and scaling issues on the outside that are going to be more difficult.

As if it weren't obvious, I'm 👍 to iterating on this within a repository 😄.

@ellisonbg
Contributor

We would like to declare consensus and accept this proposal. Congrats! We will create a repo here shortly and add everyone to it.

ellisonbg added a commit that referenced this pull request Sep 21, 2015
Proposal for kernel provisioning and gateway investigations
@ellisonbg ellisonbg merged commit 628f0ac into jupyter-incubator:master Sep 21, 2015