Coordinate use-cases / infra around JupyterHubs+other applications on Kubernetes #382
Comments
Thanks for raising this @choldgraf, I think this is a useful conversation to have. From Dask's perspective, we maintain three Helm charts:
Looking through the other projects here that use Dask, it seems some of them depend on these charts:
All of these projects go much further than our charts do. Given that many folks also seem to be deploying prometheus/grafana, I think it is our place to ensure that our integration with Prometheus is the best that it can be, and that we expose as many useful metrics as we can. But it would be out of scope for us to provide Prometheus itself in our charts.

I have occasionally considered creating some official Terraform or vendor-specific IaC config, like CloudFormation, for Dask. These would probably be more similar to the projects listed here.

So, to answer some of the direct questions:
I absolutely think this makes sense. One of the first steps might be describing what everyone is doing to see where there's room for consolidation. I know @consideRatio has been doing work in z2jh to help with using jupyterhub as a dependency, but that's also a likely area of work for us: what are the challenges when integrating jupyterhub as a dependency?

z2jh is now a very complicated chart with loads of moving parts. I'm not sure I would recommend anyone deploy jupyterhub on kubernetes without it at this point, or at least they should know that they are very much on their own if they do. Is anyone doing this now?

One of the things I think we can do is publish some basic grafana dashboards for monitoring jupyterhub. That's certainly a place where there's lots of probably unnecessary copying and pasting that could be replaced by importing. I'm not sure what the role of something that bundles jupyterhub, prometheus, etc. would be - a 'distribution' chart that's more complete? I'm not so sure. I think documentation / examples might be the right level, plus publishing one or more example grafana dashboards.

Maybe it's time for another online JupyterHub workshop - we had one of these a couple of years ago, especially bringing in supercomputer folks to discuss things like BatchSpawner. Perhaps this could be a topic for one later this spring?
Publishing Grafana dashboards is a nice idea. We had one shared in dask/distributed#3136 but perhaps we should go further and put some in the Dask documentation.
I think part of this discussion should be about who Z2JH is aimed at. There was an informative discussion on a related issue; that particular issue was resolved, but to me it's still not clear whether there's overall agreement on the current aims of Z2JH.
@minrk check out https://github.com/yuvipanda/jupyterhub-grafana! I've some more changes to it that I should push.
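Picking up the "publish some dashboards" thread: a Grafana dashboard is ultimately just JSON, so even a tiny generator can make them reproducible and shareable. Below is a hypothetical sketch; `jupyterhub_active_users` is one of the Prometheus metrics the Hub exposes, but the panel schema here is deliberately simplified, not a drop-in Grafana export like the repo linked above:

```python
import json

def make_dashboard(title, metric, datasource="prometheus"):
    """Build a minimal Grafana-style dashboard dict with one time-series
    panel graphing a single Prometheus metric (simplified schema)."""
    return {
        "title": title,
        "panels": [
            {
                "type": "timeseries",
                "title": metric,
                "datasource": datasource,  # assumed datasource name
                "targets": [{"expr": metric}],
            }
        ],
        "schemaVersion": 36,
    }

# Serialize for sharing or for import via Grafana's HTTP API / UI.
dashboard = make_dashboard("JupyterHub overview", "jupyterhub_active_users")
print(json.dumps(dashboard, indent=2))
```

A real published dashboard would carry many panels (server starts, proxy latency, pod restarts), but generating the JSON from code is what makes it importable across deployments rather than copy-pasted.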
I usually deploy many JupyterHubs per cluster, and one 'support' chart per cluster. The support chart (like this) usually installs:
Maybe we can publish this?
Maybe @MridulS can say something with respect to the GESIS perspective - some coordination could indeed be useful
I know supercomputing was mentioned more in passing here but there are things going on with Kubernetes there including a few JupyterHub deployments. I think that community would be interested in participating in the conversation.
A few thoughts in response to others above:
Totally agree - I think our approach of "document the pattern well first, and only when absolutely necessary create a tool to automate it" has been quite useful at keeping the tech modular. I'd also be curious to hear from @costrouc on the experience in wrapping all of this complexity into a single package w/ feature flags and such. Perhaps that is a good data point to see what this looks like on the "solve it with tech" vs. "solve it with docs" question. Maybe a good first step would be to document how to deploy a Kubernetes cluster w/:
Now that I've typed that, I guess that's basically just "the Pangeo Model". What if we made it a little documentation site, we could call it The Pangeo Way 😅. I think the challenge there is extending that model to new applications. If people wanted to add on new charts etc, how could they do so gracefully? Is there a way to capture that complexity with documentation?
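The "deploy with minimal human intervention" part of that model is often a thin wrapper that templates per-hub config and shells out to helm (this is roughly the shape of 2i2c's pilot-hubs automation). A minimal, hypothetical sketch: `dask/daskhub` is the published chart name from helm.dask.org, but the release/namespace names and values layout are made up, and the actual `helm` invocation is left commented out so the sketch runs without a cluster:

```python
import subprocess
import tempfile

def deploy_hub(release: str, namespace: str, values_yaml: str,
               chart: str = "dask/daskhub") -> list[str]:
    """Write a values file and build the helm command that would deploy
    one JupyterHub + Dask Gateway stack. Returns the argv for inspection."""
    with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
        f.write(values_yaml)
        values_path = f.name
    cmd = [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", namespace, "--create-namespace",
        "-f", values_path,
    ]
    # subprocess.run(cmd, check=True)  # uncomment against a real cluster
    return cmd

cmd = deploy_hub("staging", "staging", "jupyterhub:\n  proxy: {}\n")
print(" ".join(cmd[:5]))
```

The interesting design question from the thread is exactly what goes in `values_yaml`: how much is a shared, documented pattern versus per-deployment bespoke config.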
Just for now, I think the plan is to revert to upstream, but this is a short-term hack :-) https://github.com/2i2c-org/pilot-hubs/blob/master/hub-templates/daskhub/values.yaml#L2
That is a great idea @minrk! And @rcthomas, we should definitely bring the HPC world in as well.
This idea sounds great. I worry a little about namechecking Pangeo in the title, though. The Pangeo community is awesome and has done a lot to bring geoscientists together around these tools. However, this stack is useful well beyond the geosciences, and in my experience folks outside of it have dismissed it because they do not identify with that community.
☝🏻 We occasionally see a similar problem with "The Turing Way" ("My boss won't let me contribute because it's under the Turing Institute's umbrella"). Would it be too snubbing of Pangeo to call it "The Jupyter Way", or should we come up with something completely different?
Maybe "The Jupyter Hub Way" would work? And be more specific. I don't think it would be snubbing Pangeo, especially if there was a history page containing references to Pangeo.
The problem of the name / organizational ownership precluding contribution is a serious one that we need to think about. All the organizations named above have invested effort in developing some sort of product that is branded in some way (Pangeo, Qhub, DaskHub, 2i2c hubs, etc.). Although we all believe in collaboration, we all have strategic incentives to maintain our brand. This is an issue for Pangeo, but, from my perspective, we would happily trade that brand recognition for a more functional / maintainable code base. The tradeoff for for-profit companies may be different.

So one key question is: what is the name of the thing we would all feel comfortable rallying around? IMO, "The Jupyter Hub Way" doesn't quite have the zing we need to inspire people. I'm particularly interested to hear from the Quansight folks (@costrouc / @dharhas) what sort of name / structure / organization might entice you to upstream some of the things you've built. Conversely, maybe the most convenient name for the thing is in fact "qhub," since they have put significant work into documenting / packaging it?

Related to this question of naming is the question of governance.
I sat in on the JupyterLab RTC meeting today, and they noted that one major upcoming challenge is authorization. @echarles also provided some helpful context here. It sounds like this might be another helpful use-case to coordinate, related to the "multi-application deployments" question, because if you're deploying a Dask Gateway cluster, then you're also probably interested in making sure it's only used by the people you want to use it.
I think it's very common for Dask Gateway to defer authentication to Jupyter Hub.
I think there's lots of good, nitty gritty work on authorization integrations, especially building on the RBAC work, to add authorization controls and integrations for things - i.e. user X has collaboration permissions on user Y's server, and user Y can launch a cluster with up to N cpus, etc. Authentication is the coarsest "who are you" layer of that, but I think we can help on some pressing issues here for nicer ways to manage permissions. It's all mostly technically possible right now, but we can definitely smooth it out a lot with some effort.
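As a concrete sketch of the kind of permission described above ("user X has collaboration permissions on user Y's server"), assuming JupyterHub 2.0's RBAC roles; the user names are illustrative, not from this thread:

```python
# jupyterhub_config.py (fragment) — a sketch, assuming JupyterHub >= 2.0 RBAC.
# "user-x" and "user-y" are illustrative names.
c.JupyterHub.load_roles = [
    {
        "name": "collab-access-user-y",
        # the access:servers scope, filtered to user-y, lets the role's
        # holders connect to user-y's running server
        "scopes": ["access:servers!user=user-y"],
        "users": ["user-x"],
    },
]
```

The quota-style controls ("up to N cpus") don't map onto Hub scopes today; that would live in Dask Gateway's own limits config, which is part of the smoothing-out work described above.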
Thanks everybody for the feedback about branding and such, that is a good point @sgibson91 / @rabernat - I think it's important that we not come across as favoring a specific project in community tooling / standards. Also agreed that, while Jupyter is a good multi-stakeholder community for this, "The JupyterHub Way" is some combination of not punchy enough and also not specific enough.

Here's a proposal that initially focuses on documentation improvements and additions. It is inspired a bit by the Divio documentation framework, and I think it could help us structure the docs in a way that is more easily extensible to these new use-cases.
In the process of doing this, two things might become clear:
This feels like a substantial amount of work, and potentially a good topic for a CZI EOSS application. We've discussed this a bit in #380 - but I think that creating a high-quality pattern like this would be of immense value across the community. Do others agree?
"JupyterHub conceptual intro" jupyterhub/jupyterhub#2726 seems very relevant here, and practically a pre-requisite for anyone who wants to do advanced configuration. I know I should get around to it again, but I haven't found time. Anyone else who would like to help is more than welcome!
@choldgraf I like your suggestion of logically splitting up the docs a bit more. A potential added benefit is that others may feel more empowered to write their own guides, since there's no longer one central guide.

I think it's also worth looking at ways to make it easier to test the docs. If someone opens a docs PR against the Z2JH guide, you either need to set up the full environment and then copy and paste each step to validate the change, or (more likely) just assume it works. Could we e.g. do something with Jupyter bash notebooks so you can extract the commands and run them in CI, or do something with executable books? It'd be a pretty neat demonstration of the Jupyter ecosystem 😃. I started playing with Katacoda this week: https://katacoda.com/manics/scenarios/jupyterhub-kubernetes
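One lightweight version of "extract the commands and run them in CI" can be sketched in a few lines of Python: scan the markdown source for bash/shell fenced blocks and hand their contents to the CI runner. This is a toy sketch, not an existing tool:

```python
import re

# Build the fence string programmatically so this example doesn't need to
# contain a literal triple backtick inside its own code block.
FENCE = "`" * 3

def extract_shell_blocks(markdown: str) -> list[str]:
    """Return the contents of bash/shell/sh fenced code blocks in a
    markdown document, ready to be replayed by a CI job."""
    pattern = re.compile(FENCE + r"(?:bash|shell|sh)\n(.*?)" + FENCE, re.DOTALL)
    return [block.strip() for block in pattern.findall(markdown)]

doc = FENCE + "bash\nhelm repo update\n" + FENCE
print(extract_shell_blocks(doc))  # → ['helm repo update']
```

A CI job could then replay each extracted block against a throwaway cluster (e.g. kind) and fail the PR if any command errors, which is essentially what the bash-notebook and executable-book ideas would formalize.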
I'm a bit late here overall (both to this conversation, also my overall efforts have changed and I spent less time on Jupyter anyway - this would have been much more interesting to me a few years ago, which is too bad). This also seems mostly focused on kubernetes, which I don't have much to add to. But... out of curiosity, does anyone run jupyterhub in kubernetes where UIDs ( |
Following up on @rcthomas' comment FYI @zonca @danielballan |
yes! this would be very useful for me as well. For example: |
@zonca - I'm curious, in those kinds of documentation, how much do you end up needing to write specific to JetStream, vs. how much of it can "assume you have K8S and a basic JupyterHub, and the rest is the same across cloud platforms"? Also, those tutorials are 👌👌👌 - it would be great if you could contribute some of that content to a future shared documentation effort.
it depends, most of my effort is in installing Kubernetes itself, and that is all customized to Jetstream, but that is outside of the scope presented here. |
With respect to customizations at the JupyterHub level (inspired by the comment about "specific to JetStream"), we maintain a few, and we have a few custom services that might be interesting, but I've never really felt like there was much demand, or a just-right place for me to document them. It would be great to have a proper place for those kinds of things.
@choldgraf this is possible today. https://gateway.dask.org/authentication.html#using-jupyterhub-s-authentication
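Per those docs, the wiring is roughly two config fragments: the gateway authenticates requests against the Hub's API, and the Hub registers the gateway as a service. A sketch following the linked documentation; the token and URL values are placeholders:

```python
# dask_gateway_config.py (gateway side) — sketch based on the linked docs
c.DaskGateway.authenticator_class = (
    "dask_gateway_server.auth.JupyterHubAuthenticator"
)
c.JupyterHubAuthenticator.jupyterhub_api_token = "<shared-api-token>"
c.JupyterHubAuthenticator.jupyterhub_api_url = "http://<hub-address>/hub/api"

# jupyterhub_config.py (hub side) — register the gateway as a service
# so the shared token above is accepted by the Hub API
c.JupyterHub.services = [
    {"name": "dask-gateway", "api_token": "<shared-api-token>"},
]
```

In the daskhub helm chart this wiring is done for you, which is part of why bundling the two charts together is attractive.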
@rkdarst Typically this does not come up. User directories are generally k8s volumes stored in block or NFS storage and mounted to each Jupyter session, so everyone is the same user inside the container.

I think in situations like you mention, or on HPCs/supercomputers, the user-level namespacing in Docker/containerd isn't particularly helpful, as there is already mature user management. I guess this is why tools like Singularity are useful: they've cherry-picked the container features that are useful, like images, but ignored the features that are already mature on those systems, like user and network namespacing.
QHub uses full Linux permissioning so we can have shared folders for different Linux groups, etc. We use libnss-wrapper to make this work.
This proposal wasn't funded, but others were, which is great! On to the next call... |
Background
In several different spaces now we have seen people build technology to facilitate deploying JupyterHubs on Kubernetes alongside other applications. A few examples:
(I would also be curious to hear what @arnim or @bitnik or @manics or @sgibson91 have done with their deployments, if they've had similar challenges).
Similarities and differences
I think there are a few pieces that we tend to see across all these deployments: JupyterHub, BinderHub, Dask Gateway, Prometheus, Grafana. Then there are common needs across all of them: environment and user management, connections between these applications, etc.
The main differences are in the implementation - some use Terraform, some use Helm, some use a combination of the two. Some are more generic and user-facing, while others are bespoke for a specific group or project.
So it seems like many projects (including the JupyterHub project itself) have a common need: deploying JupyterHub in a flexible manner, with minimal human intervention, alongside other applications.
Question
Is there an opportunity to coordinate and streamline some development, and to share infrastructure for this use-case across these projects? I think there would be value in a community-driven project that addresses this use-case and is flexible enough that some or all of these other projects could utilize it, rather than re-creating similar technology.
Would love to hear what people think. If we could identify a strong enough need, I bet we could also turn this into a funding opportunity to support work along these lines.