
Add language server (LSP) support in notebook server and jupyterlab-monaco #26

Closed

Conversation

@gnestor commented Oct 24, 2018

Add language server (LSP) support to notebook server/jupyter-server and jupyterlab-monaco

Problem

There is an m * n complexity problem in providing a high level of support for any programming language in any editor, IDE, or client endpoint (e.g. jupyterlab, jupyter notebook, nteract).

The Language Server Protocol (LSP) is a protocol by which programming languages provide code intelligence, such as autocomplete, go to definition, and find all references, to client applications such as code editors (like VS Code) and code managers (like Sourcegraph and Sourcegraph for GitHub).

Once a language server is created for a programming language (e.g. the Python language server), it becomes an m + n problem because each client application only needs to build an integration with that server (e.g. Python for VS Code), rather than building both the code intelligence and the integration.

The overall complaint is that the JupyterLab/Jupyter Notebook code editing experience is missing many of the luxuries of modern code editors and IDEs, such as static code analysis for features like autocomplete for unexecuted code, type checking, linting, formatting, go to definition, and find all references. For a popular review of these complaints, see slides 60-71 of I Don't Like Notebooks.

Proposed Enhancement

The monaco-editor is an open-source text editor written in JavaScript and used in VS Code.

jupyterlab-monaco is a JupyterLab extension that allows users to edit documents using monaco-editor.

monaco-editor ships with language features for TypeScript, JavaScript, CSS, LESS, SCSS, JSON, and HTML. In order to support other languages (e.g. Python), it must connect to a language server (e.g. the Python language server).

TODO

  • Create a notebook server extension to allow clients to connect with language servers
    • Provide an HTTP interface for clients to communicate with language servers via websockets (a minimal handler sketch follows this list)
      • e.g. /lsp/python/ or /lsp/r/.
  • Add a new set of classes to
    @jupyterlab/services
    that provide a client-side interface to those endpoints?
  • Add LSP support to monaco-editor in jupyterlab-monaco
    • Using classes from @jupyterlab/services?
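
To make the websocket endpoint idea concrete, here is a minimal sketch of what such a notebook server extension handler could look like, assuming tornado-style notebook server handlers. The endpoint pattern, the `LANGUAGE_SERVERS` mapping, the `pyls` command, and the class names are illustrative assumptions, not part of the proposal:

```python
import subprocess

from notebook.base.handlers import IPythonHandler
from notebook.utils import url_path_join
from tornado import websocket

# Hypothetical mapping from the URL language segment to a language server command.
LANGUAGE_SERVERS = {
    "python": ["pyls"],                                   # assumes pyls is installed
    "r": ["R", "--slave", "-e", "languageserver::run()"],
}


class LanguageServerWebSocketHandler(websocket.WebSocketHandler, IPythonHandler):
    """Proxy LSP JSON-RPC messages between a websocket client and a language server."""

    def open(self, language):
        # One language server subprocess per websocket connection.
        self.proc = subprocess.Popen(
            LANGUAGE_SERVERS[language], stdin=subprocess.PIPE, stdout=subprocess.PIPE
        )

    def on_message(self, message):
        # Forward the client's JSON-RPC payload, adding the Content-Length
        # framing required by the LSP base protocol.
        body = message.encode("utf-8")
        self.proc.stdin.write(b"Content-Length: %d\r\n\r\n" % len(body) + body)
        self.proc.stdin.flush()
        # Relaying the server's responses (read from self.proc.stdout) back to the
        # client with self.write_message is omitted here for brevity.

    def on_close(self):
        self.proc.terminate()


def load_jupyter_server_extension(nb_app):
    web_app = nb_app.web_app
    route = url_path_join(web_app.settings["base_url"], r"/lsp/(?P<language>\w+)/?")
    web_app.add_handlers(".*$", [(route, LanguageServerWebSocketHandler)])
```

A client would then open a websocket to e.g. /lsp/python/ and exchange LSP JSON-RPC messages over it.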

Notes

Libraries

Reference implementations

References

@ellisonbg @ian-r-rose @jasongrout

@gnestor gnestor changed the title Add proposal for LSP support in jupyterlab-monaco Add proposal for language server support in notebook server, jupyterlab, and jupyterlab-monaco Oct 24, 2018
@bollwyvl commented Oct 24, 2018 via email

@blink1073 (Contributor) left a comment

Nice, thanks! I agree with @bollwyvl that a long term goal should be to be editor-agnostic, but using an existing implementation in the short term is nice too 😄

@gnestor (Author) commented Oct 25, 2018

Agreed. I think that the @jupyterlab/services classes will act as an interface between the server extension and the lab extension:

jupyterlab-monaco -> @jupyterlab/services -> lsp server extension -> language server(s)

or

@jupyterlab/codemirror -> @jupyterlab/services -> lsp server extension -> language server(s)

Open source Sourcegraph is very interesting! This part looks relevant: https://github.com/sourcegraph/sourcegraph/blob/master/src/backend/lsp.tsx#L3

I guess that VS Code is another valuable reference implementation, although at first glance I can't find an LSP proxy equivalent.

Finally, if we embrace LSP, we'd want to make the Notebook format
discoverable to language servers. No thoughts on how that might be
accomplished generally, though there might be some worthy experiments to be
done with python, as its machinery is fairly well exposed.

@bollwyvl Can you clarify? Would this provide intellisense when editing a notebook file in the editor (JSON)? If so, this may be a valuable resource: https://github.com/sourcegraph/sourcegraph-extension-api

@ellisonbg (Contributor)
Great, thanks for writing this up @gnestor

  • I too agree that it would be nice to have the core capabilities in JLab be editor agnostic.
  • Given its scope, I think this server extension should go into a repo in the main jupyter org, alongside the new jupyter_server work. Thoughts on what we should call it? jupyter_language_server? jupyter_lsp?
  • I think the next step would be to spec out the REST API using swagger (in that new repo).

@gnestor gnestor changed the title Add proposal for language server support in notebook server, jupyterlab, and jupyterlab-monaco Add language server (LSP) support in notebook server and jupyterlab-monaco Jan 15, 2019
@gnestor (Author) commented Jan 15, 2019

I too agree that it would be nice to have the core capabilities in JLab be editor agnostic.

I agree. However, I checked out https://github.com/brijeshb42/codemirror-lsp and unfortunately, it actually invisibly mounts a monaco-editor instance to the DOM and uses it to interface codemirror with language servers. So even the codemirror integration is using monaco-editor under the covers. I think that we should depend on monaco-editor to get this working in jupyterlab, and once a solid codemirror integration is available, we can step back and design a generic interface that both monaco-editor and codemirror can use.

Given its scope, I think this server extension should go into a repo in the main jupyter org, alongside the new jupyter_server work. Thoughts on what we should call it? jupyter_language_server? jupyter_lsp?

I say jupyter_language_server.

@willingc (Member)
alongside the new jupyter_server work

@ellisonbg Since we haven't had a larger team meeting in a while, what's the new jupyter_server work?

@gnestor (Author) commented Jan 16, 2019

@willingc This is something that @Zsailer has been working on. I believe the intention is to decouple it from the notebook repo and standardize the way that clients interface with it (e.g. by inheriting it or by providing server extensions). @ellisonbg and @Zsailer can provide more context.

https://github.com/jupyter/jupyter_server

@rgbkrk (Member) commented Jan 16, 2019

Thank you so much for raising this up as something generic to use across all frontends; this makes me pleased to review and contribute.

@parente (Member) commented Jan 16, 2019

Jupyter Server Design and Roadmap Workshop is one of the community workshops accepted and planned for 2019: https://blog.jupyter.org/jupyter-community-workshops-a7f1dca1735e. I believe @lresende submitted the proposal, but I hope @Zsailer and others interested in jupyter_server will be able to participate.

@Zsailer (Member) commented Jan 16, 2019

@willingc @gnestor @parente

I believe the intention is to decouple it from the notebook repo and standardize the way that clients interface with it (e.g. by inheriting it or by providing server extensions)

Yes! I've been working hard on revamping the jupyter_server enhancement proposal. As soon as I finish my full draft, I'll ping everyone on this thread. I'm hoping to have this ready very soon. I hope you all like it! 😃

I believe @lresende submitted the proposal, but I hope @Zsailer and others interested in jupyter_server will be able to participate.

👍 Yup! @lresende, @kevin-bates and I worked together to make the proposal happen.

To pique your interest, here are some highlights from my current draft:

  1. Split the server specific code and notebook frontend into two separate repos.
  2. Split extensions and server configurations into separate files, following a conf.d approach.
  3. Add migration app to make the transition easier on users.
  4. Add "namespacing" under the static endpoints (thanks @bollwyvl) for extensions to serve static files.
  5. Create a new Jupyter application class for server extensions. This type of application allows you to launch and configure extensions from the command line (firing up a jupyter server along the way). This is similar to how jupyterlab works now.
  6. Turn the classic notebook into a server extension.

Obviously there is a lot of conversation that needs to happen once this new draft is posted, as this affects many projects around the jupyter ecosystem. I look forward to discussing this with you all soon. 😃

For some semi-recent discussion on this topic, see this issue.

@bollwyvl commented Jan 18, 2019 via email

@krassowski (Member) commented Aug 20, 2019

I pursued a simple and dirty solution:

  • reuse as much of lsp-editor-adapter as possible by inheritance, replacing its components (autocompletion, tooltips) with the native JupyterLab widgets; it is slowly evolving into a fully independent codebase
  • interfacing with the server is done via a websocket connection using jsonrpc-ws-proxy
  • for the notebook integration, create a virtual document that is a concatenation of all the cells (maintaining the line number-to-code-cell mapping and allowing cells starting with an IPython cell magic to be filtered out - this will be configurable) and implement the CodeMirror interface subset needed for the adapter to work; the code is not cleaned up yet but the demo works ok.
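
For readers skimming, here is a rough Python illustration of the virtual-document idea (the actual jupyterlab-lsp implementation is in TypeScript); the function and variable names are illustrative only:

```python
def build_virtual_document(cells, skip_prefix="%%"):
    """Concatenate code cells into one document.

    Returns (source, line_map) where line_map[virtual_line] = (cell_index, cell_line),
    so positions reported by the language server can be mapped back to cells.
    """
    lines, line_map = [], {}
    for cell_index, source in enumerate(cells):
        if source.startswith(skip_prefix):      # e.g. a %%bash cell: leave it out
            continue
        for cell_line, line in enumerate(source.splitlines()):
            line_map[len(lines)] = (cell_index, cell_line)
            lines.append(line)
    return "\n".join(lines), line_map


source, line_map = build_virtual_document(["import os", "%%bash\nls", "os.path.join('a')"])
# source == "import os\nos.path.join('a')"; line_map == {0: (0, 0), 1: (2, 0)}
```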

@TylerLeonhardt
Hi there! I work on the PowerShell language server. We're unique in that since PowerShell is also a shell (in addition to a language), the interactive/REPL experience is also very important. To support this experience, we add on a message to the Language Server Protocol that we call evaluate. This message includes with it the code that you want to run... while the code is evaluating, if anything is written to the output/stdout, a message is fired called output that contains... well... the output that was just emitted.

The reason I bring this up is because, between us and Jupyter, there is more than one scenario for the Language Server Protocol to support a native evaluate message of sorts, and I'm wondering if that would be enough for you folks to support a "kernel" that is simply a language server...

There might be some conversation that could be had between Jupyter & LSP to better support each other. This plays nicely with both Jupyter adoption skyrocketing in the LSP space (the python ext for vscode Jupyter support, Azure Data Studio's top-notch Notebooks experience) and the want to bring LSP goodness (all the symbols support and whatnot) to Jupyter scenarios.
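
For illustration, a non-standard evaluate request and output notification along the lines described above might look roughly like the following JSON-RPC payloads; the method names follow the comment above, and the exact fields are assumptions rather than the PowerShell implementation's actual schema:

```python
# A custom "evaluate" request (not part of the LSP spec) and the "output"
# notification emitted while it runs, expressed as JSON-RPC payloads.
evaluate_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "evaluate",
    "params": {"expression": "Get-Date"},
}

output_notification = {
    "jsonrpc": "2.0",
    "method": "output",
    "params": {"output": "Tuesday, December 3, 2019 9:00:00 AM\n"},
}
```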

@bollwyvl commented Dec 3, 2019

@TylerLeonhardt Interesting stuff! Specific to your case, you may wish to reach out to the folks on the powershell kernel!

More broadly, there's a bit of impedance between the JSON RPC LSP model and the mixed state model of the Jupyter Kernel Message spec. Can you interrupt an evaluate? Can you send another evaluate while the first one is still running? What if a language server wants to send something other than strings? What if it wants to ask you something on stdin? What if multiple clients want to evaluate? Can you ask if an evaluate is still running? JKM was built execution-first, so we've gone down all these rabbit holes, and now can't change many of our rules: JKM has 32 message types, we've been trying to deprecate some of them, and few, if any, kernels implement all of them.

We're working through some of these issues on jupyterlab-lsp (mentioned above), considering multiple clients talking to multiple servers, extension into the multi-user space, "middleware" to support language (server) quirks, etc. The implementation is getting more robust, and we hope at some future time to be able to propose it for inclusion into Jupyter(Lab|Server) core. But, last I checked, LSP defines about 120 message types, and seems to be growing at a fair clip. I think we handle maybe 10 at this point. So there's a lot to do!

a "kernel" that is simply a language server...

Indeed, I've been tinkering with the opposite direction, making a Language Server out of a kernel!

I think, today, one could definitely write a one-size-fits-most wrapper kernel that, with a little bit of JSON(-E) skullduggery, could wrap up an existing Language Server JSON RPC in enough metadata to make it play nice with the rest of the jupyter tooling without spreading too much headache in either direction. But as I understand it today, a second language server process would be required, as most Language Servers are not written to accept multiple clients with different capabilities, and no doubt a kernel would be a good deal different than, say, an editor.
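
As a rough sketch of the plumbing such a wrapper kernel would need: the LSP base protocol is just Content-Length-framed JSON-RPC over stdio. The `pyls` command and the initialize parameters below are assumptions for illustration:

```python
import json
import subprocess

# The "pyls" command is an assumption; any stdio language server would do.
proc = subprocess.Popen(["pyls"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

def send(message):
    # LSP base protocol: Content-Length header, blank line, then the JSON body.
    body = json.dumps(message).encode("utf-8")
    proc.stdin.write(b"Content-Length: %d\r\n\r\n" % len(body) + body)
    proc.stdin.flush()

def receive():
    # Read headers until the blank line, then exactly Content-Length bytes.
    length = 0
    while True:
        header = proc.stdout.readline().strip()
        if not header:
            break
        name, _, value = header.partition(b":")
        if name.lower() == b"content-length":
            length = int(value)
    return json.loads(proc.stdout.read(length))

send({"jsonrpc": "2.0", "id": 0, "method": "initialize",
      "params": {"processId": None, "rootUri": None, "capabilities": {}}})
print(receive())
```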

There might be some conversation

In the short term, there's some funding available for just such outreach, and we might propose something as part of jupyterlab-lsp.

Hinted at in that issue, I think JKM and LSP would benefit from better human- and machine-interoperable specifications, without reference implementation hangovers. The best model to follow here is probably DAP, but I would prefer something a bit more PR-able and composable than a monolithic JSON file (ahem notebooks).

want to bring LSP goodness (all the symbols support and whatnot) to Jupyter scenarios.

I've got a heavily WIP branch which adds (hierarchical) symbols, and indeed, it's really quite nice. We've got some plumbing to do to be able to land that, but we should be able to start going after those 100-odd missing messages!

@krassowski (Member)
I just wanted to highlight a new issue on the future of the jupyterlab-lsp repository in jupyterlab/team-compass, which may be of interest to the wider Jupyter community (especially those who have engaged in this conversation on Language Server Protocol adoption).

Please feel invited to chime in and give your thoughts over here, even if your use case/interest is not specific to JupyterLab but is relevant to the LSP extensions that we have developed.

@MSeal (Contributor) commented Jun 9, 2020

Thanks for putting so much into the PoC and documentation here and elsewhere. It seems like there's momentum on this JEP at the moment to discuss it more seriously. I think we need to assess if Jupyter is willing to support a second protocol for the JEP side of things here. A lot of good work was put into prototyping this out and making a working example, and I don't want to discount that work, but as I understand it, it would be a serious shift for a lot of libraries to adopt a new protocol, and I would want to have an idea of the remaining work and migration plans (not in full, but at least at a high level) if the community were to move forward with LSP-based language interfaces.

In an attempt to better understand, here's my take on the proposal so far:

LSP brings:

  • Much better editor experience options
  • Consistency with other editor protocol patterns
  • Reduced friction in adopting new languages in Jupyter
  • More maintainers that write good code and are proactive in supporting their technologies

Cost of Core LSP adoption:

  • Kernel clients would need to opt into the LSP pattern until it became the de facto standard protocol
  • Jupyter maintainers would need to grow into LSP experts to help maintain services / debug issues
  • Jupyter protocols would need to be updated to account for new message patterns / requests, even if it's an isolated protocol addition
  • The major systems in Jupyter would need migration plans to adopt over time
  • Likely some major version changes needed in places to fully support?

Unknowns (for me at least):

  • How kernels would need to be changed to fully adopt LSP
  • What changes to the kernel protocol would explicitly be made to support LSP
  • Would anything changed here prevent headless execution (I saw references across the linked material to running an LSP interpreter as a widget, for example, which may not play nicely without a browser or may need extra work for dual evaluation)
  • How much work would be needed in each of
    -- jupyter_client
    -- jupyter lab (seems like the PoC have most of this scoped out?)
    -- nbclient
    -- notebook (server)
    -- nteract
  • Is there still common assent for adoption, or any strong opinions against
  • What are the bad parts of the LSP adoption tradeoff outside of making the initial code changes (maintenance burden for the number of LSP commands?)

@blois commented Jun 9, 2020

As an observer, I'm also curious about overlap with the debugger: it sounds like both of these have backend components to replicate the notebook code as files on disk? Can these be unified?

Another potential cost of the LSP is supporting full notebook Python syntax: cell magics are called out in #26 (comment), but line magics are a bit more complicated. I've thought that a kernel API which provided the result of the input transformer along with a source map of the transformations would be needed to avoid otherwise effectively deprecating line magics, but there may be alternate shortcuts.
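
For context, recent IPython already exposes the magic-to-Python transformation itself via TransformerManager.transform_cell; the source map of the transformations is the missing, hypothetical piece. A quick illustration:

```python
from IPython.core.inputtransformer2 import TransformerManager

tm = TransformerManager()
print(tm.transform_cell("%time x = sum(range(10))"))
# prints roughly: get_ipython().run_line_magic('time', 'x = sum(range(10))')
```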

@krassowski (Member)
@blois both magics (cell and line) are supported in jupyterlab-lsp, but the implementation (as are all transformations) is on the frontend. Have a look here: https://github.com/krassowski/jupyterlab-lsp/tree/master/packages/jupyterlab-lsp/src/magics. The idea was that one could install additional extensions providing magics support for specific kernels/packages (such as rpy2), as many magics are defined in third-party extensions and are not exclusive to a single kernel. Having JupyterLab in mind, I designed it to be frontend (typescript) oriented, but now I see that for the wider community it might have been useful to have all of this logic on the backend (not saying that this would be a perfect solution either)...

Kernels could provide simple transformers as JSON packages (regular expression + name), but the problem is that one cannot transform more complex things with regular expressions alone...
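
A toy sketch of what such a declarative (name + regular expression) transformer could look like, and why it only goes so far; the patterns below are illustrative, not proposed definitions:

```python
import re

# name, pattern, replacement seen by the language server
MAGIC_TRANSFORMERS = [
    ("time_magic", r"^%time\s+(?P<code>.*)$", r"\g<code>"),   # keep the wrapped code
    ("bash_cell",  r"^%%bash[\s\S]*$",        ""),            # hide the whole cell
]

def transform(source):
    for _name, pattern, replacement in MAGIC_TRANSFORMERS:
        source = re.sub(pattern, replacement, source, flags=re.MULTILINE)
    return source

print(transform("%time total = sum(range(10))"))   # -> total = sum(range(10))
```

Anything that needs real parsing (nested quoting, magics with arguments that contain code) quickly breaks out of what a single regular expression can express.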

@krassowski (Member) commented Jun 9, 2020

@blois @MSeal I just want to highlight that in the jupyterlab-lsp (PoC) implementation of LSP we went for the kernel-agnostic approach, which has the benefit of working even if your kernel dies or resets. In fact, in the very first PoC version the LSP servers were running on the local machine and we were hypothesising about allowing users to connect to a third-party LSP server provider (like sourcegraph). This is very different from the debugger team's approach.

Edit: of course, even if we were designing the final version to work without the kernel, the kernel could still provide extra features (such as the magics transformations etc).

Edit 2: I probably should highlight that what I and @bollwyvl referred to in the other thread as reusing kernel comms is distinct from changing the kernel comms (and actually does not require any changes to the Jupyter comms specs at the moment)

@bollwyvl commented Jun 9, 2020 via email

@choldgraf (Contributor) commented Jun 10, 2020

Just a note from Jupyter Book's perspective: we support an enhanced flavor of markdown called MyST markdown that has all of the syntax needed for Sphinx features. We've already got an LSP implementation of the MyST markdown syntax (and a corresponding vscode plugin for it). It would be great to have LSP within Jupyter tools so that users could benefit from this there as well (not to mention that there are LSPs for rST etc. as well).

All of this is to say that I think in some cases, LSP is not just for developers, but also for people writing content with markup languages.

@bollwyvl
@choldgraf vscode plugin for it.

Looking at that code, I was getting excited, as I've been looking for a robust, easy-to-install markdown language server. But alas, it appears to be An extension to An LSP Client, not An LSP Server that would work with Any LSP Client. Anyhow, it's also not published on npm, so it's hard for anybody outside the vscode walled garden to reuse it... and somebody could namesquat it right now, and you'd have no recourse.

To move any of that closer to A Jupyter Client, I'd probably start with a codemirror mode, which all of the Jupyter clients could then use. I charted a really dark path to using your tmbundle directly (spoiler: it involves wasm), but I haven't had the heart to warm it up in a while.

As for the multi-language embedding (neat!): that's what the magic/transclusion stuff above is on about. The challenge is whether we get to a place where those kinds of rules can be declared and shared, rather than having to be reimplemented everywhere.

LSPs for rST

Again, on the lookout for an easy to install/extend one!

@captainsafia (Member)
nteract: does it want to be an IDE?

With regard to LSP-support, this is something that I would expect to have as part of the nteract desktop app (and the core SDK). I believe that the features that an LSP provides are useful, even in a non-IDE environment.

On the other hand, things like debugging support don't make sense to include in the desktop app but we do have an issue to track adding support in the SDK so that other apps using our JS packages can integrate debugging.

What are the bad parts of the LSP adoption tradeoff outside of making the initial code changes (maintenance burden for the number of LSP commands?)

The LS protocol is evolving, so both clients and kernels have to be mindful about staying on top of the protocol. IMO, it's a fair price to pay for the benefits of consistency with other tools in the editor ecosystem.

@bollwyvl
Thanks @captainsafia.

LSP-support, this is something that I would expect to have

Would the existence of a reference "LSP Proxy Kernel" to any number of Language Servers that spoke LSP on comm-per-server help move nteract closer to offering that to users? Or is it more likely to reuse all the vscode stuff on the node side and do a custom bridge to the chromium side?

@captainsafia (Member)
Would the existence of a reference "LSP Proxy Kernel" to any number of Language Servers that spoke LSP on comm-per-server help move nteract closer to offering that to users? Or is it more likely to reuse all the vscode stuff on the node side and do a custom bridge to the chromium side?

I suspect that the costs for both would be about the same.

All of the experimentation I have done has been around using VS Code's Node APIs to connect to a language server that I manually launch.

An LSP proxy kernel might introduce some other challenges that would typically be handled by VS Code's LS APIs.

The implementation cost for nteract aside, I'm in favor of any implementation that is relatively client-agnostic. Having access to Node-specific APIs shouldn't be a deterrent to clients that might not be implemented in Node. An LS-over-comms approach might make more sense for that.
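
A kernel-side sketch of the LS-over-comms idea: register a comm target that forwards LSP JSON-RPC payloads to a language server subprocess. The comm target registration is real ipykernel API; the "lsp" target name, the `pyls` command, and the message shape are assumptions for illustration:

```python
import json
import subprocess

from IPython import get_ipython

def lsp_comm_target(comm, open_msg):
    # One language server per comm; "pyls" is an assumption for illustration.
    proc = subprocess.Popen(["pyls"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    @comm.on_msg
    def _forward(msg):
        # The frontend sends a JSON-RPC message in msg["content"]["data"]; frame it
        # per the LSP base protocol and hand it to the language server.
        body = json.dumps(msg["content"]["data"]).encode("utf-8")
        proc.stdin.write(b"Content-Length: %d\r\n\r\n" % len(body) + body)
        proc.stdin.flush()
        # Relaying responses from proc.stdout back with comm.send() is omitted.

# Register the target on the running kernel.
get_ipython().kernel.comm_manager.register_comm_target("lsp", lsp_comm_target)
```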

@westurner commented Jun 13, 2020 via email

@lresende (Member)
Once a language server is created for a programming language (e.g. the Python language server), it becomes an m + n problem because each client application only needs to build an integration with that server (e.g. Python for VS Code), rather than building both the code intelligence and the integration.

Considering the proposal will also start exposing portions of the JupyterLab logic as part of language servers, this would actually also help VS Code to adopt/expose these functionalities in their environment. Have we thought about the consequences to the Lab ecosystem, particularly when we consider the discussion we had about the JupyterLab vision for the next few years?

@bollwyvl
start exposing portions of the JupyterLab logic

Thus far, we've tried to avoid co-mingling anything client-related in the server end of things: it depends on jupyter_server, and basically works with notebook.

However, as the websocket pipe is not standardized to the extent that stdio and tcp are, it's dubious whether any other existing LSP clients would be able to reuse that part of it. And no serious takers have emerged to write an implementation for classic, nteract, etc.

Of course, it's not strictly possible to entirely divorce language server discovery from client configuration, and for any given language server, one will see their long page of how to configure things for, usually, VSCode, neovim, then one or two other things, depending on the community of interest.

consequences to the Lab ecosystem

If you can expound on this some, that would help. But basically, we're trying to reuse existing, externally maintained capabilities, rather than hoping folks will re-implement them, per-syntax, just for Jupyter clients. An example is code formatting: while the client we've implemented doesn't support it yet, once it can create and accept the proper LSP messages, then any syntax with a language server will be able to use it... and it will likely work exactly the same way as in any other system.

The more we can work at the protocol level and get better language features without worrying about "which client gets it (first)," the better we will be able to offer users robust, open source tools.

@lresende (Member)
But basically, we're trying to reuse existing, externally maintained capabilities, rather than hoping folks will re-implement them, per-syntax, just for Jupyter clients.

Exactly. In Elyra's case, we were prototyping getting pipeline validation, syntax checking, etc. all done via LSP, which then also just worked in VSCode, making the switch to VSCode (which in some cases provides a much richer environment) very easy.
