Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement passthrough configuration #25

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

akhmerov
Copy link
Contributor

@akhmerov akhmerov commented May 2, 2021

Closes #17, #24

So far this is an RFC. Does the overall idea seem reasonable, especially as far as removing of most defaults goes? I propose to remove those, so that the defaults can go to thebe repository directly.

TODO:

  • decide whether the approach is sound
  • decide what to do with kernel options
  • update the docs
  • update the tests

Copy link
Member

@choldgraf choldgraf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty reasonable to me!

}},
predefinedOutput: true
}}
{ json.dumps(thebe_config) }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we'd wanna do any data validation or anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think validation is a great idea, however the best approach to implement validation would be jupyter-book/thebe#398.

@akhmerov
Copy link
Contributor Author

akhmerov commented May 2, 2021

Kernel options configuration: decision needed

Right now we are reading the kernel name from the document metadata, like so:

meta = app.env.metadata.get(docname, {})
kernel_name = meta.get("thebe-kernel")
if kernel_name is None:
if meta.get("kernelspec"):
kernel_name = json.loads(meta["kernelspec"]).get("name")
else:
kernel_name = "python3"

The approach seems sound, however schema contains extra information.

  // Options for requesting a kernel from the notebook server
  kernelOptions: {
    name: "python3",
    kernelName: "python3",
    path: "."
    // notebook server configuration; not needed with binder
    // serverSettings: <skipped>
  },

I propose to use meta["thebe-kernel"] as a passthrough dictionary with name, kernelName and path. An alternative would be to consider string a special case, which then refers to name (or should it be kernelName?), however I think a direct passthrough is more reliable.

Comment on lines -79 to -83
cm_language = kernel_name
if "python" in cm_language:
cm_language = "python"
elif cm_language == "ir":
cm_language = "r"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Codemirror language is in principle kernel-dependent, however I would like to remove this bit. I think that thebe should configure the codemirror language by talking to the kernel instead. @choldgraf do you agree?

Also currently jupyter-book/thebe#241 seems to render this irrelevant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the main challenge here is that kernels can have all kinds of wacky names (eg, conda-python-3) , while codemirror only uses "canonical" names for languages (eg, "python") so this logic will need to.be somewhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once thebe is talking with the kernel and it's running, thebe can read from codemirror_mode kernel metadata key, without any config. This would avoid flaky heuristics, and would therefore be a more reliable approach. Or am I missing something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would require thebe to wait for the kernel to be ready and then update the DOM once that finishes, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this is poor UX. I would still prefer to make the logic more reliable. In principle, notebook metadata should already have the codemirror mode specified directly. Any idea how we could get it?

An alternative approach is to assume that the necessary kernel is reachable locally in the environment we use to build the docs, and we could inspect that kernel for metadata. Would that be OK?

@choldgraf
Copy link
Member

Quick thought on document metadata - we want this to work with notebooks that are read into sphinx (and have their metadata brought into the page). So we don't want thebe-specific metadata fields at the document level (unless it's for some functionality that is only specific to thebe)

@akhmerov
Copy link
Contributor Author

akhmerov commented May 2, 2021

Quick thought on document metadata - we want this to work with notebooks that are read into sphinx (and have their metadata brought into the page).

Wouldn't it be reasonable for the conversion tool to prepare these notebooks for thebe? Note how thebe needs more information than what the kernel knows about, namely path.

@choldgraf
Copy link
Member

What conversion tool are you talking about?

@akhmerov
Copy link
Contributor Author

akhmerov commented May 2, 2021

I see several scenarios:

  • A specific tool that generates doctrees from notebooks (myst-nb or nbsphinx) can write this metadata.
  • The user authoring the documents can manually provide the metadata.
  • Finally, another extension (e.g. jupyter-sphinx) can infer the metadata and provide it before sphinx-thebe reads it.

My point is mainly that thebe kernelOptions isn't the same as notebook metadata that other tools define, and some information may be lost if we try infer kernelOptions rather than provide another passthrough to thebe.

@choldgraf
Copy link
Member

My feeling is that other tools shouldn't have to rework their own metadata just to work with thebe, if that metadata already exists. Instead we should try to infer it with thebe since it doesn't rely on us getting other packages to change their behavior - that seems harder to me. But maybe thebe-config can be a way to set it explicitly, and if this isn't set then we try to infer it from document metadata?

@akhmerov
Copy link
Contributor Author

akhmerov commented May 2, 2021

My feeling is that other tools shouldn't have to rework their own metadata just to work with thebe, if that metadata already exists.

I totally agree, but it wasn't actually clear to me whether this metadata already exists. Perhaps you can help me out here—below is what I found.

Overall we need to determine 3 configuration options:

 // Options for requesting a kernel from the notebook server
  kernelOptions: {
    name: "python3",
    kernelName: "python3",
    path: "."
    // ...
}

kernelName

Right now sphinx-thebe reads kernelName from the doc metadata, and I believe this is where myst-nb places it. Is that correct?
Inferring the kernel name from the notebook metadata may face limitations down the line (jupyter-book/thebe#343), but it's certainly fine for now. If it was just this field, I'd be happy to leave this logic intact ✔️.

name

This one I don't actually understand. I see that thebe passes it to kernelManager from @jupyterlab/services, so it seems important, and different from kernelName. What is this actually? Can we somehow infer it? Right now it's left blank. Is it because it's unimportant and can be ignored?

path

This is the location where the kernel needs to be started. It seems important since the notebooks may use external assets, and I can imagine this path being document-dependent. For me this is the main uncertain part. Is it safe enough to assume it's the doctree original path? Do we know how to reach it on binder?

@choldgraf
Copy link
Member

I think that the most important is kernelName and name is not strictly required (but don't quote me on this). e.g., there's thebe docs that use kernelName without name.

for path I actually do think we will need sphinx-specific information. We need to know the path to the Sphinx root, relative to the repository root (e.g., docs/). Then we can use the page metadata to get the path to the page from the Sphinx root, and then create our own path variable as <path-to-sphinx-root>/<path-to-page>.

So I think we can infer kernelName from the document metadata, and then create the path field using an optional configuration variable along w/ the path to the page. Does that make sense?

@choldgraf
Copy link
Member

hey @akhmerov - is this something you're interested in seeing through to the finish line? Another issue popped up (#38 that would be solved by this as well)

@akhmerov
Copy link
Contributor Author

I will come back to this soonish™ (going through the beginning-of-semester pileup).

@joergbrech
Copy link
Contributor

I think that the most important is kernelName and name is not strictly required (but don't quote me on this). e.g., there's thebe docs that use kernelName without name.

I believe the thebe documentation is wrong here, see jupyter-book/thebe#349, #59. Since thebe 0.6.0, the kernel is apparently spawned based on name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Pass-through configuration directly to Thebe
3 participants