Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Kernelspec JSON schema #105
Kernelspec JSON schema #105
Changes from 1 commit
1765886
707e492
5c97fde
a55d6ed
ac52496
85949fd
16e52a8
6c9be20
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sold yet on repo-per-spec, and think the "big tent" repo, with common tooling and interlinked documentation will be more sustainable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that a repo-per-spec might be overkill. However, I would like to have some separation of concerns. The repo for the kernel protocol will also hold the roadmap and additional documentation about the "release" process of the protocol. Also the kernel protocol and the widget protocol are somehow orthogonal, I would find it confusing to have them in the same repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would agree with this. I think it would be more sustainable and easier for folks to follow if we had a single "Jupyter specs" repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if we gather specs by topics or "thematics"? Specifications for the kernel protocol, the kernelspec and the connection file could live together in the
kernel_protocol
repo. Everything related to the widgets specification (for instance) could live in another one. This way we avoid the multiplication of specs repos, while keeping some separation of concerns. Also this would be consistent with the current situation: a kernel has to implement the kernel protocol, provide a kernel spec and handle a conection file, while supporting the widget protocol is optional.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, widgets are maybe an exception... but the knowledge of how
comm
works is not. At the end of the chain, something like the Jupyter Server API references almost everything each other... kernelspec just happens to be very low in the order, with only perhaps aknown-mimetypes.json
below it, as that is not a schema-level construct likeuri
.When changes occur, often these things will reference each other, and documentation and generated type packages should reflect the compatibility of them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How comm works is part of the Kernel protocol, thus I agree the Widgets protocol references the Kernel protocol, but not vice versa. And I took the Widget protocol as an example, but we can consider other things, like the LSP, which is totally unrelated. I see this as a software distribution, we have low layers upon which are built upper layers, and there should not be circular dependencies. The Jupyter Server API you mentioned is a super high layer that references everything, however lower layers should not reference it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the Notebook format, precisely nbformat v4, in metadata there is a
kernelspec
property, with schema defined here. Previously (in nbformat v3) this was named askernel_info
(side note: format description is outdated as it still mentionskernel_info
).Should the JEP also give mandate to update nbformat schema so that it refer to this new kernelspec a source of truth (using definition link)? In
nbformat
v4language
is not required.Can we describe advantages of moving kernelspec schema out of nbformat in the JEP to explain the reasoning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know actually, would it make sense to store the arguments used to launch the kernel in the nbformat for instance? If all the mandatory fields of the kernel spec make sense for the notebook format, then we should probably update it so that it refers this new kernelspec. Otherwise, maybe the name could be reverted to
kernel_info
or something else to avoid confusion? But that would be out of the scope of this JEP.The kernelspec is really about describing a kernel, and it is totally orthogonal to the notebook format. Kernel authors should not have to refer to the ntebook format spec to know how to implement a kernelspec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the "kernelspec" entry in nbformat.v4 is a misnomer. It is not a kernelspec—sure, it shares some of the same fields, but there is a subtle difference here.
The kernelspec file does not have a "name" field. The (locally) unique name for a kernelspec comes from the directory on disk where the kernel.json file is located—it is not stored in the kernelspec file.
In the nbformat, this "name" field is necessary to locate the kernelspec on disk (this has always been brittle, because it means this information becomes useless the moment the notebook leaves the current context).
It really should be called something like "kernelspec_info".
This would be great for notebooks that always run in the same Python/conda/virtual environment (this field isn't helpful for notebooks that move around, but that's a separate issue). I would be in favor of changing this field, but I think this is out-of-scope for this JEP. We should make a follow-up JEP to update nbformat if we'd like to pursue that.
I believe this is out-of-scope for the current JEP.
That said, they aren't entirely orthogonal. Yes, the kernelspec shouldn't depend on the nbformat, but the nbformat loosely depends on the kernelspec today. I think that's what @krassowski is alluding to... if we can drop nbformat's current conflicting definition of the kernelspec and point at this definition, we gain some standardization.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recommend some top-level
definitions
to make individual pieces easier to reference in other schema.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't find a way of gathering pieces, they all appear independent to me. Having a definition per field would just add complexity and not ease their reference in other schemaa, right? Sorry for the naive question, but I'm not used to JSON schemas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The kernelspec, and specific fields within it, need to be referenced by other schema.
Have a look at, e.g. nbformat. If the kernelspec is formalized, then conventionally specs would like to point to specific meanings without pointing at a concrete locations:
vs
This becomes more relevant, once more complex patterns are used such as
additionalProperties
orpatternProperties
(useful for e.g. mimetypes).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed explanation, I will try to make the appropriate changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if this stays, should have some pattern. but as mentioned, hoisting this to and
$id
might be more robust over time.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this should be included in the first version of the kernelspec.
I believe the first published version should only include the spec currently documented in the protocol documentation.
Then, we can quickly publish a second version with kernel_protocol_version added as a key. That way, we don't immediately invalidate everyone kernelspecs.
Also, creates a circular dependency on JEP #66.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize this actually isn't true, since "kernel_protocol_version" wouldn't be a required field. That said, we still would need to settle the naming in #66 to merge this. I think "kernel_protocol_version" is a fine name for the field, so maybe this isn't any issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually let's say that #66 depends on this one and will be updated accordingly (if needed) when this one is accepted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is not widely-used yet, could this already be further constrained to enforce namespacing: