Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added cell-id proposal #62

Merged
merged 19 commits into from
Sep 25, 2020
Merged

Added cell-id proposal #62

merged 19 commits into from
Sep 25, 2020

Conversation

MSeal
Copy link
Contributor

@MSeal MSeal commented Aug 27, 2020

We propose to make cell-id an official part of the notebook format standard. The original outlining the pre-proposal is here: #61

@MSeal
Copy link
Contributor Author

MSeal commented Aug 27, 2020

We had a recurring call setup to discuss this proposal next Monday (8/31) at 9:30 AM Pacific (your timezone). Prior notes and call information is captured at https://hackmd.io/AkuHK5lPQ5-0BBTF8-SPzQ.

@ellisonbg
Copy link
Contributor

Wow, thank you for working on this @MSeal and @willingc - this is an extremely well written and researched JEP. Will dive into the details and review.

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
@jhamrick
Copy link
Member

This is awesome, and something nbgrader done unofficially forever but that I've always wished was officially supported. Thanks for working on this!

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
Copy link
Contributor

@manics manics left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few small typos.

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
willingc and others added 5 commits August 28, 2020 06:25
Co-authored-by: Simon Li <orpheus+devel@gmail.com>
Co-authored-by: Simon Li <orpheus+devel@gmail.com>
Co-authored-by: Simon Li <orpheus+devel@gmail.com>
Co-authored-by: Simon Li <orpheus+devel@gmail.com>
@Zsailer
Copy link
Member

Zsailer commented Aug 31, 2020

@MSeal and @willingc, Thank you for writing such a clear and well-articulated proposal. This was enjoyable to review.

Though I don't have formal approval power, you've got my vote 😃. Thank you for pushing this forward!

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
willingc and others added 2 commits August 31, 2020 14:30
Co-authored-by: Jessica B. Hamrick <jessica.b.hamrick@gmail.com>
Co-authored-by: Jessica B. Hamrick <jessica.b.hamrick@gmail.com>
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Show resolved Hide resolved
62-cell-id/cell-id.md Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
4. UUIDs are one valid, simple way of ensuring uniqueness, but not necessary.
- Lots of large random strings in notebooks can be frustrating
- 128-bit UUIDs are also vast overkill for the level of uniqueness we need within a notebook with <1000 candidates for collisions. They make for opaque URLs, noise in the files, etc
- Human-readable strings are preferable defaults for ids that will be used in links / visible
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that, given we're recommending this behavior here, that we should implement it in the nbformat library (see https://github.com/jupyter/nbformat/pull/189/files#diff-61deae4d804a6b90b65fc417db8f0d09R38 )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that would be the follow-up to do so for that PR to be mergable (along with any other suggested changes) once the JEP is accepted.

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
@afshin
Copy link
Member

afshin commented Sep 5, 2020

I am 👍 on this proposal. The only note I would like to raise is: perhaps if we are making a backward-incompatible change, we should be jumping to nbformat 5.0 instead of 4.5.

@minrk
Copy link
Member

minrk commented Sep 8, 2020

perhaps if we are making a backward-incompatible change

This is a backward-compatible change. 4.5 notebooks with cell ids can be read with current stable release of 4.4. The cell ids will be discarded when writing 4.4.

Copy link
Member

@minrk minrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry, forgot that my review was still pending.

👍 to this proposal, with various minor wording comments.

4. What if you cut-paste, and paste a second time?
- On paste give the pasted cell a different ID if there's already one with the same ID as being pasted. In this case the second paste needs a new id
5. How should loaders handle notebook loading errors?
- On notebook load, if an older format update and fill in ids. If an invalid id format for a 4.5+ file, then raise a validation error like we do for other schema errors. We could auto-correct for bad ids if that's deemed appropriate.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible detail: we actually warn on validation errors, we don't raise. It's basically a hint that assumptions may not be satisfied and errors may occur, but we don't halt execution.

Screen Shot 2020-09-01 at 12 55 51

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh that's unfortunate that the application doesn't treat validation errors as errors, I'm not sure we'd change this proposal but can discuss in the PR if we want to adjust what is raised when errored. I was trying this out in classic and it does just warn, then save the file with an invalid format after :/

62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
62-cell-id/cell-id.md Outdated Show resolved Hide resolved
@afshin
Copy link
Member

afshin commented Sep 8, 2020

This is a backward-compatible change. 4.5 notebooks with cell ids can be read with current stable release of 4.4. The cell ids will be discarded when writing 4.4.

Thanks for the clarification, Min.

@willingc
Copy link
Member

Gentle nudge to the Steering Council. Thanks ☀️
@blink1073 @Carreau @damianavila @ellisonbg @fperez @ivanov @jasongrout @rgbkrk @Ruv7 @SylvainCorlay @takluyver

@fperez
Copy link
Member

fperez commented Sep 14, 2020

Thanks for the excellent JEP @MSeal and @willingc! I'm overall very much +1 on the idea, I only have one quick note - it might be worth mentioning what are the expected implications for interaction with notebook clients that may discard this field when saving, esp. any clients out there that might not rely on our own nbformat for file management. I don't know if it's worth at least a note of what the implications for users might be, and I don't know the current state of clients like pycharm or vscode in that regard.

But that's a very minor point, adding my official +1 to the overall proposal, and no worries if you feel it won't add much useful info. Just a suggestion. Thanks again for the great work!

@Carreau
Copy link
Member

Carreau commented Sep 14, 2020

Approved. Thanks for working on this.

@willingc
Copy link
Member

@damianavila @jasongrout @rgbkrk @Ruv7 @SylvainCorlay @takluyver Another gentle nudge since we would like to approve this ASAP before @MSeal goes on paternity leave. 👶

@choldgraf
Copy link
Contributor

just a procedural note - while we don't AFAIK have governing documents for JEPs that require a steering council vote, we do have fairly governing documents for changing governance itself, and those suggest that after 4 weeks (e.g., in 6 days) if 2/3rd of the votes are in favor then we can consider the motion passed. Not sure whether the latter applies to this situation but just adding it here as a reminder since it was a recent change to the governance docs

@willingc
Copy link
Member

Last call: @damianavila @jasongrout @Ruv7 @SylvainCorlay @takluyver

I plan to accept the JEP tomorrow (Thursday COB Pacific Time) and @MSeal and I will incorporate clarifications as discussed above. Thanks to everyone for your insights and feedback 💐 .

@damianavila
Copy link
Member

Thanks for the recurrent ping @willingc. I checked this one before (weeks ago) and I did not accepted it 🤦.

@willingc willingc merged commit bfcf834 into jupyter:master Sep 25, 2020
@willingc willingc linked an issue Sep 25, 2020 that may be closed by this pull request
@choldgraf
Copy link
Contributor

Yay 🌈

@MSeal
Copy link
Contributor Author

MSeal commented Apr 1, 2021

FYI jupyter/nbformat#217 was opened changing the default id generation approach based on jupyter/nbformat#216. I'm going to leave this PR open for ~24 hours so there's a chance for any additional input before we release. I recognize this pattern choice was not chosen in this original JEP and therefore is subverting the decision to not go with Option C for id generation somewhat, but the severity of the issue warrants action and there's enough consensus on the issue thread from active folks in the community that this seems a warranted step to avoid the potential insensitive to the nature of the generated ids. This doesn't impact the spec of the implementation of the feature in anyway, just the fall-through id generation approach for notebooks saved without application chosen ids.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: 4.5 Format Cell ID