Jupytext should refuse to open out of date paired text representations #63

Closed
mwouts opened this issue Sep 9, 2018 · 12 comments
Labels
help wanted Extra attention is needed

Comments

@mwouts
Owner

mwouts commented Sep 9, 2018

Cf. Hari's question on Medium: is it safe to uninstall jupytext?

Obviously, uninstalling is safe, but reinstalling after a while may not be. Because text representations take priority over the ipynb file, a user who has worked on the ipynb version without jupytext will get the out-of-date inputs from the text representation when they reinstall.

To avoid such a situation (which can be resolved by closing the notebook without saving and deleting the text file), we should check file timestamps and refuse to load inputs from the text file when it seems to be out of date compared to the ipynb file.
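
The proposed check can be sketched in a few lines. This is only an illustration of the idea, assuming plain filesystem modification times are compared; the function names `text_is_outdated` and `check_paired_files` are hypothetical and are not jupytext's API.

```python
import os


def text_is_outdated(ipynb_path, text_path):
    """Return True when the text file was last modified before the ipynb file."""
    return os.path.getmtime(text_path) < os.path.getmtime(ipynb_path)


def check_paired_files(ipynb_path, text_path):
    """Refuse to use the text file as the source of inputs if it looks stale."""
    if text_is_outdated(ipynb_path, text_path):
        raise ValueError(
            f"{ipynb_path} seems more recent than {text_path}; "
            "refusing to load inputs from the text file"
        )
```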

@boazbk

boazbk commented Sep 10, 2018

This can also be an issue if one edits the same notebook from different machines, some of which have jupytext installed and some of which do not. I would suggest putting a hash or version number in the metadata of both the ipynb and the text file; that would be more reliable than just looking at the filesystem timestamp. When the ipynb's version is higher than the text file's, it should take precedence. (Or maybe there could be a dialog asking whether they should be merged, if this is feasible.)
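
As a rough illustration of the version-number idea, the comparison could look like the sketch below. The metadata key `sync_version` and the function name are hypothetical; nothing in this thread says that jupytext stores such a counter.

```python
def pick_source_of_inputs(ipynb_metadata, text_metadata, key="sync_version"):
    """Return "ipynb" when the ipynb carries the higher version, else "text".

    `key` is a hypothetical metadata entry, incremented on every save.
    A missing entry is treated as version 0.
    """
    ipynb_version = ipynb_metadata.get(key, 0)
    text_version = text_metadata.get(key, 0)
    return "ipynb" if ipynb_version > text_version else "text"
```

With equal versions the text file keeps its usual priority, which matches the behavior described in the opening comment.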

@mwouts
Owner Author

mwouts commented Sep 10, 2018

Hello @boazbk, yes I agree, we need to find a reliable way to decide whether we can override the ipynb inputs with those of the text file. In which cases do you think timestamps would fail to identify the most recent file? I've been told that make simply uses timestamps. Timestamps also seem convenient, as they require no specific action by the user.

As for the action to take when the text file's timestamp is older than the ipynb's, I am in favor of refusing to open the paired notebook, with the following message: remove either file, or edit the text file to confirm that it is indeed up to date.

@mwouts mwouts added the help wanted Extra attention is needed label Sep 10, 2018
@mwouts
Owner Author

mwouts commented Sep 10, 2018

@cclauss, thanks for your previous contributions. Would you have any recommendations on this precise question? Are you aware of any better approach than timestamps?

@cclauss

cclauss commented Sep 10, 2018

I have no knowledge on this topic. Sorry.

@boazbk

boazbk commented Sep 11, 2018

@mwouts I was thinking of cases where, for example, the notebook is in Dropbox or something like that. I was also hoping that it would be possible, at least for the ipynb file, to update this without user interaction, but I'm not a programmer.

@mwouts
Owner Author

mwouts commented Sep 11, 2018

Notebooks returned by the contents manager have a last_modified field, which is a timestamp. This seems to be the timestamp Jupyter uses when warning the user that they are about to overwrite a more recent version of the notebook on disk; see this comment.

@mwouts
Owner Author

mwouts commented Sep 11, 2018

@boazbk: well, I think you're right: using timestamps is not perfect. Actually, Jupyter itself has a few issues on the subject: a few people report that the overwrite warning is triggered for no obvious reason. Therefore I plan to

  • implement and test the check
  • and offer a configuration option... for no checking!

I do think the check is important. Overriding the inputs with out-of-date content is not very nice to the user!

@mwouts
Owner Author

mwouts commented Sep 11, 2018

I have implemented the check. The error message tries to be as explicit as possible. For instance:

[W 00:15:17.325 NotebookApp]
    jupyter.ipynb (last modified 2018-09-11 22:10:31.426083+00:00)
    seems more recent than jupyter.Rmd (last modified 2018-09-11 21:37:20.648049+00:00)
    Please either:
    - open jupyter.Rmd in a text editor, make sure it is up to date, and save it,
    - or delete jupyter.Rmd if not up to date,
    - or increase check margin by adding, say,
        c.ContentsManager.outdated_text_notebook_margin = 5 # in seconds # or float("inf")
    to your .jupyter/jupyter_notebook_config.py file
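
The role of the margin in that message can be illustrated with a small sketch. It assumes the check compares two timezone-aware timestamps and tolerates a difference of up to `margin_seconds`; the function name is hypothetical and this is not the actual jupytext implementation. The two timestamps are the ones from the message above.

```python
from datetime import datetime


def text_seems_outdated(ipynb_modified, text_modified, margin_seconds):
    """Return True when the ipynb is newer than the text file by more than the margin.

    A margin of float("inf") disables the check, since no finite
    difference can exceed it.
    """
    difference = (ipynb_modified - text_modified).total_seconds()
    return difference > margin_seconds


# Timestamps copied from the warning message above
ipynb_modified = datetime.fromisoformat("2018-09-11 22:10:31.426083+00:00")
text_modified = datetime.fromisoformat("2018-09-11 21:37:20.648049+00:00")

outdated_with_margin = text_seems_outdated(ipynb_modified, text_modified, 5)
outdated_without_check = text_seems_outdated(ipynb_modified, text_modified, float("inf"))
```

Here the two files differ by roughly 33 minutes, so a 5-second margin still flags the text file, while an infinite margin never does.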

mwouts added a commit that referenced this issue Sep 11, 2018
@mwouts mwouts mentioned this issue Sep 11, 2018
@mwouts
Owner Author

mwouts commented Sep 11, 2018

@boazbk, would you mind trying the new version? To trigger the message above, I expect you will have to edit the ipynb file (with a text editor) to make it more recent than the text notebook.

pip install jupytext==0.6.4
# + restart jupyter notebook

Thanks!
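
An alternative to editing the ipynb file by hand is to bump its modification time. The sketch below assumes that the check looks only at file modification times, which this thread suggests but does not spell out; `touch` is a hypothetical helper name and `jupyter.ipynb` is just the file from the example message.

```python
import os


def touch(path):
    """Set the access and modification times of `path` to the current time."""
    os.utime(path, None)


# Example (not run here): make the ipynb look more recent than its text pair
# touch("jupyter.ipynb")
```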

@boazbk

boazbk commented Sep 13, 2018

Will try soon, thanks!

@allenyllee

I use git to version control my .ipynb and .py pairs.

Sometimes I need to use git stash to temporarily save all file changes. But suppose the .ipynb has changed while the .py has not: for example, just executing the .ipynb without any code change updates its metadata but leaves the paired .py untouched. When I restore from this stash, the .ipynb's timestamp becomes the current time, while the paired .py's timestamp remains unchanged. This triggers an error when loading the notebook, like the one below:

xxx.ipynb (last modified 2021-06-02 05:42:06.520687+00:00)
seems more recent than xxx.py (last modified 2021-06-02 05:06:01.211872+00:00)
Please either:
- open xxx.py in a text editor, make sure it is up to date, and save it,
- or delete xxx.py if not up to date,
- or increase check margin by adding, say,
    outdated_text_notebook_margin = 5 # default is 1 (second)
to your jupytext.toml file

It's very annoying. Maybe we should save the timestamp and file hash of the last sync in the metadata of all paired files, e.g. the last_sync key below:

#   jupytext:
#     formats: ipynb,py:percent
#     text_representation:
#       extension: .py
#       format_name: percent
#       format_version: '1.3'
#       jupytext_version: 1.11.2
#     last_sync:
#       timestamp: 2021-06-02 05:42:06.520687+00:00
#       file_hash: XXXXXXXX
#   kernelspec:
#     display_name: Python 3
#     language: python
#     name: python3

When we open an .ipynb file, it would check the paired .py's timestamp and file hash against the last_sync section: if they match, it just works; otherwise, it shows the error message.
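
A minimal sketch of the hash part of this proposal follows. The `last_sync` dictionary mirrors the hypothetical metadata section suggested above, SHA-256 is an arbitrary choice of hash, and the function names are made up for illustration; none of this is an existing jupytext feature.

```python
import hashlib


def file_hash(path):
    """Return the SHA-256 hex digest of the file's bytes."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def text_unchanged_since_last_sync(text_path, last_sync):
    """Return True when the text file still matches the hash recorded at last sync.

    `last_sync` is the hypothetical metadata section read from the ipynb file.
    A missing hash is treated as "cannot confirm", i.e. False.
    """
    recorded = last_sync.get("file_hash")
    return recorded is not None and file_hash(text_path) == recorded
```

Writing a file's own hash into that same file would change the hash, so this sketch assumes the ipynb metadata records the hash of the paired text file rather than each file recording its own.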

@mwouts
Owner Author

mwouts commented Jun 4, 2021

Hi @allenyllee, this is a very old issue so I won't reopen it, but I still think we can do something for your use case. Please see #799.

4 participants