Support text Jupyter notebooks created with Jupytext #8800

Open
owenlamont opened this issue Nov 21, 2023 · 10 comments
Labels
notebook Related to (Jupyter) notebooks wish Not on the current roadmap; maybe in the future

Comments

@owenlamont
Contributor

owenlamont commented Nov 21, 2023

I have a use case for Ruff and the Ruff formatter that is related to some of the other Markdown / docstring feature requests: I would like to run Ruff and the Ruff formatter on Jupyter notebooks that have been exported to Markdown with Jupytext.

The company I'm at prefers converting notebooks to Markdown because it makes notebook diffs much easier to read on Bitbucket (which, unlike GitHub, doesn't support any notebook rendering or diffing).

At first I noticed I could add markdown as a target file type for the Ruff formatter and linter, which got my hopes up that this would just work:

  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.1.6
    hooks:
      - id: ruff-format
        types_or: [python, pyi, jupyter, markdown]
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
        types_or: [python, pyi, jupyter, markdown]

But when I run Ruff, it fails to parse the Markdown properly. I had hoped it would run only on the Python code blocks, in the same way it parses Jupyter notebook cells, and ignore all the other Markdown content, but it's obviously trying to parse the whole Markdown file, e.g.

image
@dhruvmanila
Member

Hey, we currently don't have support for linting / formatting Python code in markdown blocks (#8237, #3792). I'll close this in favor of the markdown issue for bookkeeping purposes as I think that should solve this but correct me if I'm wrong here.

@dhruvmanila dhruvmanila closed this as not planned Nov 21, 2023
@dhruvmanila
Member

Hmm, actually this would be a bit different. For plain Markdown we wouldn't need the concatenated source code from all code blocks, but if it's a notebook converted to Markdown then I think each block should have context from the other code blocks? @owenlamont Do you think this is true?

Another solution I can think of for now is to lint / format before converting to Markdown. I'm not sure how feasible that would be, given my lack of knowledge about your setup.

@dhruvmanila dhruvmanila reopened this Nov 21, 2023
@owenlamont
Contributor Author

Hi @dhruvmanila - yeah, it would have to have the concatenated source code. I can see Ruff still tracks which code was in which Jupyter cell when raising warnings, so if it could treat the code blocks exactly as Jupyter cells are treated, that would be ideal.

As a workaround, the notebook could be exported to ipynb, linted and formatted, then re-exported to Markdown, but that would be onerous. When working with Jupytext the notebook never gets persisted (in any permanent/visible way) as an ipynb: it gets loaded from Markdown and saved back to Markdown.

The ideal solution (from my perspective) would be to parse the YAML front matter of the Markdown, identify the file as Jupytext-generated Markdown, then recognise that the code blocks need to be concatenated and treated as notebook cells. I totally understand if this use case is too niche to justify the effort, though. I can't say much about how many people use this format, but as a repo jupytext is relatively popular (around 6k users, and I recognise some relatively prominent Jupyter developers as contributors).
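The steps described above (detect the Jupytext front matter, collect the code blocks, concatenate them while remembering which cell each line came from) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not how Ruff works or would implement it. All function names are hypothetical, the front matter check is a crude string heuristic rather than real YAML parsing, and only plain `python` fences are handled (MyST `{code-cell}` fences are not).

```python
FENCE = "`" * 3  # three backticks, built this way to keep the listing readable


def split_front_matter(text):
    """Return (front_matter, body); front_matter is '' when there is none."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    return "", text


def looks_like_jupytext(front_matter):
    # Heuristic only (assumption): look for a "jupytext:" key anywhere
    # in the front matter instead of parsing the YAML.
    return any(line.strip() == "jupytext:" for line in front_matter.splitlines())


def extract_cells(body):
    """Collect the source of each fenced python block as one 'cell'."""
    cells, current = [], None
    for line in body.splitlines():
        if current is None:
            if line.strip() == FENCE + "python":
                current = []
        elif line.strip() == FENCE:
            cells.append("\n".join(current))
            current = None
        else:
            current.append(line)
    return cells


def concatenate(cells):
    """Join cells into one source and record the cell number of each line."""
    source_lines, cell_of_line = [], []
    for index, cell in enumerate(cells, start=1):
        for line in cell.splitlines():
            source_lines.append(line)
            cell_of_line.append(index)
    return "\n".join(source_lines), cell_of_line


EXAMPLE = "\n".join([
    "---",
    "jupyter:",
    "  jupytext:",
    "    text_representation:",
    "      format_name: markdown",
    "---",
    "# Title",
    FENCE + "python",
    "import math",
    FENCE,
    "Some prose.",
    FENCE + "python",
    "print(math.pi)",
    FENCE,
])

front, body = split_front_matter(EXAMPLE)
if looks_like_jupytext(front):
    source, cell_of_line = concatenate(extract_cells(body))
```

With the concatenated source, a name imported in one block is visible when checking a later block, and `cell_of_line` lets a diagnostic be reported against the block it came from.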

@dhruvmanila dhruvmanila added the wish Not on the current roadmap; maybe in the future label Dec 1, 2023
@tvatter

tvatter commented Jan 26, 2024

There's a similar request for Quarto notebooks (#6140), and more generally for Python code included in Markdown code blocks (#3792).

@dhruvmanila dhruvmanila added the notebook Related to (Jupyter) notebooks label Feb 12, 2024
@dhruvmanila dhruvmanila changed the title from "Feature request: Lint and format Jupyter notebooks exported to markdown with Jupytext" to "Support text Jupyter notebooks created with Jupytext" Apr 26, 2024
@davidorme

I think I'm looking at the same issue. We also use MyST Markdown notebooks in several projects and have been using jupytext's ability to pipe code through black to apply Python formatting:

$ jupytext --pipe black docs/source/users/pmodel/c3c4model.md 
[jupytext] Reading docs/source/users/pmodel/c3c4model.md in format md
[jupytext] Executing black -
reformatted -

All done! ✨ 🍰 ✨
1 file reformatted.
[jupytext] Writing docs/source/users/pmodel/c3c4model.md in format md:myst

We have that as part of a pre-commit hook to ensure that the code in our notebooks is properly formatted.

  - repo: https://github.com/mwouts/jupytext
    rev: v1.16.2
    hooks:
    - id: jupytext
      args: [--pipe, black]
      files: docs/source 
      additional_dependencies:
        - black==24.4.2 # Matches hook

I haven't been able to work out exactly what happens, but the jupytext.cli module provides a pipe_notebook function that is used to round trip something (I think it must be just the code cell contents?) through black.

@davidorme

OK - so jupytext converts the notebook to percent format, which is a Python file with the Markdown content stored as comments. black can then run on the code alone, and the result can be converted back to the original format.
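To illustrate why the percent format makes this work, here is a small standard-library sketch of the idea: Markdown cells become comment lines under a `# %% [markdown]` marker and code cells sit under a plain `# %%` marker, so a Python formatter only ever sees valid Python. This is not jupytext's implementation. `to_percent` and `from_percent` are hypothetical helpers that ignore cell metadata, trailing blank lines and code that itself contains a `# %%` line.

```python
def to_percent(cells):
    """cells is a list of (kind, source) pairs, kind being 'code' or 'markdown'."""
    out = []
    for kind, source in cells:
        if kind == "markdown":
            out.append("# %% [markdown]")
            # Markdown is stored as comments so the file stays valid Python.
            out.extend("# " + line if line else "#" for line in source.splitlines())
        else:
            out.append("# %%")
            out.extend(source.splitlines())
        out.append("")  # blank line between cells
    return "\n".join(out)


def from_percent(text):
    """Inverse of to_percent for the simple cases it produces."""
    cells = []
    for line in text.splitlines():
        if line == "# %% [markdown]":
            cells.append(["markdown", []])
        elif line == "# %%":
            cells.append(["code", []])
        elif cells:
            cells[-1][1].append(line)
    result = []
    for kind, lines in cells:
        while lines and lines[-1] == "":
            lines.pop()  # drop the separator blank lines
        if kind == "markdown":
            lines = [line[2:] if line.startswith("# ") else "" for line in lines]
        result.append((kind, "\n".join(lines)))
    return result


CELLS = [
    ("markdown", "# Title\n\nSome prose."),
    ("code", "x = 1\nprint(x)"),
]
script = to_percent(CELLS)
restored = from_percent(script)
```

A formatter run between `to_percent` and `from_percent` would change only the code lines, since comment text is generally left alone by Python formatters.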

@tovrstra

tovrstra commented Jul 17, 2024

Probably related: Ruff currently does not treat py:percent files as notebooks. For example, #8590 is still unresolved for py:percent files. (I'm not sure if this comment belongs here. I'm happy to open a new issue for it.)

@dhruvmanila
Member

Yes, that's true. Currently, Ruff only supports the official Jupyter Notebook format and none of the other formats.

@bluss

bluss commented Jul 17, 2024

@tovrstra py percent belongs here because #11160 was redirected here.

@davidorme

Just to add details about our use case.

jupytext allows you to pair notebook formats so that you can have synchronised MyST .md and .ipynb files. It seems like one solution here is for ruff to format the .ipynb version, after which jupytext can resynchronise the files.

However, we're not using paired notebooks and only have the MyST .md versions in our repo. That does mean that the files don't contain the outputs of running code, so we don't get the nicely rendered versions in GitHub, but we also avoid quite a lot of commit chaff from trivial changes to .ipynb files and only commit changes to the notebook source. So I guess with the existing functionality, we could convert files to .ipynb, format using ruff, convert back to MyST .md and delete the .ipynb formats.
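The convert, format, convert back, delete sequence described above could be scripted along these lines. This is an untested sketch under several assumptions: that `jupytext` and `ruff` are on the PATH, that `jupytext --to ipynb` writes the notebook next to the .md file with the same stem, that `--to md:myst` is accepted as the target format (the format name is taken from the jupytext log output earlier in the thread), and that the installed Ruff version formats .ipynb files without extra configuration. The helper names are hypothetical.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(md_path):
    """Build the commands for the round trip; nothing is executed here."""
    md = Path(md_path)
    ipynb = md.with_suffix(".ipynb")
    commands = [
        ["jupytext", "--to", "ipynb", str(md)],       # .md -> temporary .ipynb
        ["ruff", "format", str(ipynb)],               # format the notebook cells
        ["jupytext", "--to", "md:myst", str(ipynb)],  # .ipynb -> MyST .md
    ]
    return commands, ipynb


def format_myst_notebook(md_path):
    """Run the round trip and always remove the temporary .ipynb."""
    commands, ipynb = roundtrip_commands(md_path)
    try:
        for command in commands:
            subprocess.run(command, check=True)
    finally:
        ipynb.unlink(missing_ok=True)
```

Calling `format_myst_notebook` on a MyST .md file would run the three commands in order. The `finally` clause is there so that a failed conversion does not leave a stray .ipynb in the repository.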


6 participants