Support text Jupyter notebooks created with Jupytext #8800

owenlamont · 2023-11-21T03:44:14Z

I have a use case for the Ruff linter and formatter that is related to some of the other Markdown / docstring feature requests: I'd like to run both on Jupyter notebooks that have been exported to Markdown with Jupytext.

The company I'm at prefers converting notebooks to Markdown as it makes the notebook diffs much easier to read on Bitbucket (which, unlike GitHub, doesn't support any notebook rendering/diffing).

At first I noticed I could add markdown as a target file format for Ruff formatter and linter which got my hopes up that this would just work:

  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.1.6
    hooks:
      - id: ruff-format
        types_or: [python, pyi, jupyter, markdown]
      - id: ruff
        args: [--fix, --exit-non-zero-on-fix]
        types_or: [python, pyi, jupyter, markdown]

But when I ran Ruff I saw that it fails to parse the Markdown properly. I had hoped it would just run on the Python code blocks, the same way it parses Jupyter notebook cells, and ignore all the other Markdown content, but it's obviously trying to parse all the Markdown, e.g.

dhruvmanila · 2023-11-21T14:45:02Z

Hey, we currently don't have support for linting / formatting Python code in markdown blocks (#8237, #3792). I'll close this in favor of the markdown issue for bookkeeping purposes as I think that should solve this but correct me if I'm wrong here.

dhruvmanila · 2023-11-21T14:48:52Z

Hmm, actually it would be a bit different: for markdown we wouldn't need the concatenated source code from all code blocks, but if it's a notebook converted to markdown then I think it should have context from other code blocks? @owenlamont Do you think this is true?

Another solution currently that I can think of is to lint / format before converting it to markdown. I'm not sure how feasible this would be given my lack of knowledge about your setup.

owenlamont · 2023-11-21T20:41:34Z

Hi @dhruvmanila - yeah, it would have to have the concatenated source code. I can see Ruff still tracks which code was in which Jupyter cell when raising warnings, so if it could treat code blocks exactly as Jupyter cells are treated that would be ideal.
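To make the "concatenated source plus cell tracking" idea concrete, here is a minimal sketch. It is not Ruff's implementation; `concatenate_cells` and `locate` are hypothetical names used only for illustration. It joins the cell sources into one string and records where each cell starts, so a line number in the combined source can be mapped back to a cell.

```python
from bisect import bisect_right


def concatenate_cells(cells):
    """Join code cells into one source string and record where each cell starts."""
    starts = []  # 1-based line number in the combined source where each cell begins
    lines = []
    for source in cells:
        starts.append(len(lines) + 1)
        lines.extend(source.splitlines())
    return "\n".join(lines) + "\n", starts


def locate(starts, line):
    """Map a 1-based line of the combined source to (cell index, line within cell)."""
    cell = bisect_right(starts, line) - 1
    return cell, line - starts[cell] + 1


# Two illustrative cells: the second one depends on the import in the first.
cells = ["import os\nimport sys", "print(os.getcwd())"]
combined, starts = concatenate_cells(cells)
cell_index, line_in_cell = locate(starts, 3)  # line 3 of the combined source
```

With this mapping a diagnostic raised against the combined source could still be reported per cell, which is the behaviour described above for `.ipynb` files.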

As a work-around it could be exported to ipynb, linted and formatted, then re-exported to markdown - but that would be onerous. When working with Jupytext the notebook never gets persisted (in any permanent/visible way) as an ipynb - it gets loaded from Markdown and saved back to Markdown.

The ideal solution (from my perspective) would be to parse the YAML front matter of the Markdown, identify it as Jupytext-generated Markdown, then recognise that the code blocks need to be concatenated and treated as notebook cells. I totally understand if this use case is too niche to justify the effort, though. I can't say much about how many people use this format - as a repo jupytext is relatively popular (around 6k users - I recognise some relatively prominent Jupyter developers as contributors).
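A rough sketch of that detection step, using only the standard library. It assumes the front matter contains a `jupytext:` key and that code cells are fenced blocks tagged `python` or `{code-cell} ipython3`. The helper names are hypothetical, the regexes are deliberately simplistic (this is not a YAML or Markdown parser), and none of it reflects how Ruff or Jupytext actually do this.

```python
import re

FENCE = "`" * 3  # built indirectly so this example can itself sit inside a fenced block

# Front matter: a leading block delimited by lines containing only "---".
FRONT_MATTER = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)
# Code cells: fenced blocks whose info string starts with python / ipython3,
# optionally preceded by the MyST "{code-cell}" directive.
CODE_CELL = re.compile(
    "^" + FENCE + r"(?:\{code-cell\} *)?(?:python|ipython3)[^\n]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def is_jupytext_markdown(text):
    """Return True if the front matter mentions a `jupytext:` key."""
    match = FRONT_MATTER.match(text)
    return bool(match and re.search(r"^\s*jupytext:", match.group(1), re.MULTILINE))


def extract_code_cells(text):
    """Return the source of every Python code block, in document order."""
    return [m.group(1) for m in CODE_CELL.finditer(text)]


# Illustrative document, not taken from a real project.
sample = "\n".join([
    "---",
    "jupyter:",
    "  jupytext:",
    "    text_representation:",
    "      format_name: markdown",
    "---",
    "",
    "# Title",
    "",
    FENCE + "python",
    "x = 1",
    FENCE,
    "",
    "Some prose.",
    "",
    FENCE + "python",
    "print(x)",
    FENCE,
    "",
])

if is_jupytext_markdown(sample):
    cells = extract_code_cells(sample)
```

The extracted cells could then be concatenated and linted together, with everything outside the fences left untouched.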

tvatter · 2024-01-26T09:10:51Z

There's a similar request for quarto notebooks (#6140), and generally for Python code included in Markdown code blocks (#3792).

davidorme · 2024-06-06T10:39:09Z

I think I'm looking at the same issue. We also use MyST Markdown notebooks in several projects and have been using jupytext's ability to pipe code through black to apply Python formatting:

$ jupytext --pipe black docs/source/users/pmodel/c3c4model.md 
[jupytext] Reading docs/source/users/pmodel/c3c4model.md in format md
[jupytext] Executing black -
reformatted -

All done! ✨ 🍰 ✨
1 file reformatted.
[jupytext] Writing docs/source/users/pmodel/c3c4model.md in format md:myst

We have that as part of a pre-commit hook to ensure that the code in our notebooks is properly formatted.

  - repo: https://github.com/mwouts/jupytext
    rev: v1.16.2
    hooks:
    - id: jupytext
      args: [--pipe, black]
      files: docs/source 
      additional_dependencies:
        - black==24.4.2 # Matches hook

I haven't been able to work out exactly what happens, but the jupytext.cli module provides a pipe_notebook function that is used to round trip something (I think it must be just the code cell contents?) through black.

davidorme · 2024-06-07T09:16:38Z

OK - so jupytext converts the notebook to percent format, which is a python file with the markdown content stored as comments. black can then run on the code alone and the format can be converted back.
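For anyone unfamiliar with the percent format, here is a simplified illustration of the idea. This is not jupytext's code; `to_percent` and `from_percent` are hypothetical helpers and they ignore cell metadata and many edge cases. Markdown cells become comment lines under a `# %% [markdown]` marker and code cells sit under `# %%`, so the whole file is valid Python that a code-only tool can process before the text is converted back.

```python
def to_percent(cells):
    """Render (kind, source) cells as py:percent-style text."""
    out = []
    for kind, source in cells:
        if kind == "markdown":
            out.append("# %% [markdown]")
            # Markdown is stored as comments so the file stays valid Python.
            out.extend("# " + line if line else "#" for line in source.splitlines())
        else:
            out.append("# %%")
            out.extend(source.splitlines())
        out.append("")  # blank line between cells
    return "\n".join(out)


def from_percent(text):
    """Parse py:percent-style text back into (kind, source) cells."""
    cells = []
    for line in text.splitlines():
        if line.startswith("# %%"):
            kind = "markdown" if "[markdown]" in line else "code"
            cells.append((kind, []))
        elif cells:
            kind, body = cells[-1]
            if kind == "markdown":
                # Strip the comment prefix that to_percent added.
                line = line[2:] if line.startswith("# ") else line.lstrip("#")
            body.append(line)
    return [(kind, "\n".join(body).strip("\n")) for kind, body in cells]


# Illustrative notebook content.
cells = [("markdown", "# Demo\n\nSome text."), ("code", "x = 1\nprint(x)")]
percent_text = to_percent(cells)
compile(percent_text, "<percent>", "exec")  # parses as ordinary Python
restored = from_percent(percent_text)
```

Because the Markdown only ever appears as comments, a formatter can rewrite the code lines without needing to understand the notebook structure.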

tovrstra · 2024-07-17T06:58:09Z

Probably related: Ruff currently does not treat py:percent files as notebooks. For example, for py:percent files, #8590 is still unresolved. (I'm not sure if this comment belongs here. I'm happy to open a new issue for it.)

dhruvmanila · 2024-07-17T07:44:28Z

Yes, that's true. Currently, Ruff only supports the official Jupyter Notebook format and none of the other formats.

bluss · 2024-07-17T13:19:38Z

@tovrstra py percent belongs here because #11160 was redirected here.

davidorme · 2024-07-17T14:45:24Z

Just to add details about our use case.

jupytext allows you to pair notebook formats so that you can have synchronised MyST .md and .ipynb files. It seems like one solution here is that ruff formats the .ipynb version and then jupytext resynchronises the files.

However, we're not using paired notebooks and only have the MyST .md versions in our repo. That does mean that the files don't contain the outputs of running code, so we don't get the nicely rendered versions in GitHub, but we also avoid quite a lot of commit chaff from trivial changes to .ipynb files and only commit changes to the notebook source. So I guess with the existing functionality, we could convert files to .ipynb, format using ruff, convert back to MyST .md and delete the .ipynb formats.
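The convert / format / convert back / delete sequence could be scripted roughly as below. This is an untested sketch: `roundtrip_commands` and `format_myst_notebook` are hypothetical helpers, it assumes `jupytext` and `ruff` are on `PATH`, and it does not check whether the round trip preserves every piece of notebook metadata. Depending on the Ruff version, notebook support may also need to be enabled in the Ruff configuration.

```python
import subprocess
from pathlib import Path


def roundtrip_commands(md_path):
    """Build the commands for an md -> ipynb -> ruff format -> md round trip."""
    md = Path(md_path)
    ipynb = md.with_suffix(".ipynb")
    return [
        ["jupytext", "--to", "ipynb", str(md)],       # MyST .md -> temporary .ipynb
        ["ruff", "format", str(ipynb)],               # format the notebook's code cells
        ["jupytext", "--to", "md:myst", str(ipynb)],  # temporary .ipynb -> MyST .md
    ]


def format_myst_notebook(md_path):
    """Run the round trip, then delete the temporary .ipynb file."""
    ipynb = Path(md_path).with_suffix(".ipynb")
    try:
        for command in roundtrip_commands(md_path):
            subprocess.run(command, check=True)
    finally:
        ipynb.unlink(missing_ok=True)


# Only builds the commands here; call format_myst_notebook() to actually run them.
commands = roundtrip_commands("notebook.md")
```

Keeping the command list separate from the runner makes it easy to inspect what would be executed before wiring it into a pre-commit hook.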

dhruvmanila closed this as not planned Nov 21, 2023

dhruvmanila reopened this Nov 21, 2023

dhruvmanila added the wish label (Not on the current roadmap; maybe in the future) Dec 1, 2023

tvatter mentioned this issue Jan 26, 2024

Apply ruff to markdown code blocks #3792

Open

tvatter mentioned this issue Jan 26, 2024

Add support for Quarto notebooks #6140

Open

dhruvmanila added the notebook label (Related to (Jupyter) notebooks) Feb 12, 2024

dhruvmanila mentioned this issue Apr 26, 2024

Support for py:percent jupyter notebooks #11160

Closed

dhruvmanila changed the title ~~Feature request: Lint and format Jupyter notebooks exported to markdown with Jupytext~~ Support text Jupyter notebooks created with Jupytext Apr 26, 2024

davidorme mentioned this issue Jun 6, 2024

Simplify the jupyter kernelspec setup ImperialCollegeLondon/pyrealm#247

Merged

7 tasks

This was referenced Oct 16, 2024

Adding jupytext to pre commit to apply black to myst notebook code ImperialCollegeLondon/virtual_ecosystem#607

Merged

Fix notebook code formatting ImperialCollegeLondon/pyrealm#329

Closed
