Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support parsing of SQL code blocks in Markdown files #598

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

michael-the1
Copy link

@michael-the1 michael-the1 commented Jun 3, 2024

Closes #593

There are two caveats that would be good to consider:

  1. Because we parse to AST -> rebuild from AST, we can actually lose some trivial whitespace characters. mistletoe keeps all the important whitespace, but can toss out some unnecessary whitespace, such as trailing whitespace or double newlines. A good example is in the README.md of this very repo:
 ~/sqlfmt
❯ poetry run sqlfmt README.md
1 file formatted.
0 files left unchanged.
README.md formatted.

 ~/sqlfmt
❯ git diff
diff --git a/README.md b/README.md
index 7d292f9..5a3f457 100755
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@


 sqlfmt formats your dbt SQL files so you don't have to. It is similar in nature to black, gofmt,
-and rustfmt (but for SQL).
+and rustfmt (but for SQL).

 1. **sqlfmt promotes collaboration.** An auto-formatter makes it easier to collaborate with your team and solicit contributions from new people. You will never have to mention (or argue about) code style in code reviews again.
 2. **sqlfmt is fast.** Forget about formatting your code, and spend your time on business logic instead. sqlfmt processes hundreds of files per second and only operates on files that have changed since the last run.
@@ -138,7 +138,6 @@ Config for other integrations is detailed in the docs linked below:
 - [SQLFluff](https://docs.sqlfmt.com/integrations/sqlfluff)
 - [VSCode](https://docs.sqlfmt.com/integrations/vs-code)

-
 ## The sqlfmt style
 The only thing you can configure with sqlfmt is the desired line length of the formatted file. You can do this with the `--line-length` or `-l` options. The default is 88.

I think this is acceptable since it does not affect readability and should not affect output (i.e. when it's rendered to HTML).

  1. Nested sql blocks are not formatted. I think this is acceptable. See example:
``````markdown

# Hello

```sql
SELECT 1 -- this is not formatted
```

``````

@tconbeer
Copy link
Owner

tconbeer commented Jun 6, 2024

Thank you for this! I'm mostly away from my computer this week but I'll try to review next week.

Copy link
Owner

@tconbeer tconbeer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, thank you!

Just a couple of nits, honestly. Really well done with this. I really, really appreciate the tests in particular. GHA is down right now, so the checks won't run, but assuming those pass I think we can get this merged really soon.

Comment on lines 64 to 66
from mistletoe import Document
from mistletoe.block_token import BlockToken, CodeFence
from mistletoe.markdown_renderer import MarkdownRenderer
Copy link
Owner

@tconbeer tconbeer Jun 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's wrap this in try ... except ImportError and either return the source or re-raise a SqlfmtError to create a nicer error message for users who don't have the extra installed. I know right now we shouldn't hit that codepath due to the Mode property, but would be good to safeguard against that changing in the future.

Copy link
Author

@michael-the1 michael-the1 Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the try ... except guard. I went ahead and subclassed SqlfmtError since I didn't see any bare raise SqlfmtError anywhere.

@michael-the1 michael-the1 requested a review from tconbeer June 12, 2024 09:20
@tconbeer tconbeer mentioned this pull request Jul 9, 2024
@tconbeer
Copy link
Owner

tconbeer commented Jul 9, 2024

@michael-the1 Sorry I dropped the ball on this. I'd love to get this finalized. Looks like there are quite a few failing tests and lints. Do you want to work through those? Let me know if you get stuck.

@michael-the1
Copy link
Author

michael-the1 commented Jul 10, 2024

@tconbeer Life can get busy which I totally understand.

The reason the tests are failing is because mistletoe is not installed in the test workflows. Should I add it to the dev or test group in pyproject.toml? Feels a bit... not neat 😅

@michael-the1
Copy link
Author

michael-the1 commented Jul 11, 2024

Ah, I see now that, in the Github workflow, dependencies are installed with poetry install --sync --no-interaction -E jinjafmt --only main,test. I'll change this to also install markdownfmt.

I also see some other errors in the static analysis and the primer workflows. I'll take a look at those tomorrow.

@michael-the1
Copy link
Author

michael-the1 commented Jul 19, 2024

@tconbeer I think I fixed all issues, but with two notes:

  1. mypy was giving some errors due importing mistletoe.
    • mypy --disallow-any-unimported will fail with the new mistletoe import. I removed the option from the Makefile.
    • I also set ignore-missing-imports for mistletoe (see docs). The alternative is to change the type hints to Any and work from there, but it's not my preference since we'd change the code and hurt readability just to wrangle with mypy.
  2. The code coverage check was failing, but I think it's related to a repository setting - something with the CC_TEST_REPORTER_ID env variable not being set (see workflow logs)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for formatting SQL code blocks in Markdown
2 participants